mirror of
https://github.com/K-Dense-AI/claude-scientific-skills.git
synced 2026-01-26 16:58:56 +08:00
62 lines
4.3 KiB
Markdown
# Scientific Packages

## Bioinformatics & Genomics

- **AnnData** - Annotated data matrices for single-cell genomics and h5ad files
- **Arboreto** - Gene regulatory network inference using GRNBoost2 and GENIE3
- **Biopython** - Sequence manipulation, NCBI database access, BLAST searches, alignments, and phylogenetics
- **BioServices** - Programmatic access to 40+ biological web services (KEGG, UniProt, ChEBI, ChEMBL)
- **Cellxgene Census** - Query and analyze large-scale single-cell RNA-seq data
- **gget** - Efficient genomic database queries (Ensembl, UniProt, NCBI, PDB, COSMIC)
- **pysam** - Read, write, and manipulate genomic data files (SAM/BAM/CRAM alignments, VCF/BCF variants, FASTA/FASTQ sequences), with pileup analysis and coverage calculations
- **PyDESeq2** - Differential gene expression analysis for bulk RNA-seq data
- **Scanpy** - Single-cell RNA-seq analysis with clustering, marker genes, and UMAP/t-SNE visualization

## Cheminformatics & Drug Discovery

- **Datamol** - Molecular manipulation and featurization with enhanced RDKit workflows
- **DeepChem** - Molecular machine learning, graph neural networks, and MoleculeNet benchmarks
- **DiffDock** - Diffusion-based molecular docking for protein-ligand binding prediction
- **MedChem** - Medicinal chemistry analysis, ADMET prediction, and drug-likeness assessment
- **Molfeat** - 100+ molecular featurizers including fingerprints, descriptors, and pretrained models
- **PyTDC** - Therapeutics Data Commons for drug discovery datasets and benchmarks
- **RDKit** - Cheminformatics toolkit for molecular I/O, descriptors, fingerprints, and SMARTS

## Proteomics & Mass Spectrometry

- **matchms** - Processing and similarity matching of mass spectrometry data with 40+ filters, spectral library matching (Cosine, Modified Cosine, Neutral Losses), metadata harmonization, molecular fingerprint comparison, and support for multiple file formats (MGF, MSP, mzML, JSON)
- **pyOpenMS** - Mass spectrometry data analysis for proteomics and metabolomics (LC-MS/MS processing, peptide identification, feature detection, quantification, chemical calculations, and integration with search engines such as Comet, Mascot, and MS-GF+)

## Machine Learning & Deep Learning

- **PyMC** - Bayesian statistical modeling and probabilistic programming
- **pymoo** - Multi-objective optimization with evolutionary algorithms
- **PyTorch Lightning** - Structured PyTorch training with automatic optimization
- **scikit-learn** - Machine learning algorithms, preprocessing, and model selection
- **statsmodels** - Statistical modeling and econometrics (OLS, GLM, logit/probit, ARIMA, time series forecasting, hypothesis testing, diagnostics)
- **PyTorch Geometric** - Graph neural networks for molecular and geometric data
- **Transformers** - Hugging Face transformers for NLU, image classification, and generation
- **UMAP-learn** - Dimensionality reduction and manifold learning

## Materials Science & Chemistry

- **Astropy** - Astronomy and astrophysics (coordinates, cosmology, FITS files)
- **COBRApy** - Constraint-based metabolic modeling and flux balance analysis
- **Pymatgen** - Materials structure analysis, phase diagrams, and electronic structure

## Data Analysis & Visualization

- **Dask** - Parallel computing for larger-than-memory datasets with distributed DataFrames, Arrays, Bags, and Futures
- **Matplotlib** - Publication-quality plotting and visualization
- **Polars** - High-performance DataFrame operations with lazy evaluation
- **Seaborn** - Statistical data visualization with a dataset-oriented interface, automatic confidence intervals, publication-quality themes, and colorblind-safe palettes; supports exploratory analysis, distribution comparisons, correlation matrices, regression plots, and multi-panel figures
- **ReportLab** - Programmatic PDF generation for reports and documents

## Phylogenetics & Trees

- **ETE Toolkit** - Phylogenetic tree manipulation, visualization, and analysis

## Genomics Tools

- **deepTools** - NGS data analysis (ChIP-seq, RNA-seq, ATAC-seq) with BAM/bigWig files
- **FlowIO** - Flow Cytometry Standard (FCS) file reading and manipulation
- **scikit-bio** - Bioinformatics sequence analysis and diversity metrics
- **Zarr** - Chunked, compressed N-dimensional array storage

## Multi-omics & Integration

- **Biomni** - Multi-omics data integration with LLM-powered analysis