Mirror of https://github.com/K-Dense-AI/claude-scientific-skills.git (synced 2026-01-26 16:58:56 +08:00)
Scientific Packages
Bioinformatics & Genomics
- AnnData - Annotated data matrices for single-cell genomics and h5ad files
- Arboreto - Gene regulatory network inference using GRNBoost2 and GENIE3
- BioPython - Sequence manipulation, NCBI database access, BLAST searches, alignments, and phylogenetics
- BioServices - Programmatic access to 40+ biological web services (KEGG, UniProt, ChEBI, ChEMBL)
- Cellxgene Census - Query and analyze large-scale single-cell RNA-seq data
- gget - Efficient genomic database queries (Ensembl, UniProt, NCBI, PDB, COSMIC)
- pysam - Read, write, and manipulate genomic data files (SAM/BAM/CRAM alignments, VCF/BCF variants, FASTA/FASTQ sequences) with pileup analysis, coverage calculations, and bioinformatics workflows
- PyDESeq2 - Differential gene expression analysis for bulk RNA-seq data
- Scanpy - Single-cell RNA-seq analysis with clustering, marker genes, and UMAP/t-SNE visualization
Cheminformatics & Drug Discovery
- Datamol - Molecular manipulation and featurization with enhanced RDKit workflows
- DeepChem - Molecular machine learning, graph neural networks, and MoleculeNet benchmarks
- DiffDock - Diffusion-based molecular docking for protein-ligand binding prediction
- MedChem - Medicinal chemistry analysis, ADMET prediction, and drug-likeness assessment
- Molfeat - 100+ molecular featurizers including fingerprints, descriptors, and pretrained models
- PyTDC - Therapeutics Data Commons for drug discovery datasets and benchmarks
- RDKit - Cheminformatics toolkit for molecular I/O, descriptors, fingerprints, and SMARTS
Proteomics & Mass Spectrometry
- matchms - Processing and similarity matching of mass spectrometry data with 40+ filters, spectral library matching (Cosine, Modified Cosine, Neutral Losses), metadata harmonization, molecular fingerprint comparison, and support for multiple file formats (MGF, MSP, mzML, JSON)
- pyOpenMS - Comprehensive mass spectrometry data analysis for proteomics and metabolomics (LC-MS/MS processing, peptide identification, feature detection, quantification, chemical calculations, and integration with search engines like Comet, Mascot, MSGF+)
Machine Learning & Deep Learning
- PyMC - Bayesian statistical modeling and probabilistic programming
- pymoo - Multi-objective optimization with evolutionary algorithms
- PyTorch Lightning - Structured PyTorch training with automatic optimization
- scikit-learn - Machine learning algorithms, preprocessing, and model selection
- statsmodels - Statistical modeling and econometrics (OLS, GLM, logit/probit, ARIMA, time series forecasting, hypothesis testing, diagnostics)
- PyTorch Geometric - Graph neural networks for molecular and geometric data
- Transformers - Hugging Face transformers for NLU, image classification, and generation
- UMAP-learn - Dimensionality reduction and manifold learning
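As a rough illustration of the "preprocessing, and model selection" workflow named in the scikit-learn entry above, the sketch below chains a scaler and a classifier and scores them with cross-validation. The Iris dataset, the logistic regression model, and the five-fold setting are assumptions chosen for brevity, not something this repository prescribes.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Small built-in dataset, used here only as a stand-in for real data.
X, y = load_iris(return_X_y=True)

# Preprocessing and model in one estimator, so scaling is re-fit on each
# training fold and never sees the held-out fold.
pipeline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Model selection step: 5-fold cross-validated accuracy.
scores = cross_val_score(pipeline, X, y, cv=5)
print(f"mean accuracy: {scores.mean():.3f}")
```

Wrapping the scaler in the pipeline, rather than scaling the full matrix up front, avoids leaking test-fold statistics into training.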
Materials Science & Chemistry
- Astropy - Astronomy and astrophysics (coordinates, cosmology, FITS files)
- COBRApy - Constraint-based metabolic modeling and flux balance analysis
- Pymatgen - Materials structure analysis, phase diagrams, and electronic structure
Data Analysis & Visualization
- Dask - Parallel computing for larger-than-memory datasets with distributed DataFrames, Arrays, Bags, and Futures
- Matplotlib - Publication-quality plotting and visualization
- Polars - High-performance DataFrame operations with lazy evaluation
- Seaborn - Statistical data visualization with dataset-oriented interface, automatic confidence intervals, publication-quality themes, colorblind-safe palettes, and comprehensive support for exploratory analysis, distribution comparisons, correlation matrices, regression plots, and multi-panel figures
- ReportLab - Programmatic PDF generation for reports and documents
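A minimal sketch of what the Matplotlib entry above might look like in practice: a small labeled figure rendered at print resolution. The sine curve, figure size, and 300 dpi setting are illustrative assumptions, and the in-memory buffer can be swapped for a filename.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np

# Placeholder data: one period of a sine wave.
x = np.linspace(0, 2 * np.pi, 200)

fig, ax = plt.subplots(figsize=(4, 3))
ax.plot(x, np.sin(x), label="sin(x)")
ax.set_xlabel("x (rad)")
ax.set_ylabel("amplitude")
ax.legend(frameon=False)
fig.tight_layout()

# Render to PNG at 300 dpi; pass a path such as "figure.png" to write a file.
buf = io.BytesIO()
fig.savefig(buf, format="png", dpi=300)
```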
Phylogenetics & Trees
- ETE Toolkit - Phylogenetic tree manipulation, visualization, and analysis
Genomics Tools
- deepTools - NGS data analysis (ChIP-seq, RNA-seq, ATAC-seq) with BAM/bigWig files
- FlowIO - Flow Cytometry Standard (FCS) file reading and manipulation
- scikit-bio - Bioinformatics sequence analysis and diversity metrics
- Zarr - Chunked, compressed N-dimensional array storage
Multi-omics & Integration
- Biomni - Multi-omics data integration with LLM-powered analysis