
Claude Scientific Skills

A comprehensive collection of ready-to-use scientific skills for Claude, curated by the K-Dense team. These skills enable Claude to work with specialized scientific libraries and databases across bioinformatics, cheminformatics, machine learning, materials science, and data analysis. Using this set of skills with Claude Code lets you create an 'AI Scientist' on your desktop! If you want substantially more advanced capabilities, compute infrastructure, and an enterprise-ready offering, check out https://k-dense.ai/.

Available Skills

Scientific Databases

  • ChEMBL - Bioactive molecule database with drug-like properties (2M+ compounds, 19M+ activities, 13K+ targets)
  • PubChem - Access chemical compound data from the world's largest free chemical database (110M+ compounds, 270M+ bioactivities)
  • PubMed - Access to the PubMed literature database with advanced search capabilities
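
To give a feel for the kind of query these database skills revolve around, here is a minimal sketch that only builds request URLs for the public PubChem PUG REST and NCBI E-utilities endpoints. It is an illustration, not the code the skills ship with: the helper names `pubchem_property_url` and `pubmed_search_url` are hypothetical, and the skills themselves may use different clients or endpoints.

```python
from urllib.parse import quote, urlencode

# Public base URLs for PubChem PUG REST and NCBI E-utilities.
PUBCHEM_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def pubchem_property_url(name, properties):
    """Build a PUG REST URL that looks up properties for a compound name.

    Hypothetical helper for illustration only.
    """
    prop_list = ",".join(properties)
    return f"{PUBCHEM_BASE}/compound/name/{quote(name)}/property/{prop_list}/JSON"


def pubmed_search_url(term, retmax=5):
    """Build an esearch URL that returns matching PubMed IDs as JSON.

    Hypothetical helper for illustration only.
    """
    query = urlencode(
        {"db": "pubmed", "term": term, "retmax": retmax, "retmode": "json"}
    )
    return f"{EUTILS_BASE}/esearch.fcgi?{query}"


if __name__ == "__main__":
    # No network call is made here; the URLs are only printed.
    print(pubchem_property_url("aspirin", ["MolecularFormula", "MolecularWeight"]))
    print(pubmed_search_url("CRISPR base editing"))
```

Fetching either URL (for example with `urllib.request.urlopen`) should return JSON, subject to the usage policies and rate limits of each service.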

Scientific Packages

Bioinformatics & Genomics:

  • AnnData - Annotated data matrices for single-cell genomics and h5ad files
  • Arboreto - Gene regulatory network inference using GRNBoost2 and GENIE3
  • BioPython - Sequence manipulation, NCBI database access, BLAST searches, alignments, and phylogenetics
  • BioServices - Programmatic access to 40+ biological web services (KEGG, UniProt, ChEBI, ChEMBL)
  • Cellxgene Census - Query and analyze large-scale single-cell RNA-seq data
  • gget - Efficient genomic database queries (Ensembl, UniProt, NCBI, PDB, COSMIC)
  • PyDESeq2 - Differential gene expression analysis for bulk RNA-seq data
  • Scanpy - Single-cell RNA-seq analysis with clustering, marker genes, and UMAP/t-SNE visualization

Cheminformatics & Drug Discovery:

  • Datamol - Molecular manipulation and featurization with enhanced RDKit workflows
  • DeepChem - Molecular machine learning, graph neural networks, and MoleculeNet benchmarks
  • DiffDock - Diffusion-based molecular docking for protein-ligand binding prediction
  • MedChem - Medicinal chemistry analysis, ADMET prediction, and drug-likeness assessment
  • Molfeat - 100+ molecular featurizers including fingerprints, descriptors, and pretrained models
  • PyTDC - Therapeutics Data Commons for drug discovery datasets and benchmarks
  • RDKit - Cheminformatics toolkit for molecular I/O, descriptors, fingerprints, and SMARTS

Machine Learning & Deep Learning:

  • PyMC - Bayesian statistical modeling and probabilistic programming
  • PyMOO - Multi-objective optimization with evolutionary algorithms
  • PyTorch Lightning - Structured PyTorch training with automatic optimization
  • scikit-learn - Machine learning algorithms, preprocessing, and model selection
  • Torch Geometric - Graph Neural Networks for molecular and geometric data
  • Transformers - Hugging Face transformers for NLU, image classification, and generation
  • UMAP-learn - Dimensionality reduction and manifold learning

Materials Science & Chemistry:

  • Astropy - Astronomy and astrophysics (coordinates, cosmology, FITS files)
  • COBRApy - Constraint-based metabolic modeling and flux balance analysis
  • Pymatgen - Materials structure analysis, phase diagrams, and electronic structure

Data Analysis & Visualization:

  • Matplotlib - Publication-quality plotting and visualization
  • Polars - High-performance DataFrame operations with lazy evaluation
  • Seaborn - Statistical data visualization
  • ReportLab - Programmatic PDF generation for reports and documents

Phylogenetics & Trees:

  • ETE Toolkit - Phylogenetic tree manipulation, visualization, and analysis

Genomics Tools:

  • deepTools - NGS data analysis (ChIP-seq, RNA-seq, ATAC-seq) with BAM/bigWig files
  • FlowIO - Flow Cytometry Standard (FCS) file reading and manipulation
  • scikit-bio - Bioinformatics sequence analysis and diversity metrics
  • Zarr - Chunked, compressed N-dimensional array storage

Multi-omics & Integration:

  • BIOMNI - Multi-omics data integration with LLM-powered analysis

Scientific Thinking & Analysis

  • Hypothesis Generation - Structured frameworks for generating and evaluating scientific hypotheses
  • Scientific Critical Thinking - Tools and approaches for rigorous scientific reasoning and evaluation
  • Scientific Visualization - Best practices and templates for creating publication-quality scientific figures
  • Statistical Analysis - Comprehensive statistical testing, power analysis, and experimental design
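
As one concrete example of the power analysis mentioned above, the sketch below estimates the per-group sample size for a two-sided, two-sample comparison of means using the normal approximation n = 2((z_{1-α/2} + z_{power}) / d)², where d is the standardized effect size (Cohen's d). It uses only the Python standard library; the function name `sample_size_per_group` is hypothetical, and this is not necessarily the method the skill itself applies.

```python
import math
from statistics import NormalDist


def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate per-group n for a two-sided, two-sample test of means.

    Uses the normal approximation; an exact t-based calculation
    typically gives a slightly larger n. Hypothetical helper.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for two-sided alpha
    z_power = z.inv_cdf(power)  # quantile matching the target power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


if __name__ == "__main__":
    for d in (0.2, 0.5, 0.8):
        print(f"d={d}: n per group = {sample_size_per_group(d)}")
```

For a medium effect (d = 0.5) at α = 0.05 and 80% power, this approximation gives 63 participants per group.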

Try in Claude Code, Claude.ai, and the API

Claude Code

You can register this repository as a Claude Code Plugin marketplace by running the following command in Claude Code:

/plugin marketplace add K-Dense-AI/claude-scientific-skills

Then, to install a specific set of skills:

  1. Select Browse and install plugins
  2. Select claude-scientific-skills
  3. Select scientific-databases or scientific-packages
  4. Select Install now

After installing the plugin, you can use a skill just by mentioning it. In most cases, Claude Code will also figure out which skill to use based on the task.

Claude.ai

These example skills are already available on paid plans in Claude.ai.

To use any skill from this repository or upload custom skills, follow the instructions in Using skills in Claude.

Claude API

You can use Anthropic's pre-built skills, and upload custom skills, via the Claude API. See the Skills API Quickstart for more.

TODO: Future Scientific Capabilities (Currently Available in K-Dense)

Scientific Databases

  • UniProt - Protein sequence and functional information database
  • KEGG - Kyoto Encyclopedia of Genes and Genomes for pathways and metabolism
  • NCBI Gene - Gene-specific information from RefSeq, GenBank, and other sources
  • Protein Data Bank (PDB) - 3D structural data of biological macromolecules
  • COSMIC - Catalogue of Somatic Mutations in Cancer
  • ClinVar - Clinical significance of genomic variants
  • AlphaFold DB - Protein structure predictions from DeepMind
  • STRING - Protein-protein interaction networks
  • GEO (Gene Expression Omnibus) - Functional genomics data repository
  • European Nucleotide Archive (ENA) - Comprehensive nucleotide sequence database
  • ZINC - Free database of commercially available compounds for virtual screening

Bioinformatics & Genomics

  • pysam - Interface to SAM/BAM/CRAM format files
  • pybedtools - Wrapper for BEDTools genome arithmetic operations
  • mygene - Python client for MyGene.Info gene query service
  • pyensembl - Python interface to Ensembl reference genome metadata
  • nglview - IPython/Jupyter widget for molecular visualization
  • pyvcf - Variant Call Format (VCF) file parser
  • pyfaidx - Efficient FASTA file indexing and retrieval
  • kipoiseq - Genomic sequence data loading for ML models
  • genomepy - Download and manage genome assemblies
  • MACS2/3 - Peak calling for ChIP-seq data

Cheminformatics & Drug Discovery

  • Open Babel - Chemical file format conversion and molecular mechanics
  • ChemPy - Chemistry and thermodynamics calculations
  • Psi4 - Quantum chemistry software for ab initio calculations
  • pmapper - Pharmacophore modeling and fingerprinting
  • ODDT - Open Drug Discovery Toolkit for structure-based drug design
  • ProLIF - Protein-ligand interaction fingerprints
  • Mordred - Molecular descriptor calculator (1800+ descriptors)
  • ProteinMPNN - Deep learning for protein sequence design
  • ESM - Evolutionary Scale Modeling for protein language models
  • OpenMM - Molecular dynamics simulation toolkit

Proteomics & Mass Spectrometry

  • pyteomics - Mass spectrometry data analysis
  • pyOpenMS - OpenMS Python bindings for proteomics
  • matchms - Processing and similarity matching of mass spectrometry data
  • MSstats - Statistical analysis of quantitative proteomics

Systems Biology & Networks

  • NetworkX - Complex network analysis and graph algorithms
  • igraph - Fast network analysis library
  • PyBioNetFit - Biological network modeling and fitting
  • PINT - Pathway integration analysis
  • GEMEditor - Graphical tool for genome-scale metabolic models

Structural Biology

  • MDAnalysis - Molecular dynamics trajectory analysis
  • ProDy - Protein dynamics and structure analysis
  • PyMOL - Molecular visualization scripting
  • Chimera/ChimeraX - UCSF molecular visualization
  • FreeSASA - Solvent accessible surface area calculations
  • DSSP - Secondary structure assignment

Machine Learning for Science

  • DGL-LifeSci - Deep Graph Library for life sciences
  • ChemBERTa - Transformer models for chemistry
  • TorchDrug - PyTorch library for drug discovery
  • GraNNField - Graph neural networks for force fields
  • SchNet/DimeNet - Continuous-filter convolutional networks for molecules
  • MoleculeNet - Benchmark datasets for molecular machine learning
  • TorchMD - Molecular dynamics with PyTorch
  • jax-md - Differentiable molecular dynamics in JAX

Imaging & Microscopy

  • scikit-image - Image processing algorithms
  • CellProfiler - Cell image analysis
  • Napari - Multi-dimensional image viewer
  • Fiji/ImageJ - Image processing scripting
  • StarDist - Cell/nucleus detection with deep learning
  • Cellpose - Generalist cell segmentation

Phylogenetics & Evolution

  • DendroPy - Phylogenetic computing library
  • PyCogent - Comparative genomics toolkit
  • TreeTime - Phylodynamic analysis and molecular clock inference

Metabolomics

  • PyCytoData - Cytometry data processing
  • MS-DIAL - Data-independent MS/MS deconvolution
  • XCMS - LC/MS and GC/MS data processing

Climate & Environmental Science

  • xarray - N-dimensional labeled arrays and datasets
  • Iris - Climate and weather data analysis
  • MetPy - Meteorological data toolkit
  • climlab - Climate modeling and analysis

Statistics & Experimental Design

  • statsmodels - Statistical models and hypothesis testing
  • pingouin - Statistical tests with clear output
  • PyDOE2 - Design of experiments
  • scipy.stats - Statistical functions and distributions

Data Management & Processing

  • Dask - Parallel computing for analytics
  • Parquet - Columnar storage format for big data
  • DuckDB - Analytical SQL database
  • SQLAlchemy - SQL toolkit and ORM

Visualization

  • Plotly - Interactive graphing library
  • Bokeh - Interactive visualization for web browsers
  • Altair - Declarative statistical visualization
  • PyVista - 3D plotting and mesh analysis