Enhance README with new scientific integrations, updated database entries, and improved descriptions across various sections, including bioinformatics, cheminformatics, and machine learning.

Timothy Kassis
2025-10-20 21:31:48 -07:00
parent 2273130f5f
commit ac70437180

@@ -131,110 +131,96 @@ After installing the plugin, you can use the skill by just mentioning it. Additi
## TODO: Future Scientific Capabilities
### Scientific Integrations
- **LabArchives** - Electronic lab notebook (ELN) integration for research documentation, protocol management, and collaboration
- **Dotmatics** - Scientific informatics platform integration for data management, inventory, and workflow automation
- **Thermo Fisher Connect** - Integration with Thermo Fisher cloud platform for instrument data, LIMS, and analytics workflows
- **PerkinElmer Signals** - Scientific data management and ELN platform integration
- **CDD Vault** - Collaborative Drug Discovery platform integration for chemical registration and bioassay data
- **Geneious** - Molecular biology and NGS analysis software integration
- **SnapGene** - Molecular cloning and DNA visualization platform integration
- **GraphPad Prism** - Statistics and graphing software integration for publication-quality analysis
- **Synthego ICE** - CRISPR editing analysis platform integration
- **Opentrons** - Laboratory automation platform integration for liquid handling protocols
- **TeselaGen** - Synthetic biology design and automation platform integration
- **Strateos** - Cloud laboratory automation platform integration
- **JupyterHub/JupyterLab** - Multi-user scientific computing environment integration
- **Weights & Biases** - Experiment tracking and ML model monitoring integration
- **MLflow** - ML lifecycle management platform integration
- **DVC (Data Version Control)** - Data and ML model versioning integration
- **OMERO** - Bio-image data management platform integration
- **Galaxy** - Web-based bioinformatics workflow platform integration
- **Nextflow/nf-core** - Workflow management system integration for reproducible pipelines
- **Seven Bridges** - Genomics analysis platform and workspace integration
- **DNAnexus** - Cloud-based genome sequencing analysis platform integration
- **BaseSpace** - Illumina genomics data analysis and management platform integration
### Scientific Databases
- **ArrayExpress** - EMBL-EBI gene expression database with functional genomics experiments
- **BioGRID** - Biological General Repository for Interaction Datasets (protein, genetic, and chemical interactions)
- **DAVID** - Database for Annotation, Visualization and Integrated Discovery for functional enrichment analysis
- **dbSNP** - NCBI's database of single nucleotide polymorphisms and short genetic variations
- **GenBank** - NIH genetic sequence database (part of NCBI but with specific access patterns)
- **InterPro** - Protein sequence analysis and classification with functional annotations
- **MetaboLights** - EMBL-EBI metabolomics database with experimental data and metadata
- **OMIM** - Online Mendelian Inheritance in Man for genetic disorders and genes
- **Pfam** - Protein families database with multiple sequence alignments and HMMs
- **RefSeq** - NCBI's non-redundant reference sequence database
- **UCSC Genome Browser** - Genomic data visualization and custom track integration
- **WikiPathways** - Community-curated biological pathway database
- **MetaboLights** - EMBL-EBI metabolomics database with experimental data and metadata
### Bioinformatics & Genomics
- **pybedtools** - Wrapper for BEDTools genome arithmetic operations
- **mygene** - Python client for MyGene.Info gene query service
- **pyensembl** - Python interface to Ensembl reference genome metadata
- **nglview** - IPython/Jupyter widget for molecular visualization
- **pyvcf** - Variant Call Format (VCF) file parser
- **pyfaidx** - Efficient FASTA file indexing and retrieval
- **kipoiseq** - Genomic sequence data loading for ML models
- **genomepy** - Download and manage genome assemblies
- **MACS2/3** - Peak calling for ChIP-seq data
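
To illustrate what "FASTA file indexing and retrieval" in the pyfaidx entry above refers to, here is a minimal standard-library sketch of the general idea: scan the file once, record where each record's sequence starts, then seek straight to a record on demand. This is not pyfaidx's API or implementation. The names `build_fasta_index` and `fetch` are hypothetical, and the sketch assumes every header line carries a non-empty record name.

```python
import io


def build_fasta_index(handle):
    """Scan a binary FASTA handle once.

    Returns {record_name: [byte offset of first sequence line, sequence length]}.
    Hypothetical helper for illustration only.
    """
    index = {}
    name = None
    while True:
        line = handle.readline()
        if not line:  # end of file
            break
        if line.startswith(b">"):
            # Record name is the first whitespace-delimited token after '>'.
            name = line[1:].split()[0].decode()
            index[name] = [handle.tell(), 0]
        elif name is not None:
            # Count residues only, ignoring the line terminator.
            index[name][1] += len(line.strip())
    return index


def fetch(handle, index, name):
    """Seek to a record and read back exactly its sequence."""
    offset, remaining = index[name]
    handle.seek(offset)
    chunks = []
    while remaining > 0:
        line = handle.readline().strip()
        chunks.append(line)
        remaining -= len(line)
    return b"".join(chunks).decode()


# Usage with an in-memory file standing in for a FASTA file on disk.
fasta = io.BytesIO(b">chr1 example\nACGT\nAC\n>chr2\nGGG\n")
idx = build_fasta_index(fasta)
print(fetch(fasta, idx, "chr2"))
```

Because the index stores byte offsets, the handle must be opened in binary mode. A real tool would also persist the index to disk and support sub-range queries, which this sketch leaves out.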
### Cheminformatics & Drug Discovery
- **Open Babel** - Chemical file format conversion and molecular mechanics
- **ChemPy** - Chemistry and thermodynamics calculations
- **Psi4** - Quantum chemistry software for ab initio calculations
- **pmapper** - Pharmacophore modeling and fingerprinting
- **ODDT** - Open Drug Discovery Toolkit for structure-based drug design
- **ProLIF** - Protein-ligand interaction fingerprints
- **Mordred** - Molecular descriptor calculator (1800+ descriptors)
- **ProteinMPNN** - Deep learning for protein sequence design
-- **ESM** - Evolutionary Scale Modeling for protein language models
+- **ESM (Evolutionary Scale Modeling)** - Protein language models for structure and function prediction
- **OpenMM** - Molecular dynamics simulation toolkit
### Proteomics & Mass Spectrometry
-- **pyteomics** - Mass spectrometry data analysis
+- **pyteomics** - Mass spectrometry data analysis and peptide/protein identification
- **MSstats** - Statistical analysis of quantitative proteomics
### Systems Biology & Networks
- **NetworkX** - Complex network analysis and graph algorithms
-- **igraph** - Fast network analysis library
+- **igraph** - Fast network analysis library with efficient algorithms
- **PyBioNetFit** - Biological network modeling and fitting
- **PINT** - Pathway integration analysis
- **GEMEditor** - Graphical tool for genome-scale metabolic models
### Structural Biology
- **MDAnalysis** - Molecular dynamics trajectory analysis
- **ProDy** - Protein dynamics and structure analysis
- **PyMOL** - Molecular visualization scripting
- **Chimera/ChimeraX** - UCSF molecular visualization
- **FreeSASA** - Solvent accessible surface area calculations
- **DSSP** - Secondary structure assignment
### Machine Learning for Science
- **DGL-LifeSci** - Deep Graph Library for life sciences
- **ChemBERTa** - Transformer models for chemistry
- **TorchDrug** - PyTorch library for drug discovery
- **GraNNField** - Graph neural networks for force fields
- **SchNet/DimeNet** - Continuous-filter convolutional networks for molecules
- **MoleculeNet** - Benchmark datasets for molecular machine learning
- **TorchMD** - Molecular dynamics with PyTorch
- **jax-md** - Differentiable molecular dynamics in JAX
### Imaging & Microscopy
- **scikit-image** - Image processing algorithms
- **CellProfiler** - Cell image analysis
- **Napari** - Multi-dimensional image viewer
-- **Fiji/ImageJ** - Image processing scripting
+- **CellProfiler** - Cell image analysis
- **StarDist** - Cell/nucleus detection with deep learning
- **Cellpose** - Generalist cell segmentation
- **StarDist** - Cell/nucleus detection with deep learning
### Phylogenetics & Evolution
- **DendroPy** - Phylogenetic computing library
- **PyCogent** - Comparative genomics toolkit
- **TreeTime** - Phylodynamic analysis and molecular clock inference
### Metabolomics
- **PyCytoData** - Cytometry data processing
- **MS-DIAL** - Data-independent MS/MS deconvolution
- **XCMS** - LC/MS and GC/MS data processing
### Climate & Environmental Science
-- **xarray** - N-dimensional labeled arrays and datasets
+- **xarray** - N-dimensional labeled arrays and datasets for scientific computing
- **Iris** - Climate and weather data analysis
- **MetPy** - Meteorological data toolkit
- **climlab** - Climate modeling and analysis
### Statistics & Experimental Design
-- **statsmodels** - Statistical models and hypothesis testing
+- **pingouin** - Statistical tests with clear output and effect sizes
- **pingouin** - Statistical tests with clear output
- **PyDOE2** - Design of experiments
- **scipy.stats** - Statistical functions and distributions
### Data Management & Processing
- **DuckDB** - Analytical SQL database for in-process analytics
- **Parquet** - Columnar storage format for big data
- **DuckDB** - Analytical SQL database
- **SQLAlchemy** - SQL toolkit and ORM
### Visualization
-- **Plotly** - Interactive graphing library
+- **Plotly** - Interactive graphing library for web-based visualizations
- **Bokeh** - Interactive visualization for web browsers
- **Altair** - Declarative statistical visualization
- **PyVista** - 3D plotting and mesh analysis