From 1871693348f93daf61162b036d6db6ef92c062f7 Mon Sep 17 00:00:00 2001
From: Timothy Kassis <timothy.kassis@biostate.ai>
Date: Fri, 24 Oct 2025 09:30:56 -0700
Subject: [PATCH] Enhance README.md with detailed instructions for various
 workflows, emphasizing the importance of organized output and the creation of
 comprehensive documentation and visualizations.

---
 README.md | 42 ++++++++++++++++++++++++++++++------------
 1 file changed, 30 insertions(+), 12 deletions(-)

diff --git a/README.md b/README.md
index 348f230..d363bfb 100644
--- a/README.md
+++ b/README.md
@@ -111,19 +111,22 @@ Once you've installed the skills, you can ask Claude to execute complex multi-st
 
 ### End-to-End Drug Discovery Pipeline
 ```
-"Always use available 'skills' when possible
+"Always use available 'skills' when possible. Keep the output organized.
 
 I need to find novel EGFR inhibitors for lung cancer treatment. Query ChEMBL for existing 
 EGFR inhibitors with IC50 < 50nM, analyze their structure-activity relationships using RDKit, 
 generate similar molecules with improved properties using datamol, perform virtual screening 
 with DiffDock against the AlphaFold-predicted EGFR structure, and search PubMed for recent 
 papers on resistance mechanisms to prioritize scaffolds. Finally, check COSMIC for common 
-EGFR mutations and assess how our candidates might interact with mutant forms."
+EGFR mutations and assess how our candidates might interact with mutant forms.
+Create useful visualizations in the form of scientific figures as you go (if needed).
+When done, create a comprehensive README.md and a well-formatted PDF summarizing the methodology,
+results, and conclusions, and providing recommendations."
 ```
 
 ### Comprehensive Single-Cell Analysis Workflow
 ```
-"Always use available 'skills' when possible
+"Always use available 'skills' when possible. Keep the output organized.
 
 Load this 10X Genomics dataset using Scanpy, perform quality control and doublet removal, 
 integrate with public data from Cellxgene Census for the same tissue type, identify cell 
@@ -131,24 +134,30 @@ populations using known markers from NCBI Gene, perform differential expression
 with PyDESeq2, run gene regulatory network inference with Arboreto, query Reactome and 
 KEGG for pathway enrichment, and create publication-quality visualizations with matplotlib. 
 Then cross-reference top dysregulated genes with Open Targets to identify potential 
-therapeutic targets."
+therapeutic targets.
+Create useful visualizations in the form of scientific figures as you go (if needed).
+When done, create a comprehensive README.md and a well-formatted PDF summarizing the methodology,
+results, and conclusions, and providing recommendations."
 ```
 
 ### Multi-Omics Integration for Biomarker Discovery
 ```
-"Always use available 'skills' when possible
+"Always use available 'skills' when possible. Keep the output organized.
 
 I have RNA-seq, proteomics, and metabolomics data from cancer patients. Use PyDESeq2 for 
 differential expression, pyOpenMS to analyze mass spec data, and integrate metabolite 
 information from HMDB and Metabolomics Workbench. Map proteins to pathways using UniProt 
 and KEGG, identify protein-protein interactions via STRING, correlate multi-omics layers 
 using statsmodels, and build a machine learning model with scikit-learn to predict patient 
-outcomes. Search ClinicalTrials.gov for ongoing trials targeting the top candidates."
+outcomes. Search ClinicalTrials.gov for ongoing trials targeting the top candidates.
+Create useful visualizations in the form of scientific figures as you go (if needed).
+When done, create a comprehensive README.md and a well-formatted PDF summarizing the methodology,
+results, and conclusions, and providing recommendations."
 ```
 
 ### Structure-Based Virtual Screening Campaign
 ```
-"Always use available 'skills' when possible
+"Always use available 'skills' when possible. Keep the output organized.
 
 I want to discover allosteric modulators for a protein-protein interaction. Retrieve the 
 AlphaFold structure for both proteins, identify the interaction interface using BioPython, 
@@ -156,12 +165,15 @@ search ZINC15 for molecules with suitable properties for allosteric binding (MW
 logP 2-4), filter for drug-likeness using RDKit, perform molecular docking with DiffDock 
 to identify potential allosteric sites, rank candidates using DeepChem's property prediction 
 models, check PubChem for suppliers, and search USPTO patents to assess freedom to operate. 
-Finally, generate analogs with MedChem and molfeat for lead optimization."
+Finally, generate analogs with MedChem and molfeat for lead optimization.
+Create useful visualizations in the form of scientific figures as you go (if needed).
+When done, create a comprehensive README.md and a well-formatted PDF summarizing the methodology,
+results, and conclusions, and providing recommendations."
 ```
 
 ### Clinical Genomics Variant Interpretation Pipeline
 ```
-"Always use available 'skills' when possible
+"Always use available 'skills' when possible. Keep the output organized.
 
 Analyze this VCF file from a patient with suspected hereditary cancer. Use pysam to parse 
 variants, annotate with Ensembl for functional consequences, query ClinVar for known 
@@ -169,12 +181,15 @@ pathogenic variants, check COSMIC for somatic mutations in cancer, retrieve gene
 from NCBI Gene, analyze protein impact using UniProt, search PubMed for case reports of 
 similar variants, query ClinPGx for pharmacogenomic implications, and generate a clinical 
 report with ReportLab. Then search ClinicalTrials.gov for precision medicine trials matching 
-the patient's profile."
+the patient's profile.
+Create useful visualizations in the form of scientific figures as you go (if needed).
+When done, create a comprehensive README.md and a well-formatted PDF summarizing the methodology,
+results, and conclusions, and providing recommendations."
 ```
 
 ### Systems Biology Network Analysis
 ```
-"Always use available 'skills' when possible
+"Always use available 'skills' when possible. Keep the output organized.
 
 Starting with a list of differentially expressed genes from my RNA-seq experiment, query 
 NCBI Gene for detailed annotations, retrieve protein sequences from UniProt, identify 
@@ -182,7 +197,10 @@ protein-protein interactions using STRING, map to biological pathways in Reactom
 analyze network topology with Torch Geometric, identify hub genes and bottleneck proteins, 
 perform gene regulatory network reconstruction with Arboreto, integrate with Open Targets 
 for druggability assessment, use PyMC for Bayesian network modeling, and create interactive 
-network visualizations. Finally, search GEO for similar expression patterns across diseases."
+network visualizations. Finally, search GEO for similar expression patterns across diseases.
+Create useful visualizations in the form of scientific figures as you go (if needed).
+When done, create a comprehensive README.md and a well-formatted PDF summarizing the methodology,
+results, and conclusions, and providing recommendations."
 ```
 
 ---