mirror of
https://github.com/K-Dense-AI/claude-scientific-skills.git
synced 2026-03-27 07:09:27 +08:00
Add scVelo RNA velocity analysis workflow and IQ-TREE reference documentation
- Introduced a comprehensive RNA velocity analysis pipeline using scVelo, including data loading, preprocessing, velocity estimation, and visualization.
- Added a script for running RNA velocity analysis with customizable parameters and output options.
- Created detailed documentation for IQ-TREE 2 phylogenetic inference, covering command syntax, model selection, bootstrapping methods, and output interpretation.
- Included references for velocity models and their mathematical framework, along with a comparison of different models.
- Enhanced the scVelo skill documentation with installation instructions, use cases, and best practices for RNA velocity analysis.
332
scientific-skills/bindingdb-database/SKILL.md
Normal file
@@ -0,0 +1,332 @@
---
name: bindingdb-database
description: Query BindingDB for measured drug-target binding affinities (Ki, Kd, IC50, EC50). Search by target (UniProt ID), compound (SMILES/name), or pathogen. Essential for drug discovery, lead optimization, polypharmacology analysis, and structure-activity relationship (SAR) studies.
license: CC-BY-3.0
metadata:
    skill-author: Kuan-lin Huang
---

# BindingDB Database

## Overview

BindingDB (https://www.bindingdb.org/) is the primary public database of measured drug-protein binding affinities. It contains over 3 million binding data records for ~1.4 million compounds tested against ~9,200 protein targets, curated from scientific literature and patent literature. BindingDB stores quantitative binding measurements (Ki, Kd, IC50, EC50) essential for drug discovery, pharmacology, and computational chemistry research.

**Key resources:**
- BindingDB website: https://www.bindingdb.org/
- REST API: https://www.bindingdb.org/axis2/services/BDBService
- Downloads: https://www.bindingdb.org/bind/chemsearch/marvin/Download.jsp
- GitHub: https://github.com/drugilsberg/bindingdb

## When to Use This Skill

Use BindingDB when:

- **Target-based drug discovery**: What known compounds bind to a target protein? What are their affinities?
- **SAR analysis**: How do structural modifications affect binding affinity for a series of analogs?
- **Lead compound profiling**: What targets does a compound bind (selectivity/polypharmacology)?
- **Benchmark datasets**: Obtain curated protein-ligand affinity data for ML model training
- **Repurposing analysis**: Does an approved drug bind to an unintended target?
- **Competitive analysis**: What is the best reported affinity for a target class?
- **Fragment screening**: Find validated binding data for fragments against a target

## Core Capabilities

### 1. BindingDB REST API

Base URL: `https://www.bindingdb.org/axis2/services/BDBService`

```python
import requests

BASE_URL = "https://www.bindingdb.org/axis2/services/BDBService"

def bindingdb_query(method, params):
    """Query the BindingDB REST API and return the parsed JSON response."""
    url = f"{BASE_URL}/{method}"
    # A timeout keeps a stalled connection from blocking the caller indefinitely
    response = requests.get(
        url, params=params, headers={"Accept": "application/json"}, timeout=60
    )
    response.raise_for_status()
    return response.json()
```

### 2. Query by Target (UniProt ID)

```python
def get_ligands_for_target(uniprot_id, affinity_type="Ki", cutoff=10000, unit="nM"):
    """
    Get all ligands with measured affinity for a UniProt target.

    Args:
        uniprot_id: UniProt accession (e.g., "P00519" for ABL1)
        affinity_type: "Ki", "Kd", "IC50", "EC50"
        cutoff: Maximum affinity value to return, expressed in `unit`
        unit: Unit of `cutoff`, "nM" or "uM"; the cutoff is sent in nM
    """
    if unit not in ("nM", "uM"):
        raise ValueError('unit must be "nM" or "uM"')
    # Convert a micromolar cutoff to nanomolar before building the request
    cutoff_nM = cutoff * 1000 if unit == "uM" else cutoff
    params = {
        "uniprot_id": uniprot_id,
        "affinity_type": affinity_type,
        "affinity_cutoff": cutoff_nM,
        "response": "json"
    }
    return bindingdb_query("getLigandsByUniprotID", params)

# Example: Get all compounds binding ABL1 (imatinib target)
ligands = get_ligands_for_target("P00519", affinity_type="Ki", cutoff=100)
```

### 3. Query by Compound Name or SMILES

```python
def search_by_name(compound_name, limit=100):
    """Search BindingDB for compounds by name."""
    params = {
        "compound_name": compound_name,
        "response": "json",
        "max_results": limit
    }
    return bindingdb_query("getAffinitiesByCompoundName", params)

def search_by_smiles(smiles, similarity=100, limit=50):
    """
    Search BindingDB by SMILES string.

    Args:
        smiles: SMILES string of the compound
        similarity: Tanimoto similarity threshold (1-100, 100 = exact)
        limit: Maximum number of results to return
    """
    params = {
        "SMILES": smiles,
        "similarity": similarity,
        "response": "json",
        "max_results": limit
    }
    return bindingdb_query("getAffinitiesByBEI", params)

# Example: Search for imatinib binding data
result = search_by_name("imatinib")
```

### 4. Download-Based Analysis (Recommended for Large Queries)

For comprehensive analyses, download BindingDB data directly:

```python
import pandas as pd

def load_bindingdb(filepath="BindingDB_All.tsv"):
    """
    Load BindingDB TSV file.
    Download from: https://www.bindingdb.org/bind/chemsearch/marvin/Download.jsp
    """
    # Key columns
    usecols = [
        "BindingDB Reactant_set_id",
        "Ligand SMILES",
        "Ligand InChI",
        "Ligand InChI Key",
        "BindingDB Target Chain Sequence",
        "PDB ID(s) for Ligand-Target Complex",
        "UniProt (SwissProt) Entry Name of Target Chain",
        "UniProt (SwissProt) Primary ID of Target Chain",
        "UniProt (TrEMBL) Primary ID of Target Chain",
        "Ki (nM)",
        "IC50 (nM)",
        "Kd (nM)",
        "EC50 (nM)",
        "kon (M-1-s-1)",
        "koff (s-1)",
        "Target Name",
        "Target Source Organism According to Curator or DataSource",
        "Number of Protein Chains in Target (>1 implies a multichain complex)",
        "PubChem CID",
        "PubChem SID",
        "ChEMBL ID of Ligand",
        "DrugBank ID of Ligand",
    ]

    # A callable `usecols` keeps whichever of these columns the file contains;
    # passing the plain list raises ValueError if any name is missing.
    wanted = set(usecols)
    df = pd.read_csv(filepath, sep="\t", usecols=lambda c: c in wanted,
                     low_memory=False, on_bad_lines='skip')

    # Convert affinity columns to numeric (non-numeric entries become NaN)
    for col in ["Ki (nM)", "IC50 (nM)", "Kd (nM)", "EC50 (nM)"]:
        if col in df.columns:
            df[col] = pd.to_numeric(df[col], errors='coerce')

    return df

def query_target_affinity(df, uniprot_id, affinity_types=None, max_nm=10000):
    """Query loaded BindingDB for a specific target."""
    if affinity_types is None:
        affinity_types = ["Ki (nM)", "IC50 (nM)", "Kd (nM)"]

    # Filter by UniProt ID
    mask = df["UniProt (SwissProt) Primary ID of Target Chain"] == uniprot_id
    target_df = df[mask].copy()

    # Filter by affinity cutoff: keep rows where any requested type is <= max_nm
    has_affinity = pd.Series(False, index=target_df.index)
    for col in affinity_types:
        if col in target_df.columns:
            has_affinity |= target_df[col] <= max_nm

    result = target_df[has_affinity][["Ligand SMILES"] + affinity_types +
                                     ["PubChem CID", "ChEMBL ID of Ligand"]].dropna(how='all')
    return result.sort_values(affinity_types[0])
```

### 5. SAR Analysis

```python
import numpy as np
import pandas as pd

def sar_analysis(df, target_uniprot, affinity_col="IC50 (nM)"):
    """
    Structure-activity relationship analysis for a target.
    Retrieves all compounds with affinity data and ranks by potency.
    """
    target_data = query_target_affinity(df, target_uniprot, [affinity_col])

    if target_data.empty:
        return target_data

    # Add pAffinity: negative log10 of the affinity in molar (pIC50 for IC50 data)
    if affinity_col in target_data.columns:
        # Keep positive values only; log10 is undefined for zero and NaN rows are dropped
        target_data = target_data[target_data[affinity_col] > 0].copy()
        target_data["pAffinity"] = -np.log10(target_data[affinity_col] * 1e-9)
        target_data = target_data.sort_values("pAffinity", ascending=False)

    return target_data

# Most potent compounds against EGFR (P00533)
# sar = sar_analysis(df, "P00533", "IC50 (nM)")
# print(sar.head(20))
```

### 6. Polypharmacology Profile

```python
def polypharmacology_profile(df, ligand_smiles_or_name, affinity_cutoff_nM=1000):
    """
    Find all targets a compound binds to.
    Matches on an exact `Ligand SMILES` string, so the query must be written
    exactly as it appears in BindingDB; names and PubChem CIDs are not matched.
    """
    # Search by ligand SMILES (exact match)
    mask = df["Ligand SMILES"] == ligand_smiles_or_name

    ligand_data = df[mask].copy()

    # Filter by affinity: keep rows where any measurement type is within the cutoff
    aff_cols = ["Ki (nM)", "IC50 (nM)", "Kd (nM)"]
    has_aff = pd.Series(False, index=ligand_data.index)
    for col in aff_cols:
        if col in ligand_data.columns:
            has_aff |= ligand_data[col] <= affinity_cutoff_nM

    result = ligand_data[has_aff][
        ["Target Name", "UniProt (SwissProt) Primary ID of Target Chain"] + aff_cols
    ].dropna(how='all')

    return result.sort_values("Ki (nM)")
```

## Query Workflows

### Workflow 1: Find Best Inhibitors for a Target

```python
import pandas as pd

def find_best_inhibitors(uniprot_id, affinity_type="IC50 (nM)", top_n=20, df=None):
    """Find the most potent inhibitors for a target in BindingDB."""
    # Loading the full TSV is slow: load once and pass `df` to reuse it
    if df is None:
        df = load_bindingdb("BindingDB_All.tsv")
    result = query_target_affinity(df, uniprot_id, [affinity_type])

    if result.empty:
        print(f"No data found for {uniprot_id}")
        return result

    result = result.sort_values(affinity_type).head(top_n)
    print(f"Top {top_n} inhibitors for {uniprot_id} by {affinity_type}:")
    for _, row in result.iterrows():
        # str() guards against a missing SMILES value (NaN is a float)
        print(f"  {row['PubChem CID']}: {row[affinity_type]:.1f} nM | SMILES: {str(row['Ligand SMILES'])[:40]}...")
    return result
```

### Workflow 2: Selectivity Profiling

1. Get all affinity data for your compound across all targets
2. Compare affinity ratios between on-target and off-targets
3. Identify selectivity cliffs (structural changes that improve selectivity)
4. Cross-reference with ChEMBL for additional selectivity data

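Steps 1 and 2 of the selectivity workflow can be sketched with pandas. This is a minimal illustration rather than part of any BindingDB tooling: the function name `selectivity_ratios`, the toy values, and the choice of the median to collapse replicate measurements are all assumptions. The column names `Target Name` and `Ki (nM)` are those of the BindingDB TSV export.

```python
import pandas as pd

def selectivity_ratios(profile, on_target, affinity_col="Ki (nM)"):
    """Fold-selectivity of each off-target relative to the on-target.

    profile: rows for ONE compound with "Target Name" and an affinity column (nM).
    Ratio = off-target affinity / on-target affinity, so values above 1 mean
    the compound binds the on-target more tightly.
    """
    # Collapse replicate measurements to one value per target
    # (median is an assumption; another aggregate may suit your data better)
    per_target = profile.groupby("Target Name")[affinity_col].median().dropna()
    if on_target not in per_target.index:
        raise KeyError(f"No {affinity_col} data for on-target {on_target!r}")
    ratios = per_target.drop(on_target) / per_target[on_target]
    return ratios.sort_values().rename("fold_selectivity")

# Toy data: made-up values for illustration only
toy = pd.DataFrame({
    "Target Name": ["Kinase A", "Kinase A", "Kinase B", "Kinase C"],
    "Ki (nM)": [2.0, 4.0, 30.0, 3000.0],
})
ratios = selectivity_ratios(toy, on_target="Kinase A")
print(ratios)
```

In practice the `profile` frame would come from filtering the loaded BindingDB table to a single ligand. Ratios only compare like with like when every target uses the same measurement type.
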
### Workflow 3: Machine Learning Dataset Preparation

```python
import numpy as np
import pandas as pd

def prepare_ml_dataset(df, uniprot_ids, affinity_col="IC50 (nM)",
                       max_affinity_nM=100000, min_count=50):
    """Prepare BindingDB data for ML model training."""
    records = []
    for uid in uniprot_ids:
        target_df = query_target_affinity(df, uid, [affinity_col], max_affinity_nM)
        # Skip targets with too few measurements to train on
        if len(target_df) >= min_count:
            target_df = target_df.copy()
            target_df["target"] = uid
            records.append(target_df)

    if not records:
        return pd.DataFrame()

    combined = pd.concat(records)
    # Add pAffinity: negative log10 of the molar affinity, floored at 1e-12 M
    # so that zero or near-zero values do not produce infinities
    molar = (combined[affinity_col] * 1e-9).clip(lower=1e-12)
    combined["pAffinity"] = -np.log10(molar)
    return combined[["Ligand SMILES", "target", "pAffinity", affinity_col]].dropna()
```

## Key Data Fields

| Field | Description |
|-------|-------------|
| `Ligand SMILES` | 2D structure of the compound |
| `Ligand InChI Key` | Unique chemical identifier |
| `Ki (nM)` | Inhibition constant (equilibrium, functional) |
| `Kd (nM)` | Dissociation constant (thermodynamic, binding) |
| `IC50 (nM)` | Half-maximal inhibitory concentration |
| `EC50 (nM)` | Half-maximal effective concentration |
| `kon (M-1-s-1)` | Association rate constant |
| `koff (s-1)` | Dissociation rate constant |
| `UniProt (SwissProt) Primary ID of Target Chain` | Target UniProt accession |
| `Target Name` | Protein name |
| `PDB ID(s) for Ligand-Target Complex` | Crystal structures |
| `PubChem CID` | PubChem compound ID |
| `ChEMBL ID of Ligand` | ChEMBL compound ID |

## Affinity Interpretation

| Affinity | Classification | Interpretation |
|----------|---------------|----------------|
| < 1 nM | Sub-nanomolar | Very potent (picomolar range) |
| 1–10 nM | Nanomolar | Potent, typical for approved drugs |
| 10–100 nM | Moderate | Common lead compounds |
| 100–1000 nM | Weak | Fragment/starting point |
| > 1000 nM | Very weak | Generally below drug-relevance threshold |

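The affinity ranges in the table above can be applied programmatically, for example to label rows of a query result. The sketch below is illustrative only: `classify_affinity` is a hypothetical helper, and because the table does not say which class a boundary value belongs to, the half-open intervals used here are an assumption.

```python
def classify_affinity(affinity_nM):
    """Map an affinity in nM to the classification used in the table.

    Boundary handling is an assumption: 1, 10 and 100 nM fall into the
    weaker class, and exactly 1000 nM counts as "Weak".
    """
    # `not x > 0` also rejects NaN, which would otherwise fall through every test
    if not affinity_nM > 0:
        raise ValueError("affinity must be a positive number")
    if affinity_nM < 1:
        return "Sub-nanomolar"
    if affinity_nM < 10:
        return "Nanomolar"
    if affinity_nM < 100:
        return "Moderate"
    if affinity_nM <= 1000:
        return "Weak"
    return "Very weak"

labels = [classify_affinity(x) for x in (0.5, 5, 50, 500, 5000)]
print(labels)
```

With a DataFrame, the same function can be applied to a numeric affinity column after dropping missing values, e.g. `df["Ki (nM)"].dropna().map(classify_affinity)`.
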
## Best Practices

- **Prefer Ki or Kd for cross-assay comparisons**: Ki and Kd are equilibrium constants and, unlike IC50, do not depend on substrate concentration
- **IC50 context-dependency**: IC50 values depend on substrate concentration and assay conditions (see the Cheng-Prusoff equation)
- **Normalize units**: BindingDB reports affinities in nM; verify units when comparing with values from other sources
- **Filter by target organism**: Use the `Target Source Organism According to Curator or DataSource` column to restrict results to human proteins
- **Handle missing values**: Not all compounds have all measurement types
- **Cross-reference with ChEMBL**: ChEMBL has more curated activity data for medicinal chemistry

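One way to act on the missing-values point is to pick, for each record, the first measurement type that is present. The sketch below is an illustration under stated assumptions: `best_available_affinity` is a hypothetical helper, the priority order Ki, then Kd, then IC50 is a choice made here rather than a BindingDB convention, and the toy values are made up. Because the measurement types are not interchangeable, the source column is returned alongside the value.

```python
import pandas as pd

def best_available_affinity(df, priority=("Ki (nM)", "Kd (nM)", "IC50 (nM)")):
    """Return, per row, the first non-missing value among the `priority` columns."""
    cols = [c for c in priority if c in df.columns]
    if not cols:
        raise KeyError("none of the priority columns are present")
    # Back-fill across columns so the first column holds the first non-missing value
    values = df[cols].bfill(axis=1).iloc[:, 0]
    # Name of the column that supplied the value; NaN when every column is missing
    source = df[cols].notna().idxmax(axis=1).where(values.notna())
    return pd.DataFrame({"affinity_nM": values, "affinity_source": source})

nan = float("nan")
toy = pd.DataFrame({
    "Ki (nM)": [5.0, nan, nan],
    "Kd (nM)": [nan, nan, nan],
    "IC50 (nM)": [50.0, 120.0, nan],
})
out = best_available_affinity(toy)
print(out)
```

Keeping `affinity_source` makes it possible to model or filter by measurement type later instead of silently mixing Ki and IC50 values.
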
## Additional Resources

- **BindingDB website**: https://www.bindingdb.org/
- **Data downloads**: https://www.bindingdb.org/bind/chemsearch/marvin/Download.jsp
- **API documentation**: https://www.bindingdb.org/bind/BindingDBRESTfulAPI.jsp
- **Citation**: Gilson MK et al. (2016) Nucleic Acids Research. PMID: 26481362
- **Related resources**: ChEMBL (https://www.ebi.ac.uk/chembl/), PubChem BioAssay
@@ -0,0 +1,178 @@
# BindingDB Affinity Query Reference

## Affinity Measurement Types

### Ki (Inhibition Constant)
- **Definition**: Equilibrium constant for inhibitor-enzyme complex dissociation
- **Equation**: Ki = [E][I]/[EI]
- **Usage**: Enzyme inhibition; preferred for mechanistic studies
- **Note**: Independent of substrate concentration (unlike IC50)

### Kd (Dissociation Constant)
- **Definition**: Thermodynamic binding equilibrium constant
- **Equation**: Kd = [A][B]/[AB]
- **Usage**: Direct binding assays (SPR, ITC, fluorescence anisotropy)
- **Note**: True measure of binding strength; lower = tighter binding

### IC50 (Half-Maximal Inhibitory Concentration)
- **Definition**: Concentration of inhibitor that reduces target activity by 50%
- **Usage**: Most common in drug discovery; assay-dependent
- **Conversion to Ki**: Cheng-Prusoff equation for competitive inhibition: Ki = IC50 / (1 + [S]/Km)
- **Note**: Depends on substrate concentration and assay conditions

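The Cheng-Prusoff conversion is simple enough to express directly. The helper below is a minimal sketch: the name `cheng_prusoff_ki` is hypothetical, the relation holds for competitive inhibitors only, and [S] and Km have to be taken from the original assay description because BindingDB affinity columns do not carry them.

```python
def cheng_prusoff_ki(ic50, substrate_conc, km):
    """Ki from IC50 for a competitive inhibitor: Ki = IC50 / (1 + [S]/Km).

    `substrate_conc` and `km` must share one unit; the result is in the
    unit of `ic50`.
    """
    if km <= 0 or substrate_conc < 0:
        raise ValueError("km must be > 0 and substrate_conc must be >= 0")
    return ic50 / (1.0 + substrate_conc / km)

# With [S] equal to Km, Ki is half of the IC50
ki_at_km = cheng_prusoff_ki(ic50=100.0, substrate_conc=10.0, km=10.0)
# With no substrate present, Ki equals the IC50
ki_no_substrate = cheng_prusoff_ki(ic50=100.0, substrate_conc=0.0, km=10.0)
print(ki_at_km, ki_no_substrate)
```
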
### EC50 (Half-Maximal Effective Concentration)
- **Definition**: Concentration that produces 50% of maximal effect
- **Usage**: Cell-based assays, agonist studies

### Kinetics Parameters
- **kon**: Association rate constant (M⁻¹s⁻¹); describes how fast complex forms
- **koff**: Dissociation rate constant (s⁻¹); describes how fast complex dissociates
- **Residence time**: τ = 1/koff; longer residence = more sustained effect
- **Kd from kinetics**: Kd = koff/kon

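The two kinetic relations above translate into a few lines of Python. This is a small sketch with hypothetical helper names; it assumes `kon` is in M⁻¹s⁻¹ and `koff` in s⁻¹, matching the units of the `kon (M-1-s-1)` and `koff (s-1)` columns, and the example rate constants are made up.

```python
def kd_from_kinetics(kon, koff):
    """Kd in nM from kon (M^-1 s^-1) and koff (s^-1), using Kd = koff / kon."""
    if kon <= 0 or koff < 0:
        raise ValueError("kon must be > 0 and koff must be >= 0")
    return koff / kon * 1e9  # convert M to nM

def residence_time(koff):
    """Residence time in seconds, tau = 1 / koff."""
    if koff <= 0:
        raise ValueError("koff must be > 0")
    return 1.0 / koff

# Made-up rate constants for illustration: roughly 1 nM and about 17 minutes
kd_nM = kd_from_kinetics(kon=1e6, koff=1e-3)
tau_s = residence_time(koff=1e-3)
print(f"Kd = {kd_nM:.2f} nM, residence time = {tau_s:.0f} s")
```
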
## Common API Query Patterns

### By UniProt ID (REST API)

```python
import requests

def query_by_uniprot(uniprot_id, affinity_type="Ki"):
    """
    REST API query for BindingDB affinities by UniProt target ID.
    """
    url = "https://www.bindingdb.org/axis2/services/BDBService/getLigandsByUniprotID"
    params = {
        "uniprot_id": uniprot_id,
        "cutoff": "10000",  # nM threshold
        "affinity_type": affinity_type,
        "response": "json"
    }
    response = requests.get(url, params=params, timeout=60)
    response.raise_for_status()  # Fail loudly on HTTP errors instead of parsing an error page
    return response.json()

# Important targets
COMMON_TARGETS = {
    "ABL1": "P00519",   # Imatinib, dasatinib target
    "EGFR": "P00533",   # Erlotinib, gefitinib target
    "BRAF": "P15056",   # Vemurafenib, dabrafenib target
    "CDK2": "P24941",   # Cell cycle kinase
    "HDAC1": "Q13547",  # Histone deacetylase
    "BRD4": "O60885",   # BET bromodomain reader
    "MDM2": "Q00987",   # p53 negative regulator
    "BCL2": "P10415",   # Antiapoptotic protein
    "PCSK9": "Q8NBP7",  # Cholesterol regulator
    "JAK2": "O60674",   # Cytokine signaling kinase
}
```

### By PubChem CID (REST API)

```python
import requests

def query_by_pubchem_cid(pubchem_cid):
    """Get all binding data for a specific compound by PubChem CID."""
    url = "https://www.bindingdb.org/axis2/services/BDBService/getAffinitiesByCID"
    params = {"cid": pubchem_cid, "response": "json"}
    response = requests.get(url, params=params, timeout=60)
    response.raise_for_status()
    return response.json()

# Example: Imatinib PubChem CID = 5291
imatinib_data = query_by_pubchem_cid(5291)
```

### By Target Name

```python
import requests

def query_by_target_name(target_name, affinity_cutoff=100):
    """Query BindingDB by target name (cutoff in nM)."""
    url = "https://www.bindingdb.org/axis2/services/BDBService/getAffinitiesByTarget"
    params = {
        "target_name": target_name,
        "cutoff": affinity_cutoff,
        "response": "json"
    }
    response = requests.get(url, params=params, timeout=60)
    response.raise_for_status()
    return response.json()
```

## Dataset Download Guide

### Available Files

| File | Size | Contents |
|------|------|----------|
| `BindingDB_All.tsv.zip` | ~3.5 GB | All data: ~2.9M records |
| `BindingDB_All.sdf.zip` | ~7 GB | All data with 3D structures |
| `BindingDB_IC50.tsv` | ~1.5 GB | IC50 data only |
| `BindingDB_Ki.tsv` | ~0.8 GB | Ki data only |
| `BindingDB_Kd.tsv` | ~0.2 GB | Kd data only |
| `BindingDB_EC50.tsv` | ~0.5 GB | EC50 data only |
| `tdc_bindingdb_*` | Various | TDC-formatted subsets |

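Since the full export ships as a zip archive, it can be convenient to preview it without extracting it to disk first. The sketch below uses only the standard `zipfile` module and `pandas.read_csv`; the helper name `read_tsv_from_zip` is hypothetical, and it assumes the archive contains at least one `.tsv` member, which should be checked against the file you actually downloaded.

```python
import zipfile

import pandas as pd

def read_tsv_from_zip(zip_path, nrows=None, usecols=None):
    """Read the first .tsv member of a zip archive without extracting it."""
    with zipfile.ZipFile(zip_path) as archive:
        members = [n for n in archive.namelist() if n.lower().endswith(".tsv")]
        if not members:
            raise FileNotFoundError(f"no .tsv file inside {zip_path}")
        # archive.open returns a binary file-like object that pandas can read
        with archive.open(members[0]) as handle:
            return pd.read_csv(handle, sep="\t", nrows=nrows, usecols=usecols,
                               low_memory=False, on_bad_lines="skip")

# Example (requires the downloaded archive to be present):
# preview = read_tsv_from_zip("BindingDB_All.tsv.zip", nrows=1000)
# print(preview.columns.tolist())
```

Reading a small `nrows` preview first is a cheap way to confirm the column names before committing to a full load.
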
### Efficient Loading

```python
import pandas as pd

# For large files, use chunking
def load_bindingdb_chunked(filepath, uniprot_ids, affinity_col="Ki (nM)", chunk_size=100000):
    """
    Load BindingDB in chunks to filter for specific targets.

    Note: `affinity_col` is accepted but not used for filtering here;
    all rows for the requested targets are returned.
    """
    results = []
    for chunk in pd.read_csv(filepath, sep="\t", chunksize=chunk_size,
                             low_memory=False, on_bad_lines='skip'):
        # Filter for target
        mask = chunk["UniProt (SwissProt) Primary ID of Target Chain"].isin(uniprot_ids)
        if mask.any():
            results.append(chunk[mask])

    if results:
        return pd.concat(results)
    return pd.DataFrame()
```

## pKi / pIC50 Conversion

Converting raw affinity to logarithmic scale (common in ML):

```python
import numpy as np

def to_log_affinity(affinity_nM):
    """Convert nM affinity to pAffinity (negative log molar)."""
    affinity_M = affinity_nM * 1e-9  # Convert nM to M
    return -np.log10(affinity_M)

# Examples:
# 1 nM    → pAffinity = 9.0
# 10 nM   → pAffinity = 8.0
# 100 nM  → pAffinity = 7.0
# 1 μM    → pAffinity = 6.0
# 10 μM   → pAffinity = 5.0
```

## Quality Filters

When using BindingDB data for ML or SAR, apply basic quality filters first:

```python
def filter_quality(df):
    """Apply quality filters to BindingDB data."""
    # 1. Require valid SMILES
    df = df[df["Ligand SMILES"].notna() & (df["Ligand SMILES"] != "")]

    # 2. Require valid affinity (at least one of Ki or IC50)
    df = df[df["Ki (nM)"].notna() | df["IC50 (nM)"].notna()]

    # 3. Filter extreme values (artifacts)
    for col in ["Ki (nM)", "IC50 (nM)", "Kd (nM)"]:
        if col in df.columns:
            df = df[~(df[col] > 1e6)]  # Remove > 1 mM (non-specific); NaN rows are kept

    # 4. Use only human targets
    if "Target Source Organism According to Curator or DataSource" in df.columns:
        df = df[df["Target Source Organism According to Curator or DataSource"].str.contains(
            "Homo sapiens", na=False
        )]

    return df
```