mirror of
https://github.com/K-Dense-AI/claude-scientific-skills.git
synced 2026-01-26 16:58:56 +08:00
Fix Reactome database nesting
272
scientific-skills/reactome-database/SKILL.md
Normal file
@@ -0,0 +1,272 @@
---
name: reactome-database
description: "Query the Reactome REST API for pathway analysis, enrichment, gene-pathway mapping, disease pathways, molecular interactions, and expression analysis in systems biology studies."
---

# Reactome Database

## Overview

Reactome is a free, open-source, curated pathway database with 2,825+ human pathways. Use it to query biological pathways, perform overrepresentation and expression analysis, map genes to pathways, and explore molecular interactions through the REST API or the Python client for systems biology research.

## When to Use This Skill

This skill should be used when:
- Performing pathway enrichment analysis on gene or protein lists
- Analyzing gene expression data to identify relevant biological pathways
- Querying specific pathway information, reactions, or molecular interactions
- Mapping genes or proteins to biological pathways and processes
- Exploring disease-related pathways and mechanisms
- Visualizing analysis results in the Reactome Pathway Browser
- Conducting comparative pathway analysis across species

## Core Capabilities

Reactome provides two main API services and a Python client library:

### 1. Content Service - Data Retrieval

Query and retrieve biological pathway data, molecular interactions, and entity information.

**Common operations:**
- Retrieve pathway information and hierarchies
- Query specific entities (proteins, reactions, complexes)
- Get participating molecules in pathways
- Access database version and metadata
- Explore pathway compartments and locations

**API Base URL:** `https://reactome.org/ContentService`

### 2. Analysis Service - Pathway Analysis

Perform computational analysis on gene lists and expression data.

**Analysis types:**
- **Overrepresentation Analysis**: Identify statistically significant pathways from gene/protein lists
- **Expression Data Analysis**: Analyze gene expression datasets to find relevant pathways
- **Species Comparison**: Compare pathway data across different organisms

**API Base URL:** `https://reactome.org/AnalysisService`

### 3. reactome2py Python Package

Python client library that wraps Reactome API calls for easier programmatic access.

**Installation:**
```bash
uv pip install reactome2py
```

**Note:** The reactome2py package (version 3.0.0, released January 2021) is functional but not actively maintained. For the most up-to-date functionality, consider using direct REST API calls.

## Querying Pathway Data

### Using Content Service REST API

The Content Service is a REST API that returns data as JSON or plain text.

**Get database version:**
```python
import requests

# This endpoint returns the release number as plain text
response = requests.get("https://reactome.org/ContentService/data/database/version")
version = response.text
print(f"Reactome version: {version}")
```

**Query a specific entity:**
```python
import requests

entity_id = "R-HSA-69278"  # Example pathway ID
response = requests.get(f"https://reactome.org/ContentService/data/query/{entity_id}")
data = response.json()
```

**Get participating molecules in a pathway:**
```python
import requests

event_id = "R-HSA-69278"
# The participants endpoint lists the physical entities that take part in an event
response = requests.get(
    f"https://reactome.org/ContentService/data/participants/{event_id}/participatingPhysicalEntities"
)
molecules = response.json()
```
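
The requests above share one base URL and differ only in their path, so they can be wrapped in a single helper. The following is a minimal sketch that uses only the Python standard library; `content_url` and `fetch_content` are hypothetical names introduced here, not part of Reactome or reactome2py, and the handling of HTTP 404 assumes that unknown identifiers are reported with that status code.

```python
import json
import urllib.error
import urllib.request
from urllib.parse import quote

BASE_URL = "https://reactome.org/ContentService"


def content_url(*segments: str) -> str:
    """Join path segments onto the Content Service base URL, escaping each one."""
    return "/".join([BASE_URL, *(quote(s, safe="") for s in segments)])


def fetch_content(*segments: str, timeout: float = 30.0):
    """GET a resource; return parsed JSON, plain text, or None on HTTP 404."""
    request = urllib.request.Request(
        content_url(*segments),
        headers={"Accept": "application/json, text/plain"},
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            body = response.read().decode("utf-8")
            is_json = response.headers.get_content_type() == "application/json"
    except urllib.error.HTTPError as err:
        if err.code == 404:  # assumption: unknown IDs are reported as 404
            return None
        raise
    return json.loads(body) if is_json else body


# Building a URL needs no network access:
print(content_url("data", "query", "R-HSA-69278"))
```

With network access, `fetch_content("data", "database", "version")` should return the release number as text, and `fetch_content("data", "query", "R-HSA-69278")` the parsed JSON for that entry.
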

### Using reactome2py Package

```python
import reactome2py
from reactome2py import content

# Query pathway information
pathway_info = content.query_by_id("R-HSA-69278")

# Get database version
version = content.get_database_version()
```

**For detailed API endpoints and parameters**, refer to `references/api_reference.md` in this skill.

## Performing Pathway Analysis

### Overrepresentation Analysis

Submit a list of gene/protein identifiers to find enriched pathways.

**Using REST API:**
```python
import requests

# Prepare identifier list
identifiers = ["TP53", "BRCA1", "EGFR", "MYC"]
data = "\n".join(identifiers)

# Submit analysis
response = requests.post(
    "https://reactome.org/AnalysisService/identifiers/",
    headers={"Content-Type": "text/plain"},
    data=data
)

result = response.json()
token = result["summary"]["token"]  # Save token to retrieve results later

# Access pathways
for pathway in result["pathways"]:
    print(f"{pathway['stId']}: {pathway['name']} (p-value: {pathway['entities']['pValue']})")
```

**Retrieve analysis by token:**
```python
import requests

# `token` comes from the summary of an earlier analysis; it is valid for 7 days
response = requests.get(f"https://reactome.org/AnalysisService/token/{token}")
results = response.json()
```

### Expression Data Analysis

Analyze gene expression datasets with quantitative values.

**Input format (TSV with header starting with #):**
```
#Gene Sample1 Sample2 Sample3
TP53 2.5 3.1 2.8
BRCA1 1.2 1.5 1.3
EGFR 4.5 4.2 4.8
```

**Submit expression data:**
```python
import requests

# Read TSV file
with open("expression_data.tsv", "r") as f:
    data = f.read()

response = requests.post(
    "https://reactome.org/AnalysisService/identifiers/",
    headers={"Content-Type": "text/plain"},
    data=data
)

result = response.json()
```

### Species Projection

Use the `/projection/` endpoint to project identifiers onto their human equivalents, so that the analysis runs against human pathways only:

```python
import requests

# Newline-separated identifiers, as in the overrepresentation example
data = "\n".join(["TP53", "BRCA1", "EGFR", "MYC"])

response = requests.post(
    "https://reactome.org/AnalysisService/identifiers/projection/",
    headers={"Content-Type": "text/plain"},
    data=data
)
```

## Visualizing Results

Analysis results can be visualized in the Reactome Pathway Browser by constructing URLs with the analysis token:

```python
token = result["summary"]["token"]
pathway_id = "R-HSA-69278"
url = f"https://reactome.org/PathwayBrowser/#{pathway_id}&DTAB=AN&ANALYSIS={token}"
print(f"View results: {url}")
```
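
The same URL pattern can be applied to several pathways at once. The sketch below builds links for the pathways with the smallest p-values; `browser_url` and `top_pathway_urls` are hypothetical helper names, and the `example` dictionary is made-up illustration data (including the placeholder pathway IDs and token), not real Reactome output.

```python
def browser_url(pathway_id: str, token: str) -> str:
    """Build a Pathway Browser link that overlays an analysis on one pathway."""
    return (
        f"https://reactome.org/PathwayBrowser/#{pathway_id}"
        f"&DTAB=AN&ANALYSIS={token}"
    )


def top_pathway_urls(result: dict, limit: int = 5) -> list:
    """Return browser links for the `limit` pathways with the smallest p-values."""
    token = result["summary"]["token"]
    ranked = sorted(result["pathways"], key=lambda p: p["entities"]["pValue"])
    return [browser_url(p["stId"], token) for p in ranked[:limit]]


# Made-up response fragment for illustration; real responses carry more fields.
example = {
    "summary": {"token": "EXAMPLE_TOKEN"},
    "pathways": [
        {"stId": "R-HSA-0000002", "entities": {"pValue": 0.04}},
        {"stId": "R-HSA-0000001", "entities": {"pValue": 0.001}},
    ],
}

for url in top_pathway_urls(example, limit=1):
    print(url)
```
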

## Working with Analysis Tokens

- Analysis tokens are valid for **7 days**
- Tokens allow retrieval of previously computed results without re-submission
- Store tokens to access results across sessions
- Use the `GET /token/{TOKEN}` endpoint to retrieve results
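
Storing a token together with its creation time makes it possible to decide locally whether a result can still be retrieved or the analysis has to be resubmitted. A minimal sketch, assuming the 7-day lifetime stated above; `make_token_record` and `is_token_fresh` are hypothetical names, and the record layout is only one possible choice.

```python
from datetime import datetime, timedelta, timezone

TOKEN_LIFETIME = timedelta(days=7)  # validity period stated in this document


def make_token_record(token: str, created=None) -> dict:
    """Return a JSON-serializable record holding the token and its creation time."""
    created = created or datetime.now(timezone.utc)
    return {"token": token, "created": created.isoformat()}


def is_token_fresh(record: dict, now=None) -> bool:
    """True while the token is younger than TOKEN_LIFETIME."""
    now = now or datetime.now(timezone.utc)
    created = datetime.fromisoformat(record["created"])
    return now - created < TOKEN_LIFETIME


# Example with fixed timestamps so the outcome does not depend on the clock
created = datetime(2025, 1, 1, tzinfo=timezone.utc)
record = make_token_record("EXAMPLE_TOKEN", created)
print(is_token_fresh(record, now=created + timedelta(days=6)))
print(is_token_fresh(record, now=created + timedelta(days=8)))
```

Because the record contains only strings, it can be written to disk with `json.dump` and checked again in a later session.
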

## Data Formats and Identifiers

### Supported Identifier Types

Reactome accepts various identifier formats:
- UniProt accessions (e.g., P04637)
- Gene symbols (e.g., TP53)
- Ensembl IDs (e.g., ENSG00000141510)
- EntrezGene IDs (e.g., 7157)
- ChEBI IDs for small molecules

The system automatically detects identifier types.
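
Since the identifier type is detected on the server side, a list can mix the formats above and only needs light cleanup before submission. The sketch below trims whitespace and drops blanks and exact duplicates; `build_identifier_payload` is a hypothetical helper name, and the deduplication is a client-side convenience rather than a documented Reactome requirement.

```python
def build_identifier_payload(identifiers) -> str:
    """Return a newline-separated payload with blanks and duplicates removed."""
    seen = set()
    cleaned = []
    for raw in identifiers:
        ident = raw.strip()
        if ident and ident not in seen:
            seen.add(ident)
            cleaned.append(ident)
    return "\n".join(cleaned)


# The example identifiers listed above, with a blank and a repeated entry
mixed = ["TP53", " P04637 ", "ENSG00000141510", "7157", "", "TP53"]
payload = build_identifier_payload(mixed)
print(payload)
```

The resulting string can be passed as the `data` argument of the POST request to `/identifiers/`.
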

### Input Format Requirements

**For overrepresentation analysis:**
- Plain text list of identifiers (one per line)
- Or a single column in TSV format

**For expression analysis:**
- TSV format with mandatory header row starting with "#"
- Column 1: identifiers
- Columns 2+: numeric expression values
- Use period (.) as decimal separator
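
The expression format above can be generated from in-memory data rather than written by hand. A minimal sketch, assuming values are held in a dictionary keyed by identifier; `build_expression_tsv` is a hypothetical name, and the `Gene` label in the header is taken from the example input in this document.

```python
def build_expression_tsv(samples, values_by_id) -> str:
    """Build a tab-separated payload with a '#'-prefixed header row."""
    lines = ["#" + "\t".join(["Gene", *samples])]
    for identifier, values in values_by_id.items():
        if len(values) != len(samples):
            raise ValueError(
                f"{identifier}: expected {len(samples)} values, got {len(values)}"
            )
        # repr() of a float always uses "." as the decimal separator,
        # independent of the system locale
        lines.append("\t".join([identifier, *(repr(float(v)) for v in values)]))
    return "\n".join(lines)


tsv = build_expression_tsv(
    ["Sample1", "Sample2", "Sample3"],
    {"TP53": [2.5, 3.1, 2.8], "BRCA1": [1.2, 1.5, 1.3]},
)
print(tsv)
```

Rejecting rows whose length differs from the header is a design choice made here so that malformed input fails before it is submitted.
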

### Output Format

Analysis Service responses are JSON objects containing:
- `pathways`: Array of enriched pathways with statistical metrics
- `summary`: Analysis metadata and token
- `entities`: Per-pathway object with the matched (`found`) and `total` entity counts
- `identifiersNotFound`: Number of submitted identifiers that could not be mapped
- Statistical values: `pValue` and `fdr` (false discovery rate), reported inside each pathway's `entities` object
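
A common next step is to keep only pathways below a false discovery rate threshold. The sketch below assumes that each pathway's `entities` object carries an `fdr` value next to the `pValue` used in the REST example earlier; that key name should be checked against a real response. `significant_pathways` is a hypothetical helper name, and `sample_result` is made-up illustration data with placeholder IDs and names.

```python
def significant_pathways(result: dict, max_fdr: float = 0.05) -> list:
    """Return pathways with FDR <= max_fdr, most significant first."""
    hits = [
        {
            "stId": p["stId"],
            "name": p["name"],
            "pValue": p["entities"]["pValue"],
            "fdr": p["entities"]["fdr"],  # assumed key name, see lead-in
        }
        for p in result.get("pathways", [])
        if p["entities"]["fdr"] <= max_fdr
    ]
    return sorted(hits, key=lambda h: h["fdr"])


# Made-up response fragment for illustration only
sample_result = {
    "summary": {"token": "EXAMPLE_TOKEN"},
    "pathways": [
        {"stId": "R-HSA-0000001", "name": "Pathway A",
         "entities": {"pValue": 0.03, "fdr": 0.20}},
        {"stId": "R-HSA-0000002", "name": "Pathway B",
         "entities": {"pValue": 0.0001, "fdr": 0.004}},
        {"stId": "R-HSA-0000003", "name": "Pathway C",
         "entities": {"pValue": 0.002, "fdr": 0.03}},
    ],
}

for hit in significant_pathways(sample_result):
    print(f"{hit['stId']}\t{hit['name']}\tFDR={hit['fdr']}")
```
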

## Helper Scripts

This skill includes `scripts/reactome_query.py`, a helper script for common Reactome operations:

```bash
# Query pathway information
python scripts/reactome_query.py query R-HSA-69278

# Perform overrepresentation analysis
python scripts/reactome_query.py analyze gene_list.txt

# Get database version
python scripts/reactome_query.py version
```

## Additional Resources

- **API Documentation**: https://reactome.org/dev
- **User Guide**: https://reactome.org/userguide
- **Documentation Portal**: https://reactome.org/documentation
- **Data Downloads**: https://reactome.org/download-data
- **reactome2py Docs**: https://reactome.github.io/reactome2py/

For comprehensive API endpoint documentation, see `references/api_reference.md` in this skill.

## Current Database Statistics (Version 94, September 2025)

- 2,825 human pathways
- 16,002 reactions
- 11,630 proteins
- 2,176 small molecules
- 1,070 drugs
- 41,373 literature references