mirror of
https://github.com/K-Dense-AI/claude-scientific-skills.git
synced 2026-03-27 07:09:27 +08:00
feat: add PrimeKG Precision Medicine Knowledge Graph skill
97
scientific-skills/primekg/SKILL.md
Normal file
@@ -0,0 +1,97 @@
---
name: primekg
description: Query the Precision Medicine Knowledge Graph (PrimeKG) for multiscale biological data including genes, drugs, diseases, phenotypes, and more.
license: Unknown
metadata:
    skill-author: K-Dense Inc. (PrimeKG original from Harvard MIMS)
---

# PrimeKG Knowledge Graph Skill

## Overview

PrimeKG is a precision medicine knowledge graph that integrates over 20 primary databases and high-quality scientific literature into a single resource. It contains roughly 129,000 nodes and about 4 million edges across 29 relationship types, including drug-target, disease-gene, and phenotype-disease associations.

**Key capabilities:**
- Search for nodes (genes, proteins, drugs, diseases, phenotypes)
- Retrieve direct neighbors (associated entities and clinical evidence)
- Analyze local disease context (related genes, drugs, phenotypes)
- Identify drug-disease paths (potential repurposing opportunities)

**Data access:** Programmatic access is through `scripts/query_primekg.py`. The data is stored at `C:\Users\eamon\Documents\Data\PrimeKG\kg.csv` (the same file appears as `/mnt/c/Users/eamon/Documents/Data/PrimeKG/kg.csv` under WSL).

## When to Use This Skill

Use this skill for:

- **Knowledge-based drug discovery:** Identifying targets and mechanisms for diseases.
- **Drug repurposing:** Finding existing drugs with evidence that may support new indications.
- **Phenotype analysis:** Understanding how symptoms and phenotypes relate to diseases and genes.
- **Multiscale biology:** Bridging the gap between molecular targets (genes) and clinical outcomes (diseases).
- **Network pharmacology:** Investigating the broader network effects of drug-target interactions.

## Core Workflow

### 1. Search for Entities

Find identifiers for genes, drugs, or diseases.

```python
from scripts.query_primekg import search_nodes

# Search for Alzheimer's disease nodes
results = search_nodes("Alzheimer", node_type="disease")
# Returns: [{"id": "EFO_0000249", "type": "disease", "name": "Alzheimer's disease", ...}]
```

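
To make the search step concrete, here is a minimal standard-library sketch of a case-insensitive name search over edge rows. It is not the code in `scripts/query_primekg.py`: the function name `search_nodes_sketch`, the `x_*`/`y_*` column layout, and the toy IDs are all assumptions made for illustration.

```python
import csv
import io

# Assumed layout: one edge per row, with x_* columns for one endpoint
# and y_* columns for the other. IDs and names are toy placeholders.
SAMPLE_CSV = """relation,x_id,x_type,x_name,y_id,y_type,y_name
disease_protein,D1,disease,Alzheimer disease,G1,gene/protein,APOE
indication,DR1,drug,Donepezil,D1,disease,Alzheimer disease
"""


def search_nodes_sketch(rows, query, node_type=None):
    """Return unique nodes whose name contains `query` (case-insensitive)."""
    needle = query.lower()
    found = {}
    for row in rows:
        # Each row describes two nodes, so check both endpoints.
        for side in ("x", "y"):
            node_id = row[f"{side}_id"]
            kind = row[f"{side}_type"]
            name = row[f"{side}_name"]
            if needle in name.lower() and node_type in (None, kind):
                # Keyed by (id, type) so a node seen on many edges appears once.
                found[(node_id, kind)] = {"id": node_id, "type": kind, "name": name}
    return list(found.values())


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
hits = search_nodes_sketch(rows, "alzheimer", node_type="disease")
```

Deduplicating on `(id, type)` matters because a well-connected node occurs on many edge rows, and a search should report it once.
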
### 2. Get Neighbors (Direct Associations)

Retrieve every node connected to a given ID, along with the relationship type of each edge.

```python
from scripts.query_primekg import get_neighbors

# Get all neighbors of a specific disease ID
neighbors = get_neighbors("EFO_0000249")
# Returns: List of neighbors like {"neighbor_name": "APOE", "relation": "disease_gene", ...}
```

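
A neighbor lookup of this kind can be sketched as an adjacency index keyed by node ID. The names `build_index` and `neighbors_sketch`, the tuple layout, and the toy edges below are hypothetical and are not taken from `scripts/query_primekg.py`.

```python
from collections import defaultdict

# Toy edges as (source_id, relation, target_id, target_name); placeholders only.
TOY_EDGES = [
    ("D1", "disease_gene", "G1", "APOE"),
    ("D1", "disease_phenotype", "P1", "Memory impairment"),
    ("DR1", "drug_disease", "D1", "Toy disease"),
]


def build_index(edges):
    """Group edges by source ID so each lookup is a single dictionary access."""
    index = defaultdict(list)
    for src, relation, dst, dst_name in edges:
        # Only the listed direction is indexed. If the edge list stores each
        # undirected edge once, add the reversed tuple as well.
        index[src].append(
            {"neighbor_id": dst, "neighbor_name": dst_name, "relation": relation}
        )
    return index


def neighbors_sketch(index, node_id, relation_type=None):
    """Return neighbors of `node_id`, optionally restricted to one relation."""
    return [
        n for n in index.get(node_id, [])
        if relation_type in (None, n["relation"])
    ]


index = build_index(TOY_EDGES)
gene_links = neighbors_sketch(index, "D1", relation_type="disease_gene")
```

Building the index once trades memory for speed: each query then avoids rescanning millions of edge rows.
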
### 3. Analyze Disease Context

`get_disease_context` is a high-level function that summarizes the associations for a disease.

```python
from scripts.query_primekg import get_disease_context

# Comprehensive summary for a disease
context = get_disease_context("Alzheimer's disease")
# Access: context['associated_genes'], context['associated_drugs'], context['phenotypes']
```

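
One plausible way to produce a summary with these three keys is to bucket a neighbor list by relation label. The mapping `RELATION_TO_KEY` and the function `summarize_context` are assumptions for illustration; the real function may group its results differently.

```python
# Hypothetical mapping from relation labels to the summary keys used above.
RELATION_TO_KEY = {
    "disease_gene": "associated_genes",
    "drug_disease": "associated_drugs",
    "disease_phenotype": "phenotypes",
}


def summarize_context(neighbors):
    """Bucket neighbor names by relation, skipping duplicates and unmapped relations."""
    context = {key: [] for key in RELATION_TO_KEY.values()}
    for n in neighbors:
        key = RELATION_TO_KEY.get(n["relation"])
        if key is not None and n["neighbor_name"] not in context[key]:
            context[key].append(n["neighbor_name"])
    return context


toy_neighbors = [
    {"neighbor_name": "APOE", "relation": "disease_gene"},
    {"neighbor_name": "APOE", "relation": "disease_gene"},  # duplicate edge
    {"neighbor_name": "Donepezil", "relation": "drug_disease"},
    {"neighbor_name": "Other node", "relation": "protein_protein"},  # unmapped
]
summary = summarize_context(toy_neighbors)
```

Every key is initialized up front, so callers can index the result safely even when a disease has no neighbors of a given kind.
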
## Relationship Types in PrimeKG

The graph contains several key relationship types, including:
- `protein_protein`: Physical protein-protein interactions (PPIs)
- `drug_protein`: Drug target/mechanism associations
- `disease_gene`: Genetic associations
- `drug_disease`: Indications and contraindications
- `disease_phenotype`: Clinical signs and symptoms
- `gwas`: Evidence from genome-wide association studies

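
Repurposing questions usually chain several of these relations, for example a drug to its protein target and on to a disease. The sketch below runs a breadth-first search for the shortest such path on a toy adjacency map. The function `shortest_path`, the `max_hops` parameter, and the node labels are illustrative assumptions, not part of the skill's script.

```python
from collections import deque


def shortest_path(adjacency, start, goal, max_hops=3):
    """Breadth-first search; returns a list of node IDs, or None if no path fits."""
    if start == goal:
        return [start]
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        # A path of k nodes has k - 1 hops; stop extending at the limit.
        if len(path) - 1 >= max_hops:
            continue
        for nxt in adjacency.get(path[-1], ()):
            if nxt in visited:
                continue
            if nxt == goal:
                return path + [nxt]
            visited.add(nxt)
            queue.append(path + [nxt])
    return None


# Toy undirected graph: each edge is listed in both directions.
TOY_GRAPH = {
    "drug:X": ["gene:A"],
    "gene:A": ["drug:X", "disease:Y"],
    "disease:Y": ["gene:A"],
}
path = shortest_path(TOY_GRAPH, "drug:X", "disease:Y")
```

A hop limit keeps the search tractable on a graph with millions of edges and discards long paths that carry little mechanistic meaning.
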
## Best Practices

1. **Use specific IDs:** When calling `get_neighbors`, pass the exact ID returned by `search_nodes`.
2. **Context first:** Use `get_disease_context` for a broad overview before diving into specific genes or drugs.
3. **Filter relationships:** Use the `relation_type` filter in `get_neighbors` to focus on specific evidence (e.g., only `drug_protein`).
4. **Multiscale integration:** Combine with `OpenTargets` for deeper genetic evidence or `Semantic Scholar` for the latest literature context.

## Resources

### Scripts
- `scripts/query_primekg.py`: Core functions for searching and querying the knowledge graph.

### Data Path
- Data: `/mnt/c/Users/eamon/Documents/Data/PrimeKG/kg.csv` (WSL form of `C:\Users\eamon\Documents\Data\PrimeKG\kg.csv`)
- Total nodes: ~129,000
- Total edges: ~4,000,000
- Database: CSV-based, optimized for pandas querying.

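
The node and edge totals can be sanity-checked by streaming the CSV once. This sketch uses only the standard library and an in-memory sample. The column names `x_id`, `x_type`, `y_id`, and `y_type` are assumptions about the file layout, and `count_graph` is a hypothetical helper.

```python
import csv
import io


def count_graph(csv_file):
    """Stream an edge CSV and return (distinct_node_count, edge_row_count)."""
    nodes = set()
    edge_rows = 0
    for row in csv.DictReader(csv_file):
        # A node is identified by its ID together with its type.
        nodes.add((row["x_id"], row["x_type"]))
        nodes.add((row["y_id"], row["y_type"]))
        edge_rows += 1
    return len(nodes), edge_rows


# Tiny stand-in for kg.csv; the first edge is stored in both directions.
sample = io.StringIO(
    "relation,x_id,x_type,y_id,y_type\n"
    "protein_protein,1,gene/protein,2,gene/protein\n"
    "protein_protein,2,gene/protein,1,gene/protein\n"
    "drug_protein,DB1,drug,1,gene/protein\n"
)
node_count, row_count = count_graph(sample)
```

For the real file, pass a handle opened with `open(path, newline="")`. If the file stores each undirected edge in both directions, the row count will be about twice the number of distinct edges.
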