cBioPortal Study Exploration Reference

Major Study Collections

TCGA (The Cancer Genome Atlas)

| Study ID | Cancer Type | Samples (approx.) |
| --- | --- | --- |
| brca_tcga | Breast Cancer | ~1,000 |
| luad_tcga | Lung Adenocarcinoma | ~500 |
| lusc_tcga | Lung Squamous Cell Carcinoma | ~500 |
| coadread_tcga | Colorectal Cancer | ~600 |
| gbm_tcga | Glioblastoma | ~600 |
| prad_tcga | Prostate Cancer | ~500 |
| skcm_tcga | Skin Cutaneous Melanoma | ~450 |
| blca_tcga | Bladder Urothelial Carcinoma | ~400 |
| hnsc_tcga | Head and Neck Squamous | ~500 |
| lihc_tcga | Liver Hepatocellular Carcinoma | ~370 |
| stad_tcga | Stomach Adenocarcinoma | ~440 |
| ucec_tcga | Uterine Endometrial Carcinoma | ~550 |
| ov_tcga | Ovarian Serous Carcinoma | ~580 |
| kirc_tcga | Kidney Renal Clear Cell Carcinoma | ~530 |
| thca_tcga | Thyroid Cancer | ~500 |
| paad_tcga | Pancreatic Adenocarcinoma | ~180 |
| laml_tcga | Acute Myeloid Leukemia | ~200 |
| acc_tcga | Adrenocortical Carcinoma | ~90 |

TCGA Pan-Cancer

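| Study ID | Description |
| --- | --- |
| {cancer}_tcga_pan_can_atlas_2018 (e.g. brca_tcga_pan_can_atlas_2018) | TCGA Pan-Cancer Atlas, published as one study per cancer type (32 studies, ~10K samples in total) |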

MSK-IMPACT (Memorial Sloan Kettering)

| Study ID | Description |
| --- | --- |
| msk_impact_2017 | MSK-IMPACT clinical sequencing |
| mskcc_pd | MSK pediatric solid tumors |

AACR Project GENIE

| Study ID | Description |
| --- | --- |
| genie_14_1_public | GENIE v14.1 (multi-center clinical sequencing) |
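
The study IDs and approximate sample counts above can be checked against a live portal. The following is a minimal sketch, assuming the public instance at https://www.cbioportal.org/api is reachable and that its `GET /studies` endpoint returns records carrying `studyId` and `allSampleCount` fields; the helper names are hypothetical and only the standard library is used.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public cBioPortal instance; other instances use another base URL.
BASE_URL = "https://www.cbioportal.org/api"


def fetch_studies(keyword=None):
    """Return the study records from GET /studies, optionally filtered by keyword."""
    url = f"{BASE_URL}/studies"
    if keyword:
        url += "?" + urllib.parse.urlencode({"keyword": keyword})
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def summarize_studies(studies):
    """Map studyId -> sample count (None when the count field is absent)."""
    return {s["studyId"]: s.get("allSampleCount") for s in studies}
```

Calling `summarize_studies(fetch_studies("tcga"))` should give a dictionary keyed by study ID, which makes it easy to confirm that an ID from the tables exists on the instance being queried before building further requests.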

Molecular Profile ID Naming Conventions

Molecular profile IDs are structured as {studyId}_{type}:

| Type Suffix | Alteration Type |
| --- | --- |
| _mutations | Somatic mutations (MAF) |
| _gistic | Copy number (GISTIC discrete: -2, -1, 0, 1, 2) |
| _cna | Copy number (discrete calls in many targeted-panel studies; continuous log2 values usually use _log2CNA) |
| _mrna | mRNA expression (z-scores or log2) |
| _rna_seq_v2_mrna | RNA-seq (RSEM) |
| _rna_seq_v2_mrna_median_Zscores | RNA-seq z-scores relative to diploid samples |
| _rppa | RPPA protein expression |
| _rppa_Zscores | RPPA z-scores |
| _sv | Structural variants/fusions |
| _methylation_hm450 | DNA methylation (450K array) |

Example: For brca_tcga:

  • brca_tcga_mutations — mutation data
  • brca_tcga_gistic — CNA data
  • brca_tcga_rna_seq_v2_mrna — RNA-seq expression
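
The naming convention can be captured in a small helper. This is an illustrative sketch only: the function names are hypothetical, the suffix set is copied from the table above, and not every study carries every profile, so an ID built this way should still be checked against the study's actual molecular profiles.

```python
# Suffixes taken from the table above; individual studies may offer only a subset.
KNOWN_SUFFIXES = {
    "mutations",
    "gistic",
    "cna",
    "mrna",
    "rna_seq_v2_mrna",
    "rna_seq_v2_mrna_median_Zscores",
    "rppa",
    "rppa_Zscores",
    "sv",
    "methylation_hm450",
}


def molecular_profile_id(study_id, suffix):
    """Build '{studyId}_{type}', accepting the suffix with or without a leading underscore."""
    suffix = suffix.lstrip("_")
    if suffix not in KNOWN_SUFFIXES:
        raise ValueError(f"unknown profile suffix: {suffix!r}")
    return f"{study_id}_{suffix}"


def profile_suffix(profile_id, study_id):
    """Recover the type suffix from a profile ID that belongs to study_id."""
    prefix = study_id + "_"
    if not profile_id.startswith(prefix):
        raise ValueError(f"{profile_id!r} does not belong to study {study_id!r}")
    return profile_id[len(prefix):]
```

Passing the study ID explicitly to `profile_suffix` avoids guessing where the study ID ends, since both study IDs and suffixes contain underscores.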

Sample List Categories

Each study provides several sample lists, each covering a different subset of its samples:

| Category | sampleListId Pattern | Contents |
| --- | --- | --- |
| all_cases_in_study | {studyId}_all | All samples |
| all_cases_with_mutation_data | {studyId}_sequenced | Sequenced samples only |
| all_cases_with_cna_data | {studyId}_cna | Samples with CNA data |
| all_cases_with_mrna_data | {studyId}_mrna | Samples with expression |
| all_cases_with_rppa_data | {studyId}_rppa | Samples with RPPA |
| all_complete_cases | {studyId}_complete | Complete multiplatform data |
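
A sample list ID is typically paired with a molecular profile ID when fetching data. The sketch below assumes the public instance's `POST /molecular-profiles/{molecularProfileId}/mutations/fetch` endpoint with a JSON body containing `sampleListId` and `entrezGeneIds`; the helper names and the default to the `_sequenced` list are illustrative choices, not part of the portal's API.

```python
import json
import urllib.request

# Assumption: the public cBioPortal instance.
BASE_URL = "https://www.cbioportal.org/api"


def build_mutation_request(study_id, entrez_gene_ids, sample_list_suffix="sequenced"):
    """Return (url, body) for a mutation fetch restricted to one sample list."""
    profile_id = f"{study_id}_mutations"
    url = f"{BASE_URL}/molecular-profiles/{profile_id}/mutations/fetch"
    body = {
        "sampleListId": f"{study_id}_{sample_list_suffix}",
        "entrezGeneIds": list(entrez_gene_ids),
    }
    return url, body


def fetch_mutations(study_id, entrez_gene_ids):
    """POST the request built above and return the decoded JSON response."""
    url, body = build_mutation_request(study_id, entrez_gene_ids)
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)
```

Keeping the request construction separate from the network call makes the ID patterns easy to inspect, for example `build_mutation_request("brca_tcga", [7157])` pairs `brca_tcga_mutations` with `brca_tcga_sequenced`.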

Common Gene Entrez IDs

| Gene | Entrez ID | Role |
| --- | --- | --- |
| TP53 | 7157 | Tumor suppressor |
| PIK3CA | 5290 | Oncogene |
| KRAS | 3845 | Oncogene |
| BRCA1 | 672 | Tumor suppressor |
| BRCA2 | 675 | Tumor suppressor |
| PTEN | 5728 | Tumor suppressor |
| EGFR | 1956 | Oncogene |
| MYC | 4609 | Oncogene |
| RB1 | 5925 | Tumor suppressor |
| APC | 324 | Tumor suppressor |
| CDKN2A | 1029 | Tumor suppressor |
| IDH1 | 3417 | Oncogene (mutant) |
| BRAF | 673 | Oncogene |
| CDH1 | 999 | Tumor suppressor |
| VHL | 7428 | Tumor suppressor |
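
Because many API filters take Entrez IDs rather than gene symbols, the table can be kept as a lookup. The snippet below is a small sketch with a hypothetical helper name; it covers only the genes listed above.

```python
# Gene symbol -> Entrez Gene ID, copied from the table above.
ENTREZ_IDS = {
    "TP53": 7157,
    "PIK3CA": 5290,
    "KRAS": 3845,
    "BRCA1": 672,
    "BRCA2": 675,
    "PTEN": 5728,
    "EGFR": 1956,
    "MYC": 4609,
    "RB1": 5925,
    "APC": 324,
    "CDKN2A": 1029,
    "IDH1": 3417,
    "BRAF": 673,
    "CDH1": 999,
    "VHL": 7428,
}


def to_entrez_ids(symbols):
    """Translate gene symbols (case-insensitive) to Entrez IDs, failing on unknown genes."""
    missing = [s for s in symbols if s.upper() not in ENTREZ_IDS]
    if missing:
        raise KeyError(f"no Entrez ID recorded for: {', '.join(missing)}")
    return [ENTREZ_IDS[s.upper()] for s in symbols]
```

Raising on unknown symbols rather than silently skipping them keeps a typo in a gene name from producing an incomplete query.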

Mutation Type Classifications

| mutationType | Description |
| --- | --- |
| Missense_Mutation | Single amino acid substitution |
| Nonsense_Mutation | Premature stop codon |
| Frame_Shift_Del | Frameshift deletion |
| Frame_Shift_Ins | Frameshift insertion |
| Splice_Site | Splice site mutation |
| In_Frame_Del | In-frame deletion |
| In_Frame_Ins | In-frame insertion |
| Translation_Start_Site | Start codon mutation |
| Nonstop_Mutation | Loss of the stop codon (read-through) |
| Silent | Synonymous (no amino acid change) |
| 5'Flank | 5' flanking region |
| 3'UTR | 3' untranslated region |
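
When summarizing mutation records, it is often convenient to collapse these types into broader classes. The grouping below is one possible convention and an assumption of this sketch (in particular, treating Splice_Site and Translation_Start_Site as truncating); the function names are hypothetical, and each record is assumed to be a dictionary with a `mutationType` key.

```python
from collections import Counter

# Assumed grouping for illustration; adjust to the convention your analysis uses.
TRUNCATING = {
    "Nonsense_Mutation",
    "Frame_Shift_Del",
    "Frame_Shift_Ins",
    "Splice_Site",
    "Nonstop_Mutation",
    "Translation_Start_Site",
}
INFRAME = {"In_Frame_Del", "In_Frame_Ins"}


def mutation_class(mutation_type):
    """Collapse a mutationType value into missense / truncating / inframe / other."""
    if mutation_type == "Missense_Mutation":
        return "missense"
    if mutation_type in TRUNCATING:
        return "truncating"
    if mutation_type in INFRAME:
        return "inframe"
    return "other"


def count_classes(mutations):
    """Count mutation classes across records that carry a 'mutationType' key."""
    return Counter(mutation_class(m.get("mutationType", "")) for m in mutations)
```

Silent, 5'Flank, and 3'UTR records fall into "other" here, which keeps non-coding and synonymous calls separate from protein-altering ones.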

OncoPrint Color Legend

cBioPortal uses consistent colors in OncoPrint:

  • Red: Amplification
  • Blue (dark): Deep deletion
  • Green: Missense mutation
  • Black: Truncating mutation
  • Purple: Fusion
  • Red outline: mRNA high (upregulation)
  • Blue outline: mRNA low (downregulation)
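
The Amplification and Deep deletion entries correspond to the values 2 and -2 in a discrete GISTIC profile (the -2, -1, 0, 1, 2 scale listed under the naming conventions). As a rough sketch, the mapping can be written out as follows; the helper names and the sample IDs in the usage note are hypothetical.

```python
# Discrete GISTIC call -> conventional label.
GISTIC_LABELS = {
    -2: "Deep deletion",
    -1: "Shallow deletion",
    0: "Diploid",
    1: "Gain",
    2: "Amplification",
}


def gistic_label(value):
    """Return the label for a discrete GISTIC value (accepts ints or numeric strings)."""
    return GISTIC_LABELS[int(value)]


def high_level_cna_events(values_by_sample):
    """Keep only samples with a deep deletion (-2) or amplification (2)."""
    return {
        sample: gistic_label(value)
        for sample, value in values_by_sample.items()
        if int(value) in (-2, 2)
    }
```

For example, `high_level_cna_events({"S1": 2, "S2": 0, "S3": -2})` keeps only S1 and S3, the two samples that would be drawn as red and blue in the legend above.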