diff --git a/scientific-skills/imaging-data-commons/SKILL.md b/scientific-skills/imaging-data-commons/SKILL.md
new file mode 100644
index 0000000..d66a882
--- /dev/null
+++ b/scientific-skills/imaging-data-commons/SKILL.md
@@ -0,0 +1,1092 @@
+---
+name: imaging-data-commons
+description: Query and download public cancer imaging data from NCI Imaging Data Commons using idc-index. Use for accessing large-scale radiology (CT, MR, PET) and pathology datasets for AI training or research. No authentication required. Query by metadata, visualize in browser, check licenses.
+license: This skill is provided under the MIT License. IDC data itself has individual licensing (mostly CC-BY, some CC-NC) that must be respected when using the data.
+metadata:
+    skill-author: Andrey Fedorov, @fedorov
+---
+
+# Imaging Data Commons
+
+## Overview
+
+Use the `idc-index` Python package to query and download public cancer imaging data from the National Cancer Institute Imaging Data Commons (IDC). No authentication required for data access.
+
+**Primary tool:** `idc-index` ([GitHub](https://github.com/imagingdatacommons/idc-index))
+
+**Check current data scale for the latest version:**
+
+```python
+from idc_index import IDCClient
+client = IDCClient()
+
+# Get IDC data version
+print(client.get_idc_version())
+
+# Get collection count and total series
+stats = client.sql_query("""
+    SELECT
+        COUNT(DISTINCT collection_id) as collections,
+        COUNT(DISTINCT analysis_result_id) as analysis_results,
+        COUNT(DISTINCT PatientID) as patients,
+        COUNT(DISTINCT StudyInstanceUID) as studies,
+        COUNT(DISTINCT SeriesInstanceUID) as series,
+        SUM(instanceCount) as instances,
+        SUM(series_size_MB)/1000000 as size_TB
+    FROM index
+""")
+print(stats)
+```
+
+**Core workflow:**
+1. Query metadata → `client.sql_query()`
+2. Download DICOM files → `client.download_from_selection()`
+3. Visualize in browser → `client.get_viewer_URL(seriesInstanceUID=...)`
+
+## When to Use This Skill
+
+- Finding publicly available radiology (CT, MR, PET) or pathology (slide microscopy) images
+- Selecting image subsets by cancer type, modality, anatomical site, or other metadata
+- Downloading DICOM data from IDC
+- Checking data licenses before use in research or commercial applications
+- Visualizing medical images in a browser without local DICOM viewer software
+
+## IDC Data Model
+
+IDC adds two grouping levels above the standard DICOM hierarchy (Patient → Study → Series → Instance):
+
+- **collection_id**: Groups patients by disease, modality, or research focus (e.g., `tcga_luad`, `nlst`). A patient belongs to exactly one collection.
+- **analysis_result_id**: Identifies derived objects (segmentations, annotations, radiomics features) across one or more original collections.
+
+Use `collection_id` to find original imaging data, which may include annotations deposited along with the images; use `analysis_result_id` to find AI-generated or expert annotations.
+
+**Key identifiers for queries:**
+| Identifier | Scope | Use for |
+|------------|-------|---------|
+| `collection_id` | Dataset grouping | Filtering by project/study |
+| `PatientID` | Patient | Grouping images by patient |
+| `StudyInstanceUID` | DICOM study | Grouping of related series, visualization |
+| `SeriesInstanceUID` | DICOM series | Identifying individual series for downloads and visualization |
+
+## Index Tables
+
+The `idc-index` package provides multiple metadata index tables, accessible via SQL or as pandas DataFrames.
+
+**Important:** Use `client.indices_overview` to get current table descriptions and column schemas. This is the authoritative source for available columns and their types — always query it when writing SQL or exploring data structure.
+
+### Available Tables
+
+| Table | Row Granularity | Loaded | Description |
+|-------|-----------------|--------|-------------|
+| `index` | 1 row = 1 DICOM series | Auto | Primary metadata for all current IDC data |
+| `prior_versions_index` | 1 row = 1 DICOM series | Auto | Series from previous IDC releases; for downloading deprecated data |
+| `collections_index` | 1 row = 1 collection | fetch_index() | Collection-level metadata and descriptions |
+| `analysis_results_index` | 1 row = 1 analysis result collection | fetch_index() | Metadata about derived datasets (annotations, segmentations) |
+| `clinical_index` | 1 row = 1 clinical data column | fetch_index() | Dictionary mapping clinical table columns to collections |
+| `sm_index` | 1 row = 1 slide microscopy series | fetch_index() | Slide Microscopy (pathology) series metadata |
+| `sm_instance_index` | 1 row = 1 slide microscopy instance | fetch_index() | Instance-level (SOPInstanceUID) metadata for slide microscopy |
+
+- **Auto**: loaded automatically when `IDCClient()` is instantiated
+- **fetch_index()**: requires `client.fetch_index("table_name")` to load
+
+### Joining Tables
+
+**Key columns are not explicitly labeled; the following subset can be used in joins.**
+
+| Join Column | Tables | Use Case |
+|-------------|--------|----------|
+| `collection_id` | index, prior_versions_index, collections_index, clinical_index | Link series to collection metadata or clinical data |
+| `SeriesInstanceUID` | index, prior_versions_index, sm_index, sm_instance_index | Link series across tables; connect to slide microscopy details |
+| `StudyInstanceUID` | index, prior_versions_index | Link studies across current and historical data |
+| `PatientID` | index, prior_versions_index | Link patients across current and historical data |
+| `analysis_result_id` | index, analysis_results_index | Link series to analysis result metadata (annotations, segmentations) |
+| `source_DOI` | index, analysis_results_index | Link by publication DOI |
+| `crdc_series_uuid` | index, prior_versions_index | Link by CRDC unique identifier |
+| `Modality` | index, prior_versions_index | Filter by imaging modality |
+
+**Note:** `Subjects`, `Updated`, and `Description` appear in multiple tables but have different meanings (counts vs identifiers, different update contexts).
+
+**Example joins:**
+```python
+from idc_index import IDCClient
+client = IDCClient()
+
+# Join index with collections_index to get cancer types
+client.fetch_index("collections_index")
+result = client.sql_query("""
+    SELECT i.SeriesInstanceUID, i.Modality, c.CancerTypes, c.TumorLocations
+    FROM index i
+    JOIN collections_index c ON i.collection_id = c.collection_id
+    WHERE i.Modality = 'MR'
+    LIMIT 10
+""")
+
+# Join index with sm_index for slide microscopy details
+client.fetch_index("sm_index")
+result = client.sql_query("""
+    SELECT i.collection_id, i.PatientID, s.ObjectiveLensPower, s.min_PixelSpacing_2sf
+    FROM index i
+    JOIN sm_index s ON i.SeriesInstanceUID = s.SeriesInstanceUID
+    LIMIT 10
+""")
+```
+
+### Accessing Index Tables
+
+**Via SQL (recommended for filtering/aggregation):**
+```python
+from idc_index import IDCClient
+client = IDCClient()
+
+# Query the primary index (always available)
+results = client.sql_query("SELECT * FROM index WHERE Modality = 'CT' LIMIT 10")
+
+# Fetch and query additional indices
+client.fetch_index("collections_index")
+collections = client.sql_query("SELECT collection_id, CancerTypes, TumorLocations FROM collections_index")
+
+client.fetch_index("analysis_results_index")
+analysis = client.sql_query("SELECT * FROM analysis_results_index LIMIT 5")
+```
+
+**As pandas DataFrames (direct access):**
+```python
+# Primary index (always available after client initialization)
+df = client.index
+
+# Fetch and access on-demand indices
+client.fetch_index("sm_index")
+sm_df = client.sm_index
+```
+
+### Discovering Table Schemas (Essential for Query Writing)
+
+The `indices_overview` dictionary contains complete schema information for all tables. **Always consult this when writing queries or exploring data structure.**
+
+**DICOM attribute mapping:** Many columns are populated directly from DICOM attributes in the source files. The column description in the schema indicates when a column corresponds to a DICOM attribute (e.g., "DICOM Modality attribute" or references a DICOM tag). This allows leveraging DICOM knowledge when querying — standard DICOM attribute names like `PatientID`, `StudyInstanceUID`, `Modality`, `BodyPartExamined` work as expected.
+
+```python
+from idc_index import IDCClient
+client = IDCClient()
+
+# List all available indices with descriptions
+for name, info in client.indices_overview.items():
+    print(f"\n{name}:")
+    print(f"  Installed: {info['installed']}")
+    print(f"  Description: {info['description']}")
+
+# Get complete schema for a specific index (columns, types, descriptions)
+schema = client.indices_overview["index"]["schema"]
+print(f"\nTable: {schema['table_description']}")
+print("\nColumns:")
+for col in schema['columns']:
+    desc = col.get('description', 'No description')
+    # Description indicates if column is from DICOM attribute
+    print(f"  {col['name']} ({col['type']}): {desc}")
+
+# Find columns that are DICOM attributes (check description for "DICOM" reference)
+dicom_cols = [c['name'] for c in schema['columns'] if 'DICOM' in c.get('description', '').upper()]
+print(f"\nDICOM-sourced columns: {dicom_cols}")
+```
+
+**Alternative: use `get_index_schema()` method:**
+```python
+schema = client.get_index_schema("index")
+# Returns same schema dict: {'table_description': ..., 'columns': [...]}
+```
+
+### Key Columns in Primary `index` Table
+
+Most common columns for queries (use `indices_overview` for complete list and descriptions):
+
+| Column | Type | DICOM | Description |
+|--------|------|-------|-------------|
+| `collection_id` | STRING | No | IDC collection identifier |
+| `analysis_result_id` | STRING | No | Analysis results collection the series belongs to, if applicable |
+| `source_DOI` | STRING | No | DOI linking to dataset details; use for learning more about the content and for attribution (see citations below) |
+| `PatientID` | STRING | Yes | Patient identifier |
+| `StudyInstanceUID` | STRING | Yes | DICOM Study UID |
+| `SeriesInstanceUID` | STRING | Yes | DICOM Series UID — use for downloads/viewing |
+| `Modality` | STRING | Yes | Imaging modality (CT, MR, PT, SM, etc.) |
+| `BodyPartExamined` | STRING | Yes | Anatomical region |
+| `SeriesDescription` | STRING | Yes | Description of the series |
+| `Manufacturer` | STRING | Yes | Equipment manufacturer |
+| `StudyDate` | STRING | Yes | Date study was performed |
+| `PatientSex` | STRING | Yes | Patient sex |
+| `PatientAge` | STRING | Yes | Patient age at time of study |
+| `license_short_name` | STRING | No | License type (CC BY 4.0, CC BY-NC 4.0, etc.) |
+| `series_size_MB` | FLOAT | No | Size of series in megabytes |
+| `instanceCount` | INTEGER | No | Number of DICOM instances in series |
+
+**DICOM = Yes**: Column value extracted from the DICOM attribute with the same name. Refer to the [DICOM standard](https://dicom.nema.org/medical/dicom/current/output/chtml/part06/chapter_6.html) for numeric tag mappings. Use standard DICOM knowledge for expected values and formats.
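+
+As an example of applying DICOM knowledge to these columns: `PatientAge` is a string, and the DICOM Age String (AS) value representation encodes it as three digits followed by a unit letter (`D`, `W`, `M`, or `Y`), e.g. `045Y`. The sketch below converts such strings to approximate years so they can be compared numerically. The helper name `age_string_to_years` and the day/week conversion factors are illustrative assumptions, not part of `idc-index`, and real data may contain empty or nonconforming values, which the sketch maps to `None`.
+
+```python
+import re
+from typing import Optional
+
+# Divisors that convert each DICOM age unit to years. The day and week
+# factors are approximations chosen for this sketch (hypothetical values).
+_UNITS_PER_YEAR = {"D": 365.25, "W": 365.25 / 7, "M": 12.0, "Y": 1.0}
+_AGE_PATTERN = re.compile(r"^(\d{3})([DWMY])$")
+
+
+def age_string_to_years(value: object) -> Optional[float]:
+    """Convert a DICOM Age String such as '045Y' to years; None if unparseable."""
+    if not isinstance(value, str):
+        return None  # covers None and NaN values coming from pandas
+    match = _AGE_PATTERN.match(value.strip().upper())
+    if match is None:
+        return None
+    count, unit = match.groups()
+    return int(count) / _UNITS_PER_YEAR[unit]
+
+
+print(age_string_to_years("045Y"))     # 45.0
+print(age_string_to_years("006M"))     # 0.5
+print(age_string_to_years("unknown"))  # None
+```
+
+With a query result in hand, this could be applied as `results['PatientAge'].map(age_string_to_years)`.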
+
+### Clinical Data Access
+
+```python
+# Fetch clinical index (also downloads clinical data tables)
+client.fetch_index("clinical_index")
+
+# Query clinical index to find available tables and their columns
+tables = client.sql_query("SELECT DISTINCT table_name, column_label FROM clinical_index")
+
+# Load a specific clinical table as DataFrame
+clinical_df = client.get_clinical_table("table_name")
+```
+
+## Data Access Options
+
+| Method | Auth Required | Best For |
+|--------|---------------|----------|
+| `idc-index` | No | Key queries and downloads (recommended) |
+| IDC Portal | No | Interactive exploration, manual selection, browser-based download |
+| BigQuery | Yes (GCP account) | Complex queries, full DICOM metadata |
+| DICOMweb proxy | No | Tool integration via DICOMweb API |
+
+**DICOMweb access**
+
+IDC data is available via a DICOMweb interface (Google Cloud Healthcare API implementation) for integration with PACS systems and DICOMweb-compatible tools.
+
+| Endpoint | Auth | Use Case |
+|----------|------|----------|
+| Public proxy | No | Testing, moderate queries, daily quota |
+| Google Healthcare | Yes (GCP) | Production use, higher quotas |
+
+See `references/dicomweb_guide.md` for endpoint URLs, code examples, supported operations, and implementation details.
+
+## Installation and Setup
+
+**Required (for basic access):**
+```bash
+pip install --upgrade idc-index
+```
+
+**Important:** Each new IDC data release triggers a new version of `idc-index`. Always use the `--upgrade` flag when installing, unless an older version is needed for reproducibility.
+
+**Tested with:** idc-index 0.11.5 (IDC data version v23)
+
+**Optional (for data analysis):**
+```bash
+pip install pandas numpy pydicom
+```
+
+## Core Capabilities
+
+### 1. Data Discovery and Exploration
+
+Discover what imaging collections and data are available in IDC:
+
+```python
+from idc_index import IDCClient
+
+client = IDCClient()
+
+# Get summary statistics from primary index
+query = """
+SELECT
+  collection_id,
+  COUNT(DISTINCT PatientID) as patients,
+  COUNT(DISTINCT SeriesInstanceUID) as series,
+  SUM(series_size_MB) as size_mb
+FROM index
+GROUP BY collection_id
+ORDER BY patients DESC
+"""
+collections_summary = client.sql_query(query)
+
+# For richer collection metadata, use collections_index
+client.fetch_index("collections_index")
+collections_info = client.sql_query("""
+    SELECT collection_id, CancerTypes, TumorLocations, Species, Subjects, SupportingData
+    FROM collections_index
+""")
+
+# For analysis results (annotations, segmentations), use analysis_results_index
+client.fetch_index("analysis_results_index")
+analysis_info = client.sql_query("""
+    SELECT analysis_result_id, analysis_result_title, Subjects, Collections, Modalities
+    FROM analysis_results_index
+""")
+```
+
+**`collections_index`** provides curated metadata per collection: cancer types, tumor locations, species, subject counts, and supporting data types — without needing to aggregate from the primary index.
+
+**`analysis_results_index`** lists derived datasets (AI segmentations, expert annotations, radiomics features) with their source collections and modalities.
+
+### 2. Querying Metadata with SQL
+
+Query the IDC mini-index using SQL to find specific datasets.
+
+**First, explore available values for filter columns:**
+```python
+from idc_index import IDCClient
+
+client = IDCClient()
+
+# Check what Modality values exist
+modalities = client.sql_query("""
+    SELECT Modality, COUNT(*) as series_count
+    FROM index
+    GROUP BY Modality
+    ORDER BY series_count DESC
+""")
+print(modalities)
+
+# Check what BodyPartExamined values exist for MR modality
+body_parts = client.sql_query("""
+    SELECT BodyPartExamined, COUNT(*) as series_count
+    FROM index
+    WHERE Modality = 'MR' AND BodyPartExamined IS NOT NULL
+    GROUP BY BodyPartExamined
+    ORDER BY series_count DESC
+    LIMIT 20
+""")
+print(body_parts)
+```
+
+**Then query with validated filter values:**
+```python
+# Find breast MRI scans (use actual values from exploration above)
+results = client.sql_query("""
+    SELECT
+      collection_id,
+      PatientID,
+      SeriesInstanceUID,
+      Modality,
+      SeriesDescription,
+      license_short_name
+    FROM index
+    WHERE Modality = 'MR'
+      AND BodyPartExamined = 'BREAST'
+    LIMIT 20
+""")
+
+# Access results as pandas DataFrame
+for idx, row in results.iterrows():
+    print(f"Patient: {row['PatientID']}, Series: {row['SeriesInstanceUID']}")
+```
+
+**To filter by cancer type, join with `collections_index`:**
+```python
+client.fetch_index("collections_index")
+results = client.sql_query("""
+    SELECT i.collection_id, i.PatientID, i.SeriesInstanceUID, i.Modality
+    FROM index i
+    JOIN collections_index c ON i.collection_id = c.collection_id
+    WHERE c.CancerTypes LIKE '%Breast%'
+      AND i.Modality = 'MR'
+    LIMIT 20
+""")
+```
+
+**Available metadata fields** (use `client.indices_overview` for complete list):
+- Identifiers: collection_id, PatientID, StudyInstanceUID, SeriesInstanceUID
+- Imaging: Modality, BodyPartExamined, Manufacturer, ManufacturerModelName
+- Clinical: PatientAge, PatientSex, StudyDate
+- Descriptions: StudyDescription, SeriesDescription
+- Licensing: license_short_name
+
+**Note:** Cancer type is in `collections_index.CancerTypes`, not in the primary `index` table.
+
+### 3. Downloading DICOM Files
+
+Download imaging data efficiently from IDC's cloud storage:
+
+**Download entire collection:**
+```python
+from idc_index import IDCClient
+
+client = IDCClient()
+
+# Download small collection (RIDER Pilot ~1GB)
+client.download_from_selection(
+    collection_id="rider_pilot",
+    downloadDir="./data/rider"
+)
+```
+
+**Download specific series:**
+```python
+# First, query for series UIDs
+series_df = client.sql_query("""
+    SELECT SeriesInstanceUID
+    FROM index
+    WHERE Modality = 'CT'
+      AND BodyPartExamined = 'CHEST'
+      AND collection_id = 'nlst'
+    LIMIT 5
+""")
+
+# Download only those series
+client.download_from_selection(
+    seriesInstanceUID=list(series_df['SeriesInstanceUID'].values),
+    downloadDir="./data/lung_ct"
+)
+```
+
+**Custom directory structure:**
+
+Default `dirTemplate`: `%collection_id/%PatientID/%StudyInstanceUID/%Modality_%SeriesInstanceUID`
+
+```python
+# Simplified hierarchy (omit StudyInstanceUID level)
+client.download_from_selection(
+    collection_id="tcga_luad",
+    downloadDir="./data",
+    dirTemplate="%collection_id/%PatientID/%Modality"
+)
+# Results in: ./data/tcga_luad/TCGA-05-4244/CT/
+
+# Flat structure (all files in one directory)
+client.download_from_selection(
+    seriesInstanceUID=list(series_df['SeriesInstanceUID'].values),
+    downloadDir="./data/flat",
+    dirTemplate=""
+)
+# Results in: ./data/flat/*.dcm
+```
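+
+To preview the folder a template would produce before starting a large download, the substitution can be approximated locally. This is a minimal sketch, assuming each `%name` token is simply replaced by the matching column value; `preview_dir_template` is a hypothetical helper rather than the way `idc-index` implements templates, and the UIDs are made-up placeholders.
+
+```python
+import re
+
+
+def preview_dir_template(template: str, values: dict) -> str:
+    """Replace %name tokens in a template with values from a mapping."""
+    # Longest names first so a name that is a prefix of another cannot shadow it
+    names = sorted(values, key=len, reverse=True)
+    pattern = re.compile("%(" + "|".join(re.escape(name) for name in names) + ")")
+    return pattern.sub(lambda match: str(values[match.group(1)]), template)
+
+
+# Placeholder metadata for one series (UIDs are invented for illustration)
+series = {
+    "collection_id": "tcga_luad",
+    "PatientID": "TCGA-05-4244",
+    "StudyInstanceUID": "1.2.3",
+    "Modality": "CT",
+    "SeriesInstanceUID": "1.2.3.4",
+}
+
+default_template = "%collection_id/%PatientID/%StudyInstanceUID/%Modality_%SeriesInstanceUID"
+print(preview_dir_template(default_template, series))
+# tcga_luad/TCGA-05-4244/1.2.3/CT_1.2.3.4
+print(preview_dir_template("%collection_id/%PatientID/%Modality", series))
+# tcga_luad/TCGA-05-4244/CT
+```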
+
+### Command-Line Download
+
+The `idc download` command provides command-line access to download functionality without writing Python code. Available after installing `idc-index`.
+
+**Auto-detects input type:** manifest file path, or identifiers (collection_id, PatientID, StudyInstanceUID, SeriesInstanceUID, crdc_series_uuid).
+
+```bash
+# Download entire collection
+idc download rider_pilot --download-dir ./data
+
+# Download specific series by UID
+idc download "1.3.6.1.4.1.9328.50.1.69736" --download-dir ./data
+
+# Download multiple items (comma-separated)
+idc download "tcga_luad,tcga_lusc" --download-dir ./data
+
+# Download from manifest file (auto-detected)
+idc download manifest.txt --download-dir ./data
+```
+
+**Options:**
+
+| Option | Description |
+|--------|-------------|
+| `--download-dir` | Output directory (default: current directory) |
+| `--dir-template` | Directory hierarchy template (default: `%collection_id/%PatientID/%StudyInstanceUID/%Modality_%SeriesInstanceUID`) |
+| `--log-level` | Verbosity: debug, info, warning, error, critical |
+
+**Manifest files:**
+
+Manifest files contain S3 URLs (one per line) and can be:
+- Exported from the IDC Portal after cohort selection
+- Shared by collaborators for reproducible data access
+- Generated programmatically from query results
+
+Format (one S3 URL per line):
+```
+s3://idc-open-data/cb09464a-c5cc-4428-9339-d7fa87cfe837/*
+s3://idc-open-data/88f3990d-bdef-49cd-9b2b-4787767240f2/*
+```
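+
+Because a manifest is plain text, it is easy to sanity-check before handing it to `idc download`. The following sketch assumes every non-empty line should look like the examples above (an `s3://` URL ending in `/*`); `find_invalid_manifest_lines` is a hypothetical helper, and the check is deliberately loose rather than a specification of what `idc download` accepts.
+
+```python
+def find_invalid_manifest_lines(lines):
+    """Return (line_number, text) pairs for lines that do not look like series URLs."""
+    problems = []
+    for number, raw in enumerate(lines, start=1):
+        text = raw.strip()
+        if not text:
+            continue  # ignore blank lines
+        if not (text.startswith("s3://") and text.endswith("/*")):
+            problems.append((number, text))
+    return problems
+
+
+sample = [
+    "s3://idc-open-data/cb09464a-c5cc-4428-9339-d7fa87cfe837/*",
+    "",
+    "idc-open-data/88f3990d-bdef-49cd-9b2b-4787767240f2",  # missing scheme and /*
+]
+print(find_invalid_manifest_lines(sample))
+# [(3, 'idc-open-data/88f3990d-bdef-49cd-9b2b-4787767240f2')]
+```
+
+To check a file on disk, pass the open file object, which yields one line at a time.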
+
+**Example: Generate manifest from Python query:**
+
+```python
+from idc_index import IDCClient
+
+client = IDCClient()
+
+# Query for series URLs
+results = client.sql_query("""
+    SELECT series_aws_url
+    FROM index
+    WHERE collection_id = 'rider_pilot' AND Modality = 'CT'
+""")
+
+# Save as manifest file
+with open('ct_manifest.txt', 'w') as f:
+    for url in results['series_aws_url']:
+        f.write(url + '\n')
+```
+
+Then download:
+```bash
+idc download ct_manifest.txt --download-dir ./ct_data
+```
+
+### 4. Visualizing IDC Images
+
+View DICOM data in browser without downloading:
+
+```python
+from idc_index import IDCClient
+import webbrowser
+
+client = IDCClient()
+
+# First query to get valid UIDs
+results = client.sql_query("""
+    SELECT SeriesInstanceUID, StudyInstanceUID
+    FROM index
+    WHERE collection_id = 'rider_pilot' AND Modality = 'CT'
+    LIMIT 1
+""")
+
+# View single series
+viewer_url = client.get_viewer_URL(seriesInstanceUID=results.iloc[0]['SeriesInstanceUID'])
+webbrowser.open(viewer_url)
+
+# View all series in a study (useful for multi-series exams like MRI protocols)
+viewer_url = client.get_viewer_URL(studyInstanceUID=results.iloc[0]['StudyInstanceUID'])
+webbrowser.open(viewer_url)
+```
+
+The method automatically selects OHIF v3 for radiology or SLIM for slide microscopy. Viewing by study is useful when a DICOM Study contains multiple Series (e.g., T1, T2, DWI sequences from a single MRI session).
+
+### 5. Understanding and Checking Licenses
+
+Check data licensing before use (critical for commercial applications):
+
+```python
+from idc_index import IDCClient
+
+client = IDCClient()
+
+# Check licenses for all collections
+query = """
+SELECT
+  collection_id,
+  license_short_name,
+  COUNT(DISTINCT SeriesInstanceUID) as series_count
+FROM index
+GROUP BY collection_id, license_short_name
+ORDER BY collection_id
+"""
+
+licenses = client.sql_query(query)
+print(licenses)
+```
+
+**License types in IDC:**
+- **CC BY 4.0** / **CC BY 3.0** (~97% of data) - Allows commercial use with attribution
+- **CC BY-NC 4.0** / **CC BY-NC 3.0** (~3% of data) - Non-commercial use only
+- **Custom licenses** (rare) - Some collections have specific terms (e.g., NLM Terms and Conditions)
+
+**Important:** Always check the license before using IDC data in publications or commercial applications. The license of each series is recorded in the `license_short_name` column of the index.
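+
+The commercial-use rule implied by the license list above can be written as a small predicate. This is a sketch under the assumption that `license_short_name` values follow the `CC BY ...` / `CC BY-NC ...` naming shown in that list; `allows_commercial_use` is a hypothetical helper, and it returns `None` for anything it does not recognize (including custom licenses) so those cases can be reviewed by hand.
+
+```python
+from typing import Optional
+
+
+def allows_commercial_use(license_short_name: str) -> Optional[bool]:
+    """True for CC BY, False for CC BY-NC, None when manual review is needed."""
+    name = license_short_name.strip().upper()
+    if not name.startswith("CC BY"):
+        return None  # custom or unrecognized license terms
+    return "-NC" not in name
+
+
+for name in ["CC BY 4.0", "CC BY-NC 3.0", "NLM Terms and Conditions"]:
+    print(name, "->", allows_commercial_use(name))
+# CC BY 4.0 -> True
+# CC BY-NC 3.0 -> False
+# NLM Terms and Conditions -> None
+```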
+
+### Generating Citations for Attribution
+
+The `source_DOI` column contains DOIs linking to publications describing how the data was generated. To satisfy attribution requirements, use `citations_from_selection()` to generate properly formatted citations:
+
+```python
+from idc_index import IDCClient
+
+client = IDCClient()
+
+# Get citations for a collection (APA format by default)
+citations = client.citations_from_selection(collection_id="rider_pilot")
+for citation in citations:
+    print(citation)
+
+# Get citations for specific series
+results = client.sql_query("""
+    SELECT SeriesInstanceUID FROM index
+    WHERE collection_id = 'tcga_luad' LIMIT 5
+""")
+citations = client.citations_from_selection(
+    seriesInstanceUID=list(results['SeriesInstanceUID'].values)
+)
+
+# Alternative format: BibTeX (for LaTeX documents)
+bibtex_citations = client.citations_from_selection(
+    collection_id="tcga_luad",
+    citation_format=IDCClient.CITATION_FORMAT_BIBTEX
+)
+```
+
+**Parameters:**
+- `collection_id`: Filter by collection(s)
+- `patientId`: Filter by patient ID(s)
+- `studyInstanceUID`: Filter by study UID(s)
+- `seriesInstanceUID`: Filter by series UID(s)
+- `citation_format`: Use `IDCClient.CITATION_FORMAT_*` constants:
+  - `CITATION_FORMAT_APA` (default) - APA style
+  - `CITATION_FORMAT_BIBTEX` - BibTeX for LaTeX
+  - `CITATION_FORMAT_JSON` - CSL JSON
+  - `CITATION_FORMAT_TURTLE` - RDF Turtle
+
+**Best practice:** When publishing results using IDC data, include the generated citations to properly attribute the data sources and satisfy license requirements.
+
+### 6. Batch Processing and Filtering
+
+Process large datasets efficiently with filtering:
+
+```python
+from idc_index import IDCClient
+import pandas as pd
+
+client = IDCClient()
+
+# Find chest CT scans from GE scanners
+query = """
+SELECT
+  SeriesInstanceUID,
+  PatientID,
+  collection_id,
+  ManufacturerModelName
+FROM index
+WHERE Modality = 'CT'
+  AND BodyPartExamined = 'CHEST'
+  AND Manufacturer = 'GE MEDICAL SYSTEMS'
+  AND license_short_name = 'CC BY 4.0'
+LIMIT 100
+"""
+
+results = client.sql_query(query)
+
+# Save manifest for later
+results.to_csv('lung_ct_manifest.csv', index=False)
+
+# Download in batches to avoid timeout
+batch_size = 10
+for i in range(0, len(results), batch_size):
+    batch = results.iloc[i:i+batch_size]
+    client.download_from_selection(
+        seriesInstanceUID=list(batch['SeriesInstanceUID'].values),
+        downloadDir=f"./data/batch_{i//batch_size}"
+    )
+```
+
+### 7. Advanced Queries with BigQuery
+
+For queries requiring full DICOM metadata, complex JOINs, or clinical data tables, use Google BigQuery. Requires a GCP account with billing enabled.
+
+**Quick reference:**
+- Dataset: `bigquery-public-data.idc_current.*`
+- Main table: `dicom_all` (combined metadata)
+- Full metadata: `dicom_metadata` (all DICOM tags)
+
+See `references/bigquery_guide.md` for setup, table schemas, query patterns, and cost optimization.
+
+### 8. Tool Selection Guide
+
+| Task | Tool | Reference |
+|------|------|-----------|
+| Programmatic queries & downloads | `idc-index` | This document |
+| Interactive exploration | IDC Portal | https://portal.imaging.datacommons.cancer.gov/ |
+| Complex metadata queries | BigQuery | `references/bigquery_guide.md` |
+| 3D visualization & analysis | SlicerIDCBrowser | https://github.com/ImagingDataCommons/SlicerIDCBrowser |
+
+**Default choice:** Use `idc-index` for most tasks (no auth, easy API, batch downloads).
+
+### 9. Integration with Analysis Pipelines
+
+Integrate IDC data into imaging analysis workflows:
+
+**Read downloaded DICOM files:**
+```python
+import pydicom
+import os
+
+# Read DICOM files from downloaded series
+series_dir = "./data/rider/rider_pilot/RIDER-1007893286/CT_1.3.6.1..."
+
+dicom_files = [os.path.join(series_dir, f) for f in os.listdir(series_dir)
+               if f.endswith('.dcm')]
+
+# Load first image
+ds = pydicom.dcmread(dicom_files[0])
+print(f"Patient ID: {ds.PatientID}")
+print(f"Modality: {ds.Modality}")
+print(f"Image shape: {ds.pixel_array.shape}")
+```
+
+**Build 3D volume from CT series:**
+```python
+import pydicom
+import numpy as np
+from pathlib import Path
+
+def load_ct_series(series_path):
+    """Load CT series as 3D numpy array"""
+    files = sorted(Path(series_path).glob('*.dcm'))
+    slices = [pydicom.dcmread(str(f)) for f in files]
+
+    # Sort by z-position (assumes axial slices with ImagePositionPatient present)
+    slices.sort(key=lambda x: float(x.ImagePositionPatient[2]))
+
+    # Stack into 3D array
+    volume = np.stack([s.pixel_array for s in slices])
+
+    return volume, slices[0]  # Return volume and first slice for metadata
+
+volume, metadata = load_ct_series("./data/lung_ct/series_dir")
+print(f"Volume shape: {volume.shape}")  # (z, y, x)
+```
+
+**Integrate with SimpleITK** (requires `pip install SimpleITK`):
+```python
+import SimpleITK as sitk
+
+# Read DICOM series
+series_path = "./data/ct_series"
+reader = sitk.ImageSeriesReader()
+dicom_names = reader.GetGDCMSeriesFileNames(series_path)
+reader.SetFileNames(dicom_names)
+image = reader.Execute()
+
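+# Apply processing (CurvatureFlow needs a floating-point image, so cast first)
+image_float = sitk.Cast(image, sitk.sitkFloat32)
+smoothed = sitk.CurvatureFlow(image1=image_float, timeStep=0.125, numberOfIterations=5)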
+
+# Save as NIfTI
+sitk.WriteImage(smoothed, "processed_volume.nii.gz")
+```
+
+## Common Use Cases
+
+### Use Case 1: Find and Download Lung CT Scans for Deep Learning
+
+**Objective:** Build training dataset of lung CT scans from NLST collection
+
+**Steps:**
+```python
+from idc_index import IDCClient
+
+client = IDCClient()
+
+# 1. Query for lung CT scans with specific criteria
+query = """
+SELECT
+  PatientID,
+  SeriesInstanceUID,
+  SeriesDescription
+FROM index
+WHERE collection_id = 'nlst'
+  AND Modality = 'CT'
+  AND BodyPartExamined = 'CHEST'
+  AND license_short_name = 'CC BY 4.0'
+ORDER BY PatientID
+LIMIT 100
+"""
+
+results = client.sql_query(query)
+print(f"Found {len(results)} series from {results['PatientID'].nunique()} patients")
+
+# 2. Download data organized by patient
+client.download_from_selection(
+    seriesInstanceUID=list(results['SeriesInstanceUID'].values),
+    downloadDir="./training_data",
+    dirTemplate="%collection_id/%PatientID/%SeriesInstanceUID"
+)
+
+# 3. Save manifest for reproducibility
+results.to_csv('training_manifest.csv', index=False)
+```
+
+### Use Case 2: Query Brain MRI by Manufacturer for Quality Study
+
+**Objective:** Compare image quality across different MRI scanner manufacturers
+
+**Steps:**
+```python
+from idc_index import IDCClient
+import pandas as pd
+
+client = IDCClient()
+
+# Query for brain MRI grouped by manufacturer
+query = """
+SELECT
+  Manufacturer,
+  ManufacturerModelName,
+  COUNT(DISTINCT SeriesInstanceUID) as num_series,
+  COUNT(DISTINCT PatientID) as num_patients
+FROM index
+WHERE Modality = 'MR'
+  AND BodyPartExamined LIKE '%BRAIN%'
+GROUP BY Manufacturer, ManufacturerModelName
+HAVING num_series >= 10
+ORDER BY num_series DESC
+"""
+
+manufacturers = client.sql_query(query)
+print(manufacturers)
+
+# Download sample from each manufacturer for comparison
+for _, row in manufacturers.head(3).iterrows():
+    mfr = row['Manufacturer']
+    model = row['ManufacturerModelName']
+
+    # Double any single quotes so the values are valid SQL string literals
+    mfr_sql = str(mfr).replace("'", "''")
+    model_sql = str(model).replace("'", "''")
+
+    query = f"""
+    SELECT SeriesInstanceUID
+    FROM index
+    WHERE Manufacturer = '{mfr_sql}'
+      AND ManufacturerModelName = '{model_sql}'
+      AND Modality = 'MR'
+      AND BodyPartExamined LIKE '%BRAIN%'
+    LIMIT 5
+    """
+
+    series = client.sql_query(query)
+    client.download_from_selection(
+        seriesInstanceUID=list(series['SeriesInstanceUID'].values),
+        downloadDir=f"./quality_study/{mfr.replace(' ', '_')}"
+    )
+```
+
+### Use Case 3: Visualize Series Without Downloading
+
+**Objective:** Preview imaging data before committing to download
+
+```python
+from idc_index import IDCClient
+import webbrowser
+
+client = IDCClient()
+
+series_list = client.sql_query("""
+    SELECT SeriesInstanceUID, PatientID, SeriesDescription
+    FROM index
+    WHERE collection_id = 'acrin_nsclc_fdg_pet' AND Modality = 'PT'
+    LIMIT 10
+""")
+
+# Preview each in browser
+for _, row in series_list.iterrows():
+    viewer_url = client.get_viewer_URL(seriesInstanceUID=row['SeriesInstanceUID'])
+    print(f"Patient {row['PatientID']}: {row['SeriesDescription']}")
+    print(f"  View at: {viewer_url}")
+    # webbrowser.open(viewer_url)  # Uncomment to open automatically
+```
+
+For additional visualization options, see the [IDC Portal getting started guide](https://learn.canceridc.dev/portal/getting-started) or [SlicerIDCBrowser](https://github.com/ImagingDataCommons/SlicerIDCBrowser) for 3D Slicer integration.
+
+### Use Case 4: License-Aware Batch Download for Commercial Use
+
+**Objective:** Download only CC-BY licensed data suitable for commercial applications
+
+**Steps:**
+```python
+from idc_index import IDCClient
+
+client = IDCClient()
+
+# Query ONLY for CC BY licensed data (allows commercial use with attribution)
+query = """
+SELECT
+  SeriesInstanceUID,
+  collection_id,
+  PatientID,
+  Modality
+FROM index
+WHERE license_short_name LIKE 'CC BY%'
+  AND license_short_name NOT LIKE '%NC%'
+  AND Modality IN ('CT', 'MR')
+  AND BodyPartExamined IN ('CHEST', 'BRAIN', 'ABDOMEN')
+LIMIT 200
+"""
+
+cc_by_data = client.sql_query(query)
+
+print(f"Found {len(cc_by_data)} CC BY licensed series")
+print(f"Collections: {cc_by_data['collection_id'].unique()}")
+
+# Download the license-filtered series
+client.download_from_selection(
+    seriesInstanceUID=list(cc_by_data['SeriesInstanceUID'].values),
+    downloadDir="./commercial_dataset",
+    dirTemplate="%collection_id/%Modality/%PatientID/%SeriesInstanceUID"
+)
+
+# Save license information
+cc_by_data.to_csv('commercial_dataset_manifest_CC-BY_ONLY.csv', index=False)
+```
+
+## Best Practices
+
+- **Check licenses before use** - Always query the `license_short_name` field and respect licensing terms (CC BY vs CC BY-NC)
+- **Generate citations for attribution** - Use `citations_from_selection()` to get properly formatted citations from `source_DOI` values; include these in publications
+- **Start with small queries** - Use `LIMIT` clause when exploring to avoid long downloads and understand data structure
+- **Use mini-index for simple queries** - Only use BigQuery when you need comprehensive metadata or complex JOINs
+- **Organize downloads with dirTemplate** - Use meaningful directory structures like `%collection_id/%PatientID/%Modality`
+- **Cache query results** - Save DataFrames to CSV files to avoid re-querying and ensure reproducibility
+- **Estimate size first** - Check the total size before downloading; some collections are terabytes in size
+- **Save manifests** - Always save query results with Series UIDs for reproducibility and data provenance
+- **Read documentation** - IDC data structure and metadata fields are documented at https://learn.canceridc.dev/
+- **Use IDC forum** - Search existing questions and answers, or ask the IDC maintainers and users directly, at https://discourse.canceridc.dev/
+
+## Troubleshooting
+
+**Issue: `ModuleNotFoundError: No module named 'idc_index'`**
+- **Cause:** idc-index package not installed
+- **Solution:** Install with `pip install --upgrade idc-index`
+
+**Issue: Download fails with connection timeout**
+- **Cause:** Network instability or large download size
+- **Solution:**
+  - Download smaller batches (e.g., 10-20 series at a time)
+  - Check network connection
+  - Use `dirTemplate` to organize downloads by batch
+  - Implement retry logic with delays
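+
+The batching and retry suggestions above can be sketched as follows. This is a minimal illustration, not an `idc-index` feature: `chunked` and `download_in_batches` are hypothetical helpers, and the sketch assumes that a failed download surfaces as a Python exception, which may not hold for every failure mode:
+
+```python
+import time
+
+
+def chunked(items, size):
+    """Yield consecutive slices of `items` holding at most `size` elements."""
+    for start in range(0, len(items), size):
+        yield items[start:start + size]
+
+
+def download_in_batches(client, series_uids, download_dir,
+                        batch_size=10, max_retries=3, delay_s=5.0):
+    """Download series in small batches; return the UIDs that never succeeded."""
+    failed = []
+    for batch in chunked(list(series_uids), batch_size):
+        for attempt in range(1, max_retries + 1):
+            try:
+                client.download_from_selection(
+                    seriesInstanceUID=batch, downloadDir=download_dir
+                )
+                break  # batch done, move on to the next one
+            except Exception as exc:  # assumption: failures surface as exceptions
+                print(f"Batch failed (attempt {attempt}/{max_retries}): {exc}")
+                if attempt == max_retries:
+                    failed.extend(batch)
+                else:
+                    time.sleep(delay_s * attempt)  # linearly increasing delay
+    return failed
+```
+
+A call such as `download_in_batches(client, df["SeriesInstanceUID"], "./data")` returns the UIDs that still failed after all retries, so they can be saved and retried later.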
+
+**Issue: `BigQuery quota exceeded` or billing errors**
+- **Cause:** BigQuery requires a billing-enabled GCP project
+- **Solution:** Use idc-index mini-index for simple queries (no billing required), or see `references/bigquery_guide.md` for cost optimization tips
+
+**Issue: Series UID not found or no data returned**
+- **Cause:** Typo in UID, data not in current IDC version, or wrong field name
+- **Solution:**
+  - Check if data is in current IDC version (some old data may be deprecated)
+  - Use `LIMIT 5` to test query first
+  - Check field names against metadata schema documentation
+
+**Issue: Downloaded DICOM files won't open**
+- **Cause:** Corrupted download or incompatible viewer
+- **Solution:**
+  - Check DICOM object type (Modality and SOPClassUID attributes) - some object types require specialized tools
+  - Verify file integrity (check file sizes)
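+  - Use pydicom to check that the file parses: `pydicom.dcmread(file)` (pass `force=True` only for files that lack the File Meta Information header, since it skips that check)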
+  - Try different DICOM viewer (3D Slicer, Horos, RadiAnt, QuPath)
+  - Re-download the series
+
+## Common SQL Query Patterns
+
+Quick reference for common queries. For detailed examples with context, see the Core Capabilities section above.
+
+### Discover available filter values
+```python
+# What modalities exist?
+client.sql_query("SELECT DISTINCT Modality FROM index")
+
+# What body parts for a specific modality?
+client.sql_query("""
+    SELECT DISTINCT BodyPartExamined, COUNT(*) as n
+    FROM index WHERE Modality = 'CT' AND BodyPartExamined IS NOT NULL
+    GROUP BY BodyPartExamined ORDER BY n DESC
+""")
+
+# What manufacturers for MR?
+client.sql_query("""
+    SELECT DISTINCT Manufacturer, COUNT(*) as n
+    FROM index WHERE Modality = 'MR'
+    GROUP BY Manufacturer ORDER BY n DESC
+""")
+```
+
+### Find annotations and segmentations
+
+**Note:** Not all image-derived objects belong to analysis result collections. Some annotations are deposited alongside original images. Use DICOM Modality or SOPClassUID to find all derived objects regardless of collection type.
+
+```python
+# Find ALL segmentations and structure sets by DICOM Modality
+# SEG = DICOM Segmentation, RTSTRUCT = Radiotherapy Structure Set
+client.sql_query("""
+    SELECT collection_id, Modality, COUNT(*) as series_count
+    FROM index
+    WHERE Modality IN ('SEG', 'RTSTRUCT')
+    GROUP BY collection_id, Modality
+    ORDER BY series_count DESC
+""")
+
+# Find segmentations for a specific collection (includes non-analysis-result items)
+client.sql_query("""
+    SELECT SeriesInstanceUID, SeriesDescription, analysis_result_id
+    FROM index
+    WHERE collection_id = 'tcga_luad' AND Modality = 'SEG'
+""")
+
+# List analysis result collections (curated derived datasets)
+client.fetch_index("analysis_results_index")
+client.sql_query("""
+    SELECT analysis_result_id, analysis_result_title, Collections, Modalities
+    FROM analysis_results_index
+""")
+
+# Find analysis results for a specific source collection
+client.sql_query("""
+    SELECT analysis_result_id, analysis_result_title
+    FROM analysis_results_index
+    WHERE Collections LIKE '%tcga_luad%'
+""")
+```
+
+### Query slide microscopy data
+```python
+# sm_index has detailed metadata; join with index for collection_id
+client.fetch_index("sm_index")
+client.sql_query("""
+    SELECT i.collection_id, COUNT(*) as slides,
+           MIN(s.min_PixelSpacing_2sf) as min_resolution
+    FROM sm_index s
+    JOIN index i ON s.SeriesInstanceUID = i.SeriesInstanceUID
+    GROUP BY i.collection_id
+    ORDER BY slides DESC
+""")
+```
+
+### Estimate download size
+```python
+# Size for specific criteria
+client.sql_query("""
+    SELECT SUM(series_size_MB) as total_mb, COUNT(*) as series_count
+    FROM index
+    WHERE collection_id = 'nlst' AND Modality = 'CT'
+""")
+```
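+
+Once the total size is known, it can be compared with the free space at the download location. A minimal sketch using only the standard library; `fits_on_disk` is a hypothetical helper, the 20% headroom is an arbitrary choice, and the conversion assumes 1 MB = 10^6 bytes:
+
+```python
+import shutil
+
+
+def fits_on_disk(total_mb, path=".", headroom=1.2):
+    """Return True if `total_mb` (times a safety margin) fits in the free space at `path`."""
+    required_bytes = total_mb * 1_000_000 * headroom  # assumption: 1 MB = 10^6 bytes
+    free_bytes = shutil.disk_usage(path).free
+    return required_bytes <= free_bytes
+
+
+# total_mb would normally be read from the `total_mb` column of the query result
+total_mb = 2500.0  # illustrative value
+print(f"Need ~{total_mb / 1000:.1f} GB; fits: {fits_on_disk(total_mb)}")
+```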
+
+### Link to clinical data
+```python
+client.fetch_index("clinical_index")
+
+# Find collections with clinical data and their tables
+client.sql_query("""
+    SELECT collection_id, table_name, COUNT(DISTINCT column_label) as columns
+    FROM clinical_index
+    GROUP BY collection_id, table_name
+    ORDER BY collection_id
+""")
+```
+
+## Related Skills
+
+The following skills complement IDC workflows for downstream analysis and visualization:
+
+### DICOM Processing
+- **pydicom** - Read, write, and manipulate downloaded DICOM files. Use for extracting pixel data, reading metadata, anonymization, and format conversion. Essential for working with IDC radiology data (CT, MR, PET).
+
+### Pathology and Slide Microscopy
+- **histolab** - Lightweight tile extraction and preprocessing for whole slide images. Use for basic slide processing, tissue detection, and dataset preparation from IDC slide microscopy data.
+- **pathml** - Full-featured computational pathology toolkit. Use for advanced WSI analysis including multiplexed imaging, nucleus segmentation, and ML model training on pathology data downloaded from IDC.
+
+### Metadata Visualization
+- **matplotlib** - Low-level plotting for full customization. Use for creating static figures summarizing IDC query results (bar charts of modalities, histograms of series counts, etc.).
+- **seaborn** - Statistical visualization with pandas integration. Use for quick exploration of IDC metadata distributions, relationships between variables, and categorical comparisons with attractive defaults.
+- **plotly** - Interactive visualization. Use when you need hover info, zoom, and pan for exploring IDC metadata, or for creating web-embeddable dashboards of collection statistics.
+
+### Data Exploration
+- **exploratory-data-analysis** - Comprehensive EDA on scientific data files. Use after downloading IDC data to understand file structure, quality, and characteristics before analysis.
+
+## Resources
+
+### Schema Reference (Primary Source)
+
+**Always use `client.indices_overview` for current column schemas.** This ensures accuracy with the installed idc-index version:
+
+```python
+# Get all column names and types for any table
+schema = client.indices_overview["index"]["schema"]
+columns = [(c['name'], c['type'], c.get('description', '')) for c in schema['columns']]
+```
+
+### Reference Documentation
+
+- **bigquery_guide.md** - Advanced BigQuery usage guide for complex metadata queries
+- **dicomweb_guide.md** - DICOMweb endpoint URLs, code examples, and Google Healthcare API implementation details
+- **[indices_reference](https://idc-index.readthedocs.io/en/latest/indices_reference.html)** - External documentation for index tables (may be ahead of the installed version)
+
+### External Links
+
+- **IDC Portal**: https://portal.imaging.datacommons.cancer.gov/explore/
+- **Documentation**: https://learn.canceridc.dev/
+- **Tutorials**: https://github.com/ImagingDataCommons/IDC-Tutorials
+- **User Forum**: https://discourse.canceridc.dev/
+- **idc-index GitHub**: https://github.com/ImagingDataCommons/idc-index
+- **Citation**: Fedorov, A., et al. "National Cancer Institute Imaging Data Commons: Toward Transparency, Reproducibility, and Scalability in Imaging Artificial Intelligence." RadioGraphics 43.12 (2023). https://doi.org/10.1148/rg.230180
diff --git a/scientific-skills/imaging-data-commons/references/bigquery_guide.md b/scientific-skills/imaging-data-commons/references/bigquery_guide.md
new file mode 100644
index 0000000..252a3b7
--- /dev/null
+++ b/scientific-skills/imaging-data-commons/references/bigquery_guide.md
@@ -0,0 +1,289 @@
+# BigQuery Guide for IDC
+
+**Tested with:** IDC data version v23
+
+For most queries and downloads, use `idc-index` (see main SKILL.md). This guide covers BigQuery for advanced use cases requiring full DICOM metadata or complex joins.
+
+## Prerequisites
+
+**Requirements:**
+1. Google account
+2. Google Cloud project with billing enabled (first 1 TB/month free)
+3. `google-cloud-bigquery` Python package or BigQuery console access
+
+**Authentication setup:**
+```bash
+# Install Google Cloud SDK, then:
+gcloud auth application-default login
+```
+
+## When to Use BigQuery
+
+Use BigQuery instead of `idc-index` when you need:
+- Full DICOM metadata (all 4000+ tags, not just the ~50 in idc-index)
+- Complex joins across clinical data tables
+- DICOM sequence attributes (nested structures)
+- Queries on fields not in the idc-index mini-index
+
+## Accessing IDC in BigQuery
+
+### Dataset Structure
+
+All IDC tables are in the `bigquery-public-data` BigQuery project.
+
+**Current version (recommended for exploration):**
+- `bigquery-public-data.idc_current.*`
+- `bigquery-public-data.idc_current_clinical.*`
+
+**Versioned datasets (recommended for reproducibility):**
+
+- `bigquery-public-data.idc_v{IDC version}.*`
+- `bigquery-public-data.idc_v{IDC version}_clinical.*`
+
+Always use versioned datasets for reproducible research!
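+
+To avoid hard-coding `idc_current` in scripts, the table name can be built from a pinned version. A minimal sketch following the naming pattern above; `versioned_table` is a hypothetical helper that accepts the version as `23`, `"23"`, or `"v23"`:
+
+```python
+def versioned_table(version, table, clinical=False):
+    """Build a fully qualified, version-pinned IDC table name."""
+    number = str(version).lower().lstrip("v")
+    if not number.isdigit():
+        raise ValueError(f"Unexpected IDC version: {version!r}")
+    dataset = f"idc_v{number}_clinical" if clinical else f"idc_v{number}"
+    return f"bigquery-public-data.{dataset}.{table}"
+
+
+print(versioned_table("v23", "dicom_all"))
+```
+
+Recording the pinned version next to the query results makes it possible to re-run the same query later against the same data.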
+
+## Key Tables
+
+### dicom_all
+Primary table joining complete DICOM metadata with IDC-specific columns (collection_id, gcs_url, license). Contains all DICOM tags from `dicom_metadata` plus collection and administrative metadata. See [dicom_all.sql](https://github.com/ImagingDataCommons/etl_flow/blob/master/bq/generate_tables_and_views/derived_tables/BQ_Table_Building/derived_data_views/sql/dicom_all.sql) for the exact derivation.
+
+```sql
+SELECT 
+  collection_id,
+  PatientID,
+  StudyInstanceUID, 
+  SeriesInstanceUID,
+  Modality,
+  BodyPartExamined,
+  SeriesDescription,
+  gcs_url,
+  license_short_name
+FROM `bigquery-public-data.idc_current.dicom_all`
+WHERE Modality = 'CT'
+  AND BodyPartExamined = 'CHEST'
+LIMIT 10
+```
+
+### Derived Tables
+
+**segmentations** - DICOM Segmentation objects
+```sql
+SELECT *
+FROM `bigquery-public-data.idc_current.segmentations`
+LIMIT 10
+```
+
+- **measurement_groups** - SR TID1500 measurement groups
+- **qualitative_measurements** - Coded evaluations
+- **quantitative_measurements** - Numeric measurements
+
+### Collection Metadata
+
+**original_collections_metadata** - Collection-level descriptions
+
+```sql
+SELECT
+  collection_id,
+  CancerTypes,
+  TumorLocations,
+  Subjects,
+  src.source_doi,
+  src.ImageTypes,
+  src.license.license_short_name
+FROM `bigquery-public-data.idc_current.original_collections_metadata`,
+UNNEST(Sources) AS src
+WHERE CancerTypes LIKE '%Lung%'
+```
+
+## Common Query Patterns
+
+### Find Collections by Criteria
+
+```sql
+SELECT 
+  collection_id,
+  COUNT(DISTINCT PatientID) as patient_count,
+  COUNT(DISTINCT SeriesInstanceUID) as series_count,
+  ARRAY_AGG(DISTINCT Modality) as modalities
+FROM `bigquery-public-data.idc_current.dicom_all`
+WHERE BodyPartExamined LIKE '%BRAIN%'
+GROUP BY collection_id
+HAVING patient_count > 50
+ORDER BY patient_count DESC
+```
+
+### Get Download URLs
+
+```sql
+SELECT
+  SeriesInstanceUID,
+  gcs_url
+FROM `bigquery-public-data.idc_current.dicom_all`
+WHERE collection_id = 'rider_pilot'
+  AND Modality = 'CT'
+```
+
+### Find Studies with Multiple Modalities
+
+```sql
+SELECT
+  StudyInstanceUID,
+  ARRAY_AGG(DISTINCT Modality) as modalities,
+  COUNT(DISTINCT SeriesInstanceUID) as series_count
+FROM `bigquery-public-data.idc_current.dicom_all`
+GROUP BY StudyInstanceUID
+HAVING ARRAY_LENGTH(ARRAY_AGG(DISTINCT Modality)) > 1
+LIMIT 100
+```
+
+### License Filtering
+
+```sql
+SELECT
+  collection_id,
+  license_short_name,
+  COUNT(*) as instance_count
+FROM `bigquery-public-data.idc_current.dicom_all`
+WHERE license_short_name = 'CC BY 4.0'
+GROUP BY collection_id, license_short_name
+```
+
+### Find Segmentations with Source Images
+
+```sql
+SELECT
+  src.collection_id,
+  seg.SeriesInstanceUID as seg_series,
+  seg.SegmentedPropertyType,
+  src.SeriesInstanceUID as source_series,
+  src.Modality as source_modality
+FROM `bigquery-public-data.idc_current.segmentations` seg
+JOIN `bigquery-public-data.idc_current.dicom_all` src
+  ON seg.segmented_SeriesInstanceUID = src.SeriesInstanceUID
+WHERE src.collection_id = 'qin_prostate_repeatability'
+LIMIT 10
+```
+
+## Using Query Results with idc-index
+
+Combine BigQuery for complex queries with idc-index for downloads (no GCP auth needed for downloads):
+
+```python
+from google.cloud import bigquery
+from idc_index import IDCClient
+
+# Initialize BigQuery client
+# Requires: pip install google-cloud-bigquery
+# Auth: gcloud auth application-default login
+# Project: needed for billing even on public datasets (free tier applies)
+bq_client = bigquery.Client(project="your-gcp-project-id")
+
+# Query for series with specific criteria
+query = """
+SELECT DISTINCT SeriesInstanceUID
+FROM `bigquery-public-data.idc_current.dicom_all`
+WHERE collection_id = 'tcga_luad'
+  AND Modality = 'CT'
+  AND Manufacturer = 'GE MEDICAL SYSTEMS'
+LIMIT 100
+"""
+
+df = bq_client.query(query).to_dataframe()
+print(f"Found {len(df)} GE CT series")
+
+# Download with idc-index (no GCP auth required)
+idc_client = IDCClient()
+idc_client.download_from_selection(
+    seriesInstanceUID=list(df['SeriesInstanceUID'].values),
+    downloadDir="./tcga_luad_ge_ct"
+)
+```
+
+## Cost and Optimization
+
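+**Pricing:** On-demand queries are billed per TiB scanned (about $6.25 per TiB in US regions since July 2023; check the BigQuery pricing page for the current rate), with the first 1 TiB/month free. Most users stay within the free tier.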
+
+**Minimize data scanned:**
+- Select only needed columns (not `SELECT *`)
+- Filter early with `WHERE` clauses
+- Use `LIMIT` when testing
+- Use `dicom_all` instead of `dicom_metadata` when possible (smaller)
+- Preview queries in BQ console (free, shows bytes to scan)
+
+**Check cost before running:**
+```python
+# Dry run: nothing is executed or billed (reuses bq_client and query from the example above)
+query_job = bq_client.query(query, job_config=bigquery.QueryJobConfig(dry_run=True))
+print(f"Query will scan {query_job.total_bytes_processed / 1e9:.2f} GB")
+```
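+
+The dry-run byte count can be turned into a rough cost estimate. A minimal sketch: `estimate_cost_usd` is a hypothetical helper, the rate is passed in by the caller because pricing changes over time, and the conversion assumes 1 TB = 10^12 bytes:
+
+```python
+def estimate_cost_usd(bytes_processed, price_per_tb_usd, free_tb_remaining=0.0):
+    """Estimate on-demand query cost from a dry-run byte count."""
+    tb = bytes_processed / 1e12  # assumption: 1 TB = 10^12 bytes
+    billable_tb = max(tb - free_tb_remaining, 0.0)  # the free tier is used up first
+    return billable_tb * price_per_tb_usd
+
+
+rate = 5.0  # placeholder rate in USD per TB; look up the current value
+print(f"Estimated cost: ${estimate_cost_usd(250e9, rate):.2f}")
+```
+
+In practice the first argument would be `query_job.total_bytes_processed` from the dry run.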
+
+**Use materialized tables:** IDC provides both views (`table_name_view`) and materialized tables (`table_name`). Always use the materialized tables (faster, lower cost).
+
+## Clinical Data
+
+Clinical data is stored in separate datasets with collection-specific tables. Clinical data was first added in IDC v11, and not all collections have it.
+
+**List available clinical tables:**
+```sql
+SELECT table_name
+FROM `bigquery-public-data.idc_current_clinical.INFORMATION_SCHEMA.TABLES`
+```
+
+**Query clinical data for a collection:**
+```sql
+-- Example: TCGA-LUAD clinical data
+SELECT *
+FROM `bigquery-public-data.idc_current_clinical.tcga_luad_clinical`
+LIMIT 10
+```
+
+**Join clinical with imaging data:**
+```sql
+SELECT
+  d.PatientID,
+  d.SeriesInstanceUID,
+  d.Modality,
+  c.age_at_diagnosis,
+  c.pathologic_stage
+FROM `bigquery-public-data.idc_current.dicom_all` d
+JOIN `bigquery-public-data.idc_current_clinical.tcga_luad_clinical` c
+  ON d.PatientID = c.dicom_patient_id
+WHERE d.collection_id = 'tcga_luad'
+  AND d.Modality = 'CT'
+LIMIT 20
+```
+
+**Note:** Clinical table schemas vary by collection. Check column names with `INFORMATION_SCHEMA.COLUMNS` before querying.
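+
+The column check mentioned in the note can be generated per table. A minimal sketch: `clinical_columns_sql` is a hypothetical helper that only builds the query text, which can then be run in the BigQuery console or through a BigQuery client:
+
+```python
+def clinical_columns_sql(table_name, dataset="bigquery-public-data.idc_current_clinical"):
+    """Build a query listing the columns of one clinical table."""
+    # Allow only letters, digits, and underscores in the table name
+    if not table_name.replace("_", "").isalnum():
+        raise ValueError(f"Unexpected table name: {table_name!r}")
+    return (
+        "SELECT column_name, data_type\n"
+        f"FROM `{dataset}.INFORMATION_SCHEMA.COLUMNS`\n"
+        f"WHERE table_name = '{table_name}'\n"
+        "ORDER BY ordinal_position"
+    )
+
+
+print(clinical_columns_sql("tcga_luad_clinical"))
+```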
+
+## Important Notes
+
+- Tables are read-only (public dataset)
+- Schemas can change between IDC versions
+- Use versioned datasets for reproducibility
+- Some DICOM sequences >15 levels deep are not extracted
+- Very large sequences (>1MB) may be truncated
+- Always check data license before use
+
+## Common Errors
+
+**Issue: Billing must be enabled**
+- Cause: BigQuery requires a billing-enabled GCP project
+- Solution: Enable billing in Google Cloud Console or use idc-index mini-index instead
+
+**Issue: Query exceeds resource limits**
+- Cause: Query scans too much data or is too complex
+- Solution: Add more specific WHERE filters, use LIMIT, break into smaller queries
+
+**Issue: Column not found**
+- Cause: Field name typo or not in selected table
+- Solution: Check table schema first with `INFORMATION_SCHEMA.COLUMNS`
+
+**Issue: Permission denied**
+- Cause: Not authenticated to Google Cloud
+- Solution: Run `gcloud auth application-default login` or set GOOGLE_APPLICATION_CREDENTIALS
+
+## Resources
+
+- [Understanding the BigQuery DICOM schema](https://docs.cloud.google.com/healthcare-api/docs/how-tos/dicom-bigquery-schema)
+- [BigQuery Query Syntax](https://docs.cloud.google.com/bigquery/docs/reference/standard-sql/query-syntax)
+- [Kaggle Intro to SQL](https://www.kaggle.com/learn/intro-to-sql)
+- [Sample BigQuery queries of IDC data](https://github.com/ImagingDataCommons/idc-bigquery-cookbook)
diff --git a/scientific-skills/imaging-data-commons/references/dicomweb_guide.md b/scientific-skills/imaging-data-commons/references/dicomweb_guide.md
new file mode 100644
index 0000000..248e80c
--- /dev/null
+++ b/scientific-skills/imaging-data-commons/references/dicomweb_guide.md
@@ -0,0 +1,308 @@
+# DICOMweb Guide for IDC
+
+IDC provides DICOMweb access through Google Cloud Healthcare API DICOM stores. This guide covers the implementation specifics and usage patterns.
+
+## When to Use DICOMweb
+
+Use DICOMweb when you need:
+- Integration with PACS systems or DICOMweb-compatible tools
+- Streaming metadata without downloading full files
+- Building custom viewers or web applications
+- Using existing DICOMweb client libraries (OHIF, dicomweb-client, etc.)
+
+For most use cases, `idc-index` is simpler and recommended. Use DICOMweb when you specifically need the DICOMweb protocol.
+
+## Endpoints
+
+### Public Proxy (No Authentication)
+
+```
+https://proxy.imaging.datacommons.cancer.gov/current/viewer-only-no-downloads-see-tinyurl-dot-com-slash-3j3d9jyp/dicomWeb
+```
+
+- Points to the latest IDC version automatically
+- Daily quota applies (suitable for testing and moderate use)
+- No authentication required
+- Note: "viewer-only-no-downloads" in URL is legacy naming with no functional meaning
+
+### Google Healthcare API (Requires Authentication)
+
+```
+https://healthcare.googleapis.com/v1/projects/nci-idc-data/locations/us-central1/datasets/idc/dicomStores/idc-store-v{VERSION}/dicomWeb
+```
+
+Replace `{VERSION}` with the IDC release number. To find the current version:
+
+```python
+from idc_index import IDCClient
+client = IDCClient()
+print(client.get_idc_version())  # e.g., "v23"; {VERSION} is the number after "v"
+```
+
+The Google Healthcare endpoint requires authentication and provides higher quotas. See [Authentication](#authentication-for-google-healthcare-api) section below.
+
+## Implementation Details
+
+IDC DICOMweb is provided through Google Cloud Healthcare API DICOM stores. The implementation follows DICOM PS3.18 Web Services with specific characteristics documented in the [Google Healthcare DICOM conformance statement](https://docs.cloud.google.com/healthcare-api/docs/dicom).
+
+### Supported Operations
+
+| Service | Description | Supported |
+|---------|-------------|-----------|
+| QIDO-RS | Search for DICOM objects | Yes |
+| WADO-RS | Retrieve DICOM objects and metadata | Yes |
+| STOW-RS | Store DICOM objects | No (IDC is read-only) |
+
+**Not supported:** URI Service, Worklist Service, Non-Patient Instance Service, Capabilities Transactions
+
+### Searchable DICOM Tags (QIDO-RS)
+
+The implementation supports a limited set of searchable tags:
+
+| Level | Searchable Tags |
+|-------|-----------------|
+| Study | StudyInstanceUID, PatientName, PatientID, AccessionNumber, ReferringPhysicianName, StudyDate |
+| Series | All study tags + SeriesInstanceUID, Modality |
+| Instance | All series tags + SOPInstanceUID |
+
+**Important:** Only exact matching is supported, except for:
+- StudyDate: supports range queries
+- PatientName: supports fuzzy matching
+
+### Query Limitations
+
+- Maximum results: 5,000 for studies/series searches; 50,000 for instances
+- Maximum offset: 1,000,000
+- DICOM sequence tags larger than ~1 MB are not returned in metadata (BulkDataURI provided instead)
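+
+Result sets larger than one page can be walked with the standard QIDO-RS `limit` and `offset` parameters. The following is a minimal sketch using only the Python standard library (the same parameters work with `requests`). `fetch_page` and `iter_results` are hypothetical helpers, and the loop assumes the server returns full pages until the last one:
+
+```python
+import json
+import urllib.parse
+import urllib.request
+
+
+def fetch_page(url, offset, limit):
+    """Fetch one page of QIDO-RS results as a list of DICOM JSON objects."""
+    query = urllib.parse.urlencode({"limit": limit, "offset": offset})
+    request = urllib.request.Request(
+        f"{url}?{query}", headers={"Accept": "application/dicom+json"}
+    )
+    with urllib.request.urlopen(request, timeout=60) as response:
+        body = response.read()
+    return json.loads(body) if body else []  # an empty body means no results
+
+
+def iter_results(url, page_size=100, max_items=1000, fetch=fetch_page):
+    """Yield results page by page until a short page or `max_items` is reached."""
+    offset = 0
+    while offset < max_items:
+        page = fetch(url, offset, min(page_size, max_items - offset))
+        yield from page
+        if len(page) < page_size:
+            break  # a short page is taken to be the last one
+        offset += len(page)
+```
+
+For example, `iter_results(f"{base_url}/studies/{study_uid}/series/{series_uid}/instances")` would iterate over the instances of one series, with `base_url` set to one of the endpoints listed above. The `max_items` cap keeps a single loop well within the documented result limits and the proxy quota.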
+
+## Code Examples
+
+All examples use the public proxy endpoint. For authenticated access to Google Healthcare, see the [authentication section](#authentication-for-google-healthcare-api).
+
+### Finding UIDs with idc-index
+
+Use `idc-index` to discover data, then use DICOMweb for metadata access:
+
+```python
+from idc_index import IDCClient
+
+client = IDCClient()
+
+# Find studies of interest
+results = client.sql_query("""
+    SELECT StudyInstanceUID, SeriesInstanceUID, PatientID, Modality
+    FROM index
+    WHERE collection_id = 'tcga_luad' AND Modality = 'CT'
+    LIMIT 5
+""")
+
+# Use these UIDs with DICOMweb
+study_uid = results.iloc[0]['StudyInstanceUID']
+series_uid = results.iloc[0]['SeriesInstanceUID']
+print(f"Study: {study_uid}")
+print(f"Series: {series_uid}")
+```
+
+### QIDO-RS: Search by UID
+
+```python
+import requests
+
+base_url = "https://proxy.imaging.datacommons.cancer.gov/current/viewer-only-no-downloads-see-tinyurl-dot-com-slash-3j3d9jyp/dicomWeb"
+
+# Search for a specific study
+study_uid = "1.3.6.1.4.1.14519.5.2.1.6450.9002.307623500513044641407722230440"
+response = requests.get(
+    f"{base_url}/studies",
+    params={"StudyInstanceUID": study_uid},
+    headers={"Accept": "application/dicom+json"}
+)
+
+if response.status_code == 200:
+    studies = response.json()
+    print(f"Found {len(studies)} matching study record(s)")
+```
+
+### QIDO-RS: List Series in a Study
+
+```python
+import requests
+
+base_url = "https://proxy.imaging.datacommons.cancer.gov/current/viewer-only-no-downloads-see-tinyurl-dot-com-slash-3j3d9jyp/dicomWeb"
+study_uid = "1.3.6.1.4.1.14519.5.2.1.6450.9002.307623500513044641407722230440"
+
+response = requests.get(
+    f"{base_url}/studies/{study_uid}/series",
+    headers={"Accept": "application/dicom+json"}
+)
+
+if response.status_code == 200:
+    series_list = response.json()
+    for series in series_list:
+        # DICOM tags are returned as hex codes
+        series_uid = series.get("0020000E", {}).get("Value", [None])[0]
+        modality = series.get("00080060", {}).get("Value", [None])[0]
+        description = series.get("0008103E", {}).get("Value", [""])[0]
+        print(f"{modality}: {description}")
+```
+
+### QIDO-RS: List Instances in a Series
+
+```python
+import requests
+
+base_url = "https://proxy.imaging.datacommons.cancer.gov/current/viewer-only-no-downloads-see-tinyurl-dot-com-slash-3j3d9jyp/dicomWeb"
+study_uid = "1.3.6.1.4.1.14519.5.2.1.6450.9002.307623500513044641407722230440"
+series_uid = "1.3.6.1.4.1.14519.5.2.1.6450.9002.217441095430480124587725641302"
+
+response = requests.get(
+    f"{base_url}/studies/{study_uid}/series/{series_uid}/instances",
+    params={"limit": 10},
+    headers={"Accept": "application/dicom+json"}
+)
+
+if response.status_code == 200:
+    instances = response.json()
+    print(f"Found {len(instances)} instances")
+    for inst in instances[:3]:
+        sop_uid = inst.get("00080018", {}).get("Value", [None])[0]
+        print(f"  SOPInstanceUID: {sop_uid}")
+```
+
+### WADO-RS: Retrieve Series Metadata
+
+```python
+import requests
+
+base_url = "https://proxy.imaging.datacommons.cancer.gov/current/viewer-only-no-downloads-see-tinyurl-dot-com-slash-3j3d9jyp/dicomWeb"
+study_uid = "1.3.6.1.4.1.14519.5.2.1.6450.9002.307623500513044641407722230440"
+series_uid = "1.3.6.1.4.1.14519.5.2.1.6450.9002.217441095430480124587725641302"
+
+response = requests.get(
+    f"{base_url}/studies/{study_uid}/series/{series_uid}/metadata",
+    headers={"Accept": "application/dicom+json"}
+)
+
+if response.status_code == 200:
+    instances = response.json()
+    print(f"Retrieved metadata for {len(instances)} instances")
+
+    # Extract image dimensions from first instance
+    if instances:
+        inst = instances[0]
+        rows = inst.get("00280010", {}).get("Value", [None])[0]
+        cols = inst.get("00280011", {}).get("Value", [None])[0]
+        print(f"Image dimensions: {rows} x {cols}")
+```
+
+### Combined Workflow: idc-index Discovery + DICOMweb Metadata
+
+```python
+from idc_index import IDCClient
+import requests
+
+# Use idc-index for efficient discovery
+idc = IDCClient()
+results = idc.sql_query("""
+    SELECT StudyInstanceUID, SeriesInstanceUID, Modality, SeriesDescription
+    FROM index
+    WHERE collection_id = 'nlst' AND Modality = 'CT'
+    LIMIT 1
+""")
+
+study_uid = results.iloc[0]['StudyInstanceUID']
+series_uid = results.iloc[0]['SeriesInstanceUID']
+print(f"Found: {results.iloc[0]['SeriesDescription']}")
+
+# Use DICOMweb to stream metadata without downloading files
+base_url = "https://proxy.imaging.datacommons.cancer.gov/current/viewer-only-no-downloads-see-tinyurl-dot-com-slash-3j3d9jyp/dicomWeb"
+
+response = requests.get(
+    f"{base_url}/studies/{study_uid}/series/{series_uid}/metadata",
+    headers={"Accept": "application/dicom+json"}
+)
+
+if response.status_code == 200:
+    metadata = response.json()
+    print(f"Retrieved metadata for {len(metadata)} instances without downloading files")
+```
+
+## Common DICOM Tags Reference
+
+DICOMweb returns tags as hexadecimal codes. Common tags:
+
+| Tag | Name | Description |
+|-----|------|-------------|
+| 00080018 | SOPInstanceUID | Unique instance identifier |
+| 00080020 | StudyDate | Date study was performed |
+| 00080060 | Modality | Imaging modality (CT, MR, PT, etc.) |
+| 0008103E | SeriesDescription | Description of series |
+| 00100020 | PatientID | Patient identifier |
+| 0020000D | StudyInstanceUID | Unique study identifier |
+| 0020000E | SeriesInstanceUID | Unique series identifier |
+| 00280010 | Rows | Image height in pixels |
+| 00280011 | Columns | Image width in pixels |
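+
+The table above can be turned into a small lookup helper so that code reads tags by name. A minimal sketch: `TAGS` and `get_value` are hypothetical names, and the helper returns a default when a tag is absent or has no `Value` entry, which plain `[0]` indexing would not handle:
+
+```python
+# Keyword -> hexadecimal tag, copied from the table above
+TAGS = {
+    "SOPInstanceUID": "00080018",
+    "StudyDate": "00080020",
+    "Modality": "00080060",
+    "SeriesDescription": "0008103E",
+    "PatientID": "00100020",
+    "StudyInstanceUID": "0020000D",
+    "SeriesInstanceUID": "0020000E",
+    "Rows": "00280010",
+    "Columns": "00280011",
+}
+
+
+def get_value(dataset, keyword, default=None):
+    """Return the first value of a tag from a DICOM JSON object, or `default`."""
+    values = dataset.get(TAGS[keyword], {}).get("Value") or []
+    return values[0] if values else default
+
+
+# Hand-written DICOM JSON fragment (illustrative, not a real IDC response)
+series = {"00080060": {"vr": "CS", "Value": ["CT"]}, "0008103E": {"vr": "LO"}}
+print(get_value(series, "Modality"), repr(get_value(series, "SeriesDescription", "")))
+```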
+
+## Authentication for Google Healthcare API
+
+To use the Google Healthcare endpoint with higher quotas:
+
+```python
+from google.auth import default
+from google.auth.transport.requests import Request
+import requests
+
+# Get credentials (requires gcloud auth)
+credentials, project = default()
+credentials.refresh(Request())
+
+# Build authenticated request
+base_url = "https://healthcare.googleapis.com/v1/projects/nci-idc-data/locations/us-central1/datasets/idc/dicomStores/idc-store-v23/dicomWeb"
+
+response = requests.get(
+    f"{base_url}/studies",
+    params={"limit": 5},
+    headers={
+        "Authorization": f"Bearer {credentials.token}",
+        "Accept": "application/dicom+json"
+    }
+)
+```
+
+**Prerequisites:**
+1. Google Cloud SDK installed (`gcloud`)
+2. Authenticated: `gcloud auth application-default login`
+3. Account has access to public Google Cloud datasets
+
+## Troubleshooting
+
+### Issue: 400 Bad Request on search queries
+- **Cause:** Using unsupported search parameters. The implementation only supports specific DICOM tags for filtering.
+- **Solution:** Restrict search parameters to the supported tags listed under Searchable DICOM Tags (for example StudyInstanceUID, SeriesInstanceUID, or Modality at the series level). To filter on other attributes, use `idc-index` to discover UIDs first, then query DICOMweb with those UIDs.
+
+### Issue: 403 Forbidden on Google Healthcare endpoint
+- **Cause:** Missing authentication or insufficient permissions
+- **Solution:** Run `gcloud auth application-default login` and ensure your account has access
+
+### Issue: 429 Too Many Requests
+- **Cause:** Rate limit exceeded
+- **Solution:** Add delays between requests, reduce `limit` values, or use authenticated endpoint for higher quotas
+
+### Issue: 204 No Content for valid UIDs
+- **Cause:** UID may be from an older IDC version not in current data
+- **Solution:** Verify UID exists using `idc-index` query first. The proxy points to the latest IDC version.
+
+### Issue: Large metadata responses slow to parse
+- **Cause:** A series with many instances returns a large JSON response
+- **Solution:** Use `limit` parameter on instance queries, or query specific instances by SOPInstanceUID
+
+### Issue: Response missing expected attributes
+- **Cause:** DICOM sequences larger than ~1 MB are excluded from metadata responses
+- **Solution:** Retrieve the full DICOM instance using WADO-RS instance retrieval if you need all attributes
+
+## Resources
+
+- [Google Healthcare DICOM Conformance Statement](https://docs.cloud.google.com/healthcare-api/docs/dicom)
+- [DICOMweb Standard](https://www.dicomstandard.org/using/dicomweb)
+- [dicomweb-client Python library](https://dicomweb-client.readthedocs.io/)
+- [IDC Documentation](https://learn.canceridc.dev/)