mirror of
https://github.com/K-Dense-AI/claude-scientific-skills.git
synced 2026-01-26 16:58:56 +08:00
Add neuropixels-analysis skill for extracellular electrophysiology
Adds comprehensive toolkit for analyzing Neuropixels high-density neural recordings using SpikeInterface, Allen Institute, and IBL best practices.

Features:
- Data loading from SpikeGLX, Open Ephys, and NWB formats
- Preprocessing pipelines (filtering, phase shift, CAR, bad channel detection)
- Motion/drift estimation and correction
- Spike sorting integration (Kilosort4, SpykingCircus2, Mountainsort5)
- Quality metrics computation (SNR, ISI violations, presence ratio)
- Automated curation using Allen/IBL criteria
- AI-assisted visual curation for uncertain units
- Export to Phy and NWB formats

Supports Neuropixels 1.0 and 2.0 probes.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
358
scientific-skills/neuropixels-analysis/AUTOMATED_CURATION.md
Normal file
@@ -0,0 +1,358 @@
# Automated Curation Reference

Guide to automated spike sorting curation using Bombcell, UnitRefine, and other tools.

## Why Automated Curation?

Manual curation is:

- **Slow**: Hours per recording session
- **Subjective**: Inter-rater variability
- **Non-reproducible**: Hard to standardize

Automated tools provide consistent, reproducible quality classification.

## Available Tools

| Tool | Classification | Language | Integration |
|------|---------------|----------|-------------|
| **Bombcell** | 4-class (single/multi/noise/non-somatic) | Python/MATLAB | SpikeInterface, Phy |
| **UnitRefine** | Machine learning-based | Python | SpikeInterface |
| **SpikeInterface QM** | Threshold-based | Python | Native |
| **UnitMatch** | Cross-session tracking | Python/MATLAB | Kilosort, Bombcell |

## Bombcell

### Overview

Bombcell classifies units into 4 categories:

1. **Single somatic units** - Well-isolated single neurons
2. **Multi-unit activity (MUA)** - Mixed neuronal signals
3. **Noise** - Non-neural artifacts
4. **Non-somatic** - Axonal or dendritic signals

### Installation

```bash
# Python
pip install bombcell

# Or development version
git clone https://github.com/Julie-Fabre/bombcell.git
cd bombcell/py_bombcell
pip install -e .
```

### Basic Usage (Python)

```python
import bombcell as bc

# Load sorted data (Kilosort output)
kilosort_folder = '/path/to/kilosort/output'
raw_data_path = '/path/to/recording.ap.bin'

# Run Bombcell
results = bc.run_bombcell(
    kilosort_folder,
    raw_data_path,
    sample_rate=30000,
    n_channels=384
)

# Get classifications
unit_labels = results['unit_labels']
# 'good' = single unit, 'mua' = multi-unit, 'noise' = noise
```

### Integration with SpikeInterface

```python
import bombcell as bc
import spikeinterface.full as si

# After spike sorting
sorting = si.run_sorter('kilosort4', recording, output_folder='ks4/')

# Create analyzer and compute required extensions
analyzer = si.create_sorting_analyzer(sorting, recording, sparse=True)
analyzer.compute('random_spikes')  # must come first: 'waveforms' depends on it
analyzer.compute('waveforms')
analyzer.compute('templates')
analyzer.compute('spike_amplitudes')

# Export to Phy format (Bombcell can read this)
si.export_to_phy(analyzer, output_folder='phy_export/')

# Run Bombcell on Phy export
results = bc.run_bombcell_phy('phy_export/')
```

### Bombcell Metrics

Bombcell computes specific metrics for classification:

| Metric | Description | Used For |
|--------|-------------|----------|
| `peak_trough_ratio` | Waveform shape | Somatic vs non-somatic |
| `spatial_decay` | Amplitude across channels | Noise detection |
| `refractory_period_violations` | ISI violations | Single vs multi |
| `presence_ratio` | Temporal stability | Unit quality |
| `waveform_duration` | Peak-to-trough time | Cell type |
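
To make two of these metrics concrete, here is a minimal pure-Python sketch of a refractory-period violation fraction and a presence ratio computed from spike times. It is an illustration only, not Bombcell's implementation (which estimates contamination with more elaborate methods). The function names, the 2 ms refractory period, and the bin count are assumptions chosen for the example.

```python
def refractory_violation_fraction(spike_times_s, refractory_period_s=0.002):
    """Fraction of inter-spike intervals shorter than the refractory period.

    Illustrative definition only; the 2 ms default is an assumed value.
    """
    times = sorted(spike_times_s)
    isis = [later - earlier for earlier, later in zip(times, times[1:])]
    if not isis:
        return 0.0
    return sum(isi < refractory_period_s for isi in isis) / len(isis)


def presence_ratio(spike_times_s, duration_s, n_bins=100):
    """Fraction of equal-width time bins that contain at least one spike."""
    bin_width = duration_s / n_bins
    occupied = {
        min(int(t / bin_width), n_bins - 1)  # clamp a spike at t == duration_s
        for t in spike_times_s
        if 0 <= t <= duration_s
    }
    return len(occupied) / n_bins


# Toy spike train (seconds): two intervals are shorter than 2 ms,
# and no spike falls in the third of four 1-second bins.
spike_times = [0.010, 0.011, 0.500, 1.200, 1.2015, 3.900]
violation_fraction = refractory_violation_fraction(spike_times)
presence = presence_ratio(spike_times, duration_s=4.0, n_bins=4)
print(violation_fraction, presence)
```

A high violation fraction suggests spikes from more than one neuron (multi-unit activity), while a low presence ratio suggests the unit drifted away or was only intermittently detected.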

### Custom Thresholds

```python
# Customize classification thresholds
custom_params = {
    'isi_threshold': 0.01,          # ISI violation threshold
    'presence_threshold': 0.9,      # Minimum presence ratio
    'amplitude_threshold': 20,      # Minimum amplitude (μV)
    'spatial_decay_threshold': 40,  # Spatial decay (μm)
}

results = bc.run_bombcell(
    kilosort_folder,
    raw_data_path,
    **custom_params
)
```

## SpikeInterface Auto-Curation

### Threshold-Based Curation

```python
# Compute quality metrics
analyzer.compute('quality_metrics')
qm = analyzer.get_extension('quality_metrics').get_data()

# Define curation function
def auto_curate(qm):
    labels = {}
    for unit_id in qm.index:
        row = qm.loc[unit_id]

        # Classification logic
        if row['snr'] < 2 or row['presence_ratio'] < 0.5:
            labels[unit_id] = 'noise'
        elif row['isi_violations_ratio'] > 0.1:
            labels[unit_id] = 'mua'
        elif (row['snr'] > 5 and
              row['isi_violations_ratio'] < 0.01 and
              row['presence_ratio'] > 0.9):
            labels[unit_id] = 'good'
        else:
            labels[unit_id] = 'unsorted'

    return labels

unit_labels = auto_curate(qm)

# Filter by label
good_unit_ids = [u for u, label in unit_labels.items() if label == 'good']
sorting_curated = sorting.select_units(good_unit_ids)
```

### Using SpikeInterface Curation Module

```python
import numpy as np
from spikeinterface.curation import (
    CurationSorting,
    MergeUnitsSorting,
    SplitUnitSorting
)

# Wrap sorting for curation
curation = CurationSorting(sorting)

# Remove noise units
noise_units = qm[qm['snr'] < 2].index.tolist()
curation.remove_units(noise_units)

# Merge similar units (based on template similarity)
analyzer.compute('template_similarity')
similarity = analyzer.get_extension('template_similarity').get_data()

# Find highly similar pairs. Only the upper triangle is searched so that the
# diagonal (self-similarity) and mirrored (j, i) duplicates are excluded.
threshold = 0.9
pair_indices = np.argwhere(np.triu(similarity, k=1) > threshold)
similar_pairs = [(analyzer.unit_ids[i], analyzer.unit_ids[j])
                 for i, j in pair_indices]
# Merge pairs (careful - requires manual review)

# Get curated sorting
sorting_curated = curation.sorting
```
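
The merge step above is left as a comment. One way to prepare candidates for review is to group pairwise matches into connected sets, so that units A-B and B-C are presented together as one candidate group. The sketch below does this with a small union-find in plain Python; `group_merge_candidates` is a hypothetical helper, not part of SpikeInterface, and it accepts pairs of unit IDs (or indices).

```python
def group_merge_candidates(similar_pairs):
    """Group pairwise matches into connected sets of unit identifiers."""
    parent = {}

    def find(unit):
        parent.setdefault(unit, unit)
        while parent[unit] != unit:
            parent[unit] = parent[parent[unit]]  # path halving
            unit = parent[unit]
        return unit

    # Union the two units of every similar pair
    for unit_a, unit_b in similar_pairs:
        root_a, root_b = find(unit_a), find(unit_b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Collect the members of each connected component
    groups = {}
    for unit in parent:
        groups.setdefault(find(unit), []).append(unit)
    return sorted(sorted(group) for group in groups.values())


# Example: 3-7 and 7-12 chain into one group; 20-21 forms another
candidate_groups = group_merge_candidates([(3, 7), (7, 12), (20, 21)])
print(candidate_groups)
```

Because grouping is transitive, a chain can link two units that are not themselves similar, which is one more reason to inspect each group before merging.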

## UnitMatch: Cross-Session Tracking

Track the same neurons across recording days.

### Installation

```bash
pip install unitmatch
# Or from source
git clone https://github.com/EnnyvanBeest/UnitMatch.git
```

### Usage

```python
from unitmatch import UnitMatch

# After running Bombcell on multiple sessions
session_folders = [
    '/path/to/session1/kilosort/',
    '/path/to/session2/kilosort/',
    '/path/to/session3/kilosort/',
]

# Run UnitMatch
um = UnitMatch(session_folders)
um.run()

# Get matching results
matches = um.get_matches()
# Returns DataFrame with unit IDs matched across sessions

# Assign unique IDs
unique_ids = um.get_unique_ids()
```

### Integration with Workflow

```python
# Typical workflow:
# 1. Spike sort each session
# 2. Run Bombcell for quality control
# 3. Run UnitMatch for cross-session tracking
import bombcell as bc
import spikeinterface.full as si
from unitmatch import UnitMatch

# Session 1
sorting1 = si.run_sorter('kilosort4', rec1, output_folder='session1/ks4/')
# Run Bombcell
labels1 = bc.run_bombcell('session1/ks4/', raw1_path)

# Session 2
sorting2 = si.run_sorter('kilosort4', rec2, output_folder='session2/ks4/')
labels2 = bc.run_bombcell('session2/ks4/', raw2_path)

# Track units across sessions
um = UnitMatch(['session1/ks4/', 'session2/ks4/'])
um.run()  # run the matching before querying results, as in the Usage example
matches = um.get_matches()
```

## Semi-Automated Workflow

Combine automated and manual curation:

```python
# Step 1: Automated classification
analyzer.compute('quality_metrics')
qm = analyzer.get_extension('quality_metrics').get_data()

# Auto-label obvious cases
auto_labels = {}
for unit_id in qm.index:
    row = qm.loc[unit_id]
    if row['snr'] < 1.5:
        auto_labels[unit_id] = 'noise'
    elif row['snr'] > 8 and row['isi_violations_ratio'] < 0.005:
        auto_labels[unit_id] = 'good'
    else:
        auto_labels[unit_id] = 'needs_review'

# Step 2: Export uncertain units for manual review
needs_review = [u for u, label in auto_labels.items() if label == 'needs_review']

# Export only uncertain units to Phy
sorting_review = sorting.select_units(needs_review)
analyzer_review = si.create_sorting_analyzer(sorting_review, recording)
analyzer_review.compute('random_spikes')  # required before 'waveforms'
analyzer_review.compute('waveforms')
analyzer_review.compute('templates')
si.export_to_phy(analyzer_review, output_folder='phy_review/')

# Manual review in Phy: phy template-gui phy_review/params.py

# Step 3: Load manual labels and merge
manual_labels = si.read_phy('phy_review/').get_property('quality')
# Combine auto + manual labels for final result
```
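
The final "combine" step is not spelled out above. A minimal sketch follows, assuming both label sets are plain dictionaries keyed by the same unit IDs (the property array returned by `get_property` would first need to be paired with the reviewed sorting's unit IDs). `combine_labels` and the `'unsorted'` fallback are hypothetical choices made for illustration.

```python
def combine_labels(auto_labels, manual_labels, fallback='unsorted'):
    """Keep confident automatic labels; fill 'needs_review' from manual review."""
    final = {}
    for unit_id, label in auto_labels.items():
        if label == 'needs_review':
            # Units nobody reviewed get the fallback label rather than 'good'
            final[unit_id] = manual_labels.get(unit_id, fallback)
        else:
            final[unit_id] = label
    return final


# Toy example: unit 1 was reviewed by hand, unit 3 was not
example_auto = {0: 'good', 1: 'needs_review', 2: 'noise', 3: 'needs_review'}
example_manual = {1: 'mua'}
final_labels = combine_labels(example_auto, example_manual)
print(final_labels)
```

Falling back to a non-`good` label for unreviewed units keeps the result conservative, in line with the best practices listed later in this guide.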

## Comparison of Methods

| Method | Pros | Cons |
|--------|------|------|
| **Manual (Phy)** | Gold standard, flexible | Slow, subjective |
| **SpikeInterface QM** | Fast, reproducible | Simple thresholds only |
| **Bombcell** | Multi-class, validated | Requires waveform extraction |
| **UnitRefine** | ML-based, learns from data | Needs training data |

## Best Practices

1. **Always visualize** - Don't blindly trust automated results
2. **Document thresholds** - Record exact parameters used
3. **Validate** - Compare automated vs manual on subset
4. **Be conservative** - When in doubt, exclude the unit
5. **Report methods** - Include curation criteria in publications
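
For the validation step, one simple approach is to compute raw agreement and Cohen's kappa between automated and manual labels on the subset of units that received both. The sketch below uses only the standard library; `label_agreement` and the toy labels are illustrative assumptions, not output from any of the tools above.

```python
from collections import Counter


def label_agreement(auto, manual):
    """Return (raw agreement, Cohen's kappa) over units labelled by both."""
    shared = sorted(set(auto) & set(manual))
    if not shared:
        raise ValueError("no units labelled by both methods")
    n = len(shared)

    # Observed agreement: fraction of units with identical labels
    observed = sum(auto[u] == manual[u] for u in shared) / n

    # Chance agreement from the marginal label frequencies
    auto_counts = Counter(auto[u] for u in shared)
    manual_counts = Counter(manual[u] for u in shared)
    expected = sum(auto_counts[c] * manual_counts[c] for c in auto_counts) / n**2

    kappa = 1.0 if expected == 1 else (observed - expected) / (1 - expected)
    return observed, kappa


# Toy labels for six units; the two methods disagree on unit 2 only
auto = {1: 'good', 2: 'good', 3: 'mua', 4: 'noise', 5: 'mua', 6: 'good'}
manual = {1: 'good', 2: 'mua', 3: 'mua', 4: 'noise', 5: 'mua', 6: 'good'}
observed, kappa = label_agreement(auto, manual)
print(f"agreement={observed:.2f}, kappa={kappa:.2f}")
```

Kappa corrects raw agreement for the agreement expected by chance, which matters when one label (often `noise` or `mua`) dominates the dataset.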

## Pipeline Example

```python
import json
from collections import Counter
from pathlib import Path

import spikeinterface.full as si


def curate_sorting(sorting, recording, output_dir):
    """Complete curation pipeline."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)

    # Create analyzer; format='binary_folder' is what makes `folder` persist it
    analyzer = si.create_sorting_analyzer(sorting, recording, sparse=True,
                                          format='binary_folder',
                                          folder=output_dir / 'analyzer')

    # Compute required extensions
    analyzer.compute('random_spikes', max_spikes_per_unit=500)
    analyzer.compute('waveforms')
    analyzer.compute('templates')
    analyzer.compute('noise_levels')
    analyzer.compute('spike_amplitudes')
    analyzer.compute('quality_metrics')

    qm = analyzer.get_extension('quality_metrics').get_data()

    # Auto-classify
    labels = {}
    for unit_id in qm.index:
        row = qm.loc[unit_id]

        if row['snr'] < 2:
            labels[unit_id] = 'noise'
        elif row['isi_violations_ratio'] > 0.1 or row['presence_ratio'] < 0.8:
            labels[unit_id] = 'mua'
        elif (row['snr'] > 5 and
              row['isi_violations_ratio'] < 0.01 and
              row['presence_ratio'] > 0.9 and
              row['amplitude_cutoff'] < 0.1):
            labels[unit_id] = 'good'
        else:
            labels[unit_id] = 'unsorted'

    # Summary
    print("Classification summary:")
    print(Counter(labels.values()))

    # Save labels (string keys keep the file valid JSON for any unit-ID type)
    with open(output_dir / 'unit_labels.json', 'w') as f:
        json.dump({str(u): label for u, label in labels.items()}, f)

    # Return good units
    good_ids = [u for u, label in labels.items() if label == 'good']
    return sorting.select_units(good_ids), labels

# Usage
sorting_curated, labels = curate_sorting(sorting, recording, 'output/')
```

## References

- [Bombcell GitHub](https://github.com/Julie-Fabre/bombcell)
- [UnitMatch GitHub](https://github.com/EnnyvanBeest/UnitMatch)
- [SpikeInterface Curation](https://spikeinterface.readthedocs.io/en/stable/modules/curation.html)
- Fabre et al. (2023) "Bombcell: automated curation and cell classification"
- van Beest et al. (2024) "UnitMatch: tracking neurons across days with high-density probes"