Add integrations

Timothy Kassis
2025-10-20 21:56:27 -07:00
parent ac70437180
commit 07f81f2ff3
29 changed files with 11008 additions and 14 deletions

View File

@@ -7,7 +7,7 @@
   },
   "metadata": {
     "description": "Claude scientific skills from K-Dense Inc",
-    "version": "1.24.0"
+    "version": "1.28.0"
   },
   "plugins": [
     {
@@ -111,7 +111,11 @@
       "source": "./",
       "strict": false,
       "skills": [
-        "./scientific-integrations/benchling-integration"
+        "./scientific-integrations/benchling-integration",
+        "./scientific-integrations/dnanexus-integration",
+        "./scientific-integrations/labarchive-integration",
+        "./scientific-integrations/omero-integration",
+        "./scientific-integrations/opentrons-integration"
       ]
     }
   ]

View File

@@ -2,7 +2,7 @@
 A comprehensive collection of ready-to-use scientific skills for Claude, curated by the K-Dense team. These skills enable Claude to work with specialized scientific libraries and databases across bioinformatics, cheminformatics, machine learning, materials science, and data analysis. Using this set of skills with Claude Code lets you create an 'AI Scientist' on your desktop! If you want substantially more advanced capabilities, compute infrastructure, and an enterprise-ready offering, check out https://k-dense.ai/.
-This repository provides access to **20 scientific databases**, **44 scientific packages**, **1 scientific integration**, and **95 unique workflows** covering a wide range of scientific computing tasks.
+This repository provides access to **20 scientific databases**, **44 scientific packages**, **4 scientific integrations**, and **103 unique workflows** covering a wide range of scientific computing tasks.
 ## Getting Started
@@ -129,26 +129,27 @@ After installing the plugin, you can use the skill by just mentioning it. Additi
 **Laboratory Information Management Systems (LIMS) & R&D Platforms:**
 - **Benchling Integration** - Toolkit for integrating with Benchling's R&D platform, providing programmatic access to laboratory data management including registry entities (DNA sequences, proteins), inventory systems (samples, containers, locations), electronic lab notebooks (entries, protocols), workflows (tasks, automation), and data exports using Python SDK and REST API
+**Cloud Platforms for Genomics & Biomedical Data:**
+- **DNAnexus Integration** - Comprehensive toolkit for working with the DNAnexus cloud platform for genomics and biomedical data analysis. Covers building and deploying apps/applets (Python/Bash), managing data objects (files, records, databases), running analyses and workflows, using the dxpy Python SDK, and configuring app metadata and dependencies (dxapp.json setup, system packages, Docker, assets). Enables processing of FASTQ/BAM/VCF files, bioinformatics pipelines, job execution, workflow orchestration, and platform operations including project management and permissions
+**Laboratory Automation:**
+- **Opentrons Integration** - Toolkit for creating, editing, and debugging Opentrons Python Protocol API v2 protocols for laboratory automation using Flex and OT-2 robots. Enables automated liquid handling, pipetting workflows, hardware module control (thermocycler, temperature, magnetic, heater-shaker, absorbance plate reader), labware management, and complex protocol development for biological and chemical experiments
+**Electronic Lab Notebooks (ELN):**
+- **LabArchives Integration** - Toolkit for interacting with LabArchives Electronic Lab Notebook (ELN) REST API. Provides programmatic access to notebooks (backup, retrieval, management), entries (creation, comments, attachments), user authentication, site reports and analytics, and third-party integrations (Protocols.io, GraphPad Prism, SnapGene, Geneious, Jupyter, REDCap). Includes Python scripts for configuration setup, notebook operations, and entry management. Supports multi-regional API endpoints (US, UK, Australia) and OAuth authentication
+**Microscopy & Bio-image Data:**
+- **OMERO Integration** - Toolkit for interacting with OMERO microscopy data management systems using Python. Provides comprehensive access to microscopy images stored in OMERO servers, including dataset and screening data retrieval, pixel data analysis, annotation and metadata management, regions of interest (ROIs) creation and analysis, batch processing, OMERO.scripts development, and OMERO.tables for structured data storage. Essential for researchers working with high-content screening data, multi-dimensional microscopy datasets, or collaborative image repositories
 ## TODO: Future Scientific Capabilities
 ### Scientific Integrations
-- **LabArchives** - Electronic lab notebook (ELN) integration for research documentation, protocol management, and collaboration
-- **Dotmatics** - Scientific informatics platform integration for data management, inventory, and workflow automation
-- **Thermo Fisher Connect** - Integration with Thermo Fisher cloud platform for instrument data, LIMS, and analytics workflows
 - **PerkinElmer Signals** - Scientific data management and ELN platform integration
 - **CDD Vault** - Collaborative Drug Discovery platform integration for chemical registration and bioassay data
 - **Geneious** - Molecular biology and NGS analysis software integration
 - **SnapGene** - Molecular cloning and DNA visualization platform integration
-- **GraphPad Prism** - Statistics and graphing software integration for publication-quality analysis
 - **Synthego ICE** - CRISPR editing analysis platform integration
-- **OpenTrons** - Laboratory automation platform integration for liquid handling protocols
 - **TeselaGen** - Synthetic biology design and automation platform integration
-- **Strateos** - Cloud laboratory automation platform integration
-- **Jupyter Hub/Lab** - Multi-user scientific computing environment integration
-- **Weights & Biases** - Experiment tracking and ML model monitoring integration
-- **MLflow** - ML lifecycle management platform integration
-- **DVC (Data Version Control)** - Data and ML model versioning integration
-- **Omero** - Bio-image data management platform integration
 - **Galaxy** - Web-based bioinformatics workflow platform integration
 - **Nextflow/nf-core** - Workflow management system integration for reproducible pipelines
 - **Seven Bridges** - Genomics analysis platform and workspace integration

View File

@@ -0,0 +1,383 @@
---
name: dnanexus-integration
description: Comprehensive toolkit for working with the DNAnexus cloud platform for genomics and biomedical data analysis. Use this skill when users need to build apps/applets, manage data (upload/download files, create records, search data objects), run analyses and workflows, use the dxpy Python SDK, or configure app metadata and dependencies. This applies to tasks involving DNAnexus projects, jobs, data objects (files/records/databases), FASTQ/BAM/VCF files on DNAnexus, bioinformatics pipelines, genomics workflows, or any interaction with the DNAnexus API or command-line tools. The skill covers app development (Python/Bash), data operations, job execution, workflow orchestration, and platform configuration including dxapp.json setup and dependency management (system packages, Docker, assets).
---
# DNAnexus Integration
## Overview
DNAnexus is a cloud-based platform for biomedical data analysis, particularly genomics. This skill provides comprehensive guidance for interacting with DNAnexus through:
- Building and deploying apps and applets (Python/Bash)
- Managing data objects (files, records, databases)
- Running analyses and workflows
- Using the dxpy Python SDK
- Configuring app metadata and dependencies
## When to Use This Skill
Use this skill when working with:
- **App Development**: Creating, building, or modifying DNAnexus apps/applets
- **Data Management**: Uploading, downloading, searching, or organizing files and records
- **Job Execution**: Running analyses, monitoring jobs, creating workflows
- **Python SDK**: Writing scripts using dxpy to interact with the platform
- **Configuration**: Setting up dxapp.json, managing dependencies, using Docker
- **Genomics Workflows**: Processing FASTQ, BAM, VCF, or other bioinformatics files
- **Platform Operations**: Managing projects, permissions, or platform resources
## Core Capabilities
The skill is organized into five main areas, each with detailed reference documentation:
### 1. App Development
**Purpose**: Create executable programs (apps/applets) that run on the DNAnexus platform.
**Key Operations**:
- Generate app skeleton with `dx-app-wizard`
- Write Python or Bash apps with proper entry points
- Handle input/output data objects
- Deploy with `dx build` or `dx build --app`
- Test apps on the platform
**Common Use Cases**:
- Bioinformatics pipelines (alignment, variant calling)
- Data processing workflows
- Quality control and filtering
- Format conversion tools
**Reference**: See `references/app-development.md` for:
- Complete app structure and patterns
- Python entry point decorators
- Input/output handling with dxpy
- Development best practices
- Common issues and solutions
### 2. Data Operations
**Purpose**: Manage files, records, and other data objects on the platform.
**Key Operations**:
- Upload/download files with `dxpy.upload_local_file()` and `dxpy.download_dxfile()`
- Create and manage records with metadata
- Search for data objects by name, properties, or type
- Clone data between projects
- Manage project folders and permissions
**Common Use Cases**:
- Uploading sequencing data (FASTQ files)
- Organizing analysis results
- Searching for specific samples or experiments
- Backing up data across projects
- Managing reference genomes and annotations
**Reference**: See `references/data-operations.md` for:
- Complete file and record operations
- Data object lifecycle (open/closed states)
- Search and discovery patterns
- Project management
- Batch operations
### 3. Job Execution
**Purpose**: Run analyses, monitor execution, and orchestrate workflows.
**Key Operations**:
- Launch jobs with `applet.run()` or `app.run()`
- Monitor job status and logs
- Create subjobs for parallel processing
- Build and run multi-step workflows
- Chain jobs with output references
**Common Use Cases**:
- Running genomics analyses on sequencing data
- Parallel processing of multiple samples
- Multi-step analysis pipelines
- Monitoring long-running computations
- Debugging failed jobs
**Reference**: See `references/job-execution.md` for:
- Complete job lifecycle and states
- Workflow creation and orchestration
- Parallel execution patterns
- Job monitoring and debugging
- Resource management
### 4. Python SDK (dxpy)
**Purpose**: Programmatic access to DNAnexus platform through Python.
**Key Operations**:
- Work with data object handlers (DXFile, DXRecord, DXApplet, etc.)
- Use high-level functions for common tasks
- Make direct API calls for advanced operations
- Create links and references between objects
- Search and discover platform resources
**Common Use Cases**:
- Automation scripts for data management
- Custom analysis pipelines
- Batch processing workflows
- Integration with external tools
- Data migration and organization
**Reference**: See `references/python-sdk.md` for:
- Complete dxpy class reference
- High-level utility functions
- API method documentation
- Error handling patterns
- Common code patterns
### 5. Configuration and Dependencies
**Purpose**: Configure app metadata and manage dependencies.
**Key Operations**:
- Write dxapp.json with inputs, outputs, and run specs
- Install system packages (execDepends)
- Bundle custom tools and resources
- Use assets for shared dependencies
- Integrate Docker containers
- Configure instance types and timeouts
**Common Use Cases**:
- Defining app input/output specifications
- Installing bioinformatics tools (samtools, bwa, etc.)
- Managing Python package dependencies
- Using Docker images for complex environments
- Selecting computational resources
**Reference**: See `references/configuration.md` for:
- Complete dxapp.json specification
- Dependency management strategies
- Docker integration patterns
- Regional and resource configuration
- Example configurations
## Quick Start Examples
### Upload and Analyze Data
```python
import dxpy
# Upload input file
input_file = dxpy.upload_local_file("sample.fastq", project="project-xxxx")
# Run analysis
job = dxpy.DXApplet("applet-xxxx").run({
"reads": dxpy.dxlink(input_file.get_id())
})
# Wait for completion
job.wait_on_done()
# Download results
output_id = job.describe()["output"]["aligned_reads"]["$dnanexus_link"]
dxpy.download_dxfile(output_id, "aligned.bam")
```
### Search and Download Files
```python
import dxpy
# Find BAM files from a specific experiment
files = dxpy.find_data_objects(
classname="file",
name="*.bam",
properties={"experiment": "exp001"},
project="project-xxxx"
)
# Download each file
for file_result in files:
file_obj = dxpy.DXFile(file_result["id"])
filename = file_obj.describe()["name"]
dxpy.download_dxfile(file_result["id"], filename)
```
### Create Simple App
```python
# src/my-app.py
import dxpy
import subprocess
@dxpy.entry_point('main')
def main(input_file, quality_threshold=30):
# Download input
dxpy.download_dxfile(input_file["$dnanexus_link"], "input.fastq")
# Process
subprocess.check_call([
"quality_filter",
"--input", "input.fastq",
"--output", "filtered.fastq",
"--threshold", str(quality_threshold)
])
# Upload output
output_file = dxpy.upload_local_file("filtered.fastq")
return {
"filtered_reads": dxpy.dxlink(output_file)
}
dxpy.run()
```
## Workflow Decision Tree
When working with DNAnexus, follow this decision tree:
1. **Need to create a new executable?**
- Yes → Use **App Development** (references/app-development.md)
- No → Continue to step 2
2. **Need to manage files or data?**
- Yes → Use **Data Operations** (references/data-operations.md)
- No → Continue to step 3
3. **Need to run an analysis or workflow?**
- Yes → Use **Job Execution** (references/job-execution.md)
- No → Continue to step 4
4. **Writing Python scripts for automation?**
- Yes → Use **Python SDK** (references/python-sdk.md)
- No → Continue to step 5
5. **Configuring app settings or dependencies?**
- Yes → Use **Configuration** (references/configuration.md)
Often you'll need multiple capabilities together (e.g., app development + configuration, or data operations + job execution).
## Installation and Authentication
### Install dxpy
```bash
pip install dxpy
```
### Login to DNAnexus
```bash
dx login
```
This authenticates your session and sets up access to projects and data.
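For non-interactive environments (CI pipelines, scheduled scripts) where `dx login` is impractical, an API token can be supplied programmatically instead. A minimal sketch; the environment variable name `DX_API_TOKEN` is a placeholder, not a platform convention:
```python
import os
import dxpy

# Read a token from the environment rather than hardcoding it
dxpy.set_security_context({
    "auth_token_type": "Bearer",
    "auth_token": os.environ["DX_API_TOKEN"]  # placeholder variable name
})

# Confirm the credentials work
print(dxpy.api.system_whoami())
```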
### Verify Installation
```bash
dx --version
dx whoami
```
## Common Patterns
### Pattern 1: Batch Processing
Process multiple files with the same analysis:
```python
# Find all FASTQ files
files = dxpy.find_data_objects(
classname="file",
name="*.fastq",
project="project-xxxx"
)
# Launch parallel jobs
jobs = []
for file_result in files:
job = dxpy.DXApplet("applet-xxxx").run({
"input": dxpy.dxlink(file_result["id"])
})
jobs.append(job)
# Wait for all completions
for job in jobs:
job.wait_on_done()
```
### Pattern 2: Multi-Step Pipeline
Chain multiple analyses together:
```python
# Step 1: Quality control
qc_job = qc_applet.run({"reads": input_file})
# Step 2: Alignment (uses QC output)
align_job = align_applet.run({
"reads": qc_job.get_output_ref("filtered_reads")
})
# Step 3: Variant calling (uses alignment output)
variant_job = variant_applet.run({
"bam": align_job.get_output_ref("aligned_bam")
})
```
### Pattern 3: Data Organization
Organize analysis results systematically:
```python
# Create organized folder structure
dxpy.api.project_new_folder(
"project-xxxx",
{"folder": "/experiments/exp001/results", "parents": True}
)
# Upload with metadata
result_file = dxpy.upload_local_file(
"results.txt",
project="project-xxxx",
folder="/experiments/exp001/results",
properties={
"experiment": "exp001",
"sample": "sample1",
"analysis_date": "2025-10-20"
},
tags=["validated", "published"]
)
```
## Best Practices
1. **Error Handling**: Always wrap API calls in try-except blocks (see the sketch after this list)
2. **Resource Management**: Choose appropriate instance types for workloads
3. **Data Organization**: Use consistent folder structures and metadata
4. **Cost Optimization**: Archive old data, use appropriate storage classes
5. **Documentation**: Include clear descriptions in dxapp.json
6. **Testing**: Test apps with various input types before production use
7. **Version Control**: Use semantic versioning for apps
8. **Security**: Never hardcode credentials in source code
9. **Logging**: Include informative log messages for debugging
10. **Cleanup**: Remove temporary files and failed jobs
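As a concrete illustration of practices 1 and 9, a sketch of launching a job with basic error handling and logging (IDs are placeholders):
```python
import dxpy
from dxpy.exceptions import DXAPIError

try:
    job = dxpy.DXApplet("applet-xxxx").run({
        "reads": dxpy.dxlink("file-yyyy")
    })
    print(f"Launched job {job.get_id()}")  # informative log message
    job.wait_on_done()
    print("Job finished:", job.describe()["state"])
except DXAPIError as e:
    # Platform-side failures (bad inputs, missing permissions) surface here
    print(f"DNAnexus API call failed: {e}")
```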
## Resources
This skill includes detailed reference documentation:
### references/
- **app-development.md** - Complete guide to building and deploying apps/applets
- **data-operations.md** - File management, records, search, and project operations
- **job-execution.md** - Running jobs, workflows, monitoring, and parallel processing
- **python-sdk.md** - Comprehensive dxpy library reference with all classes and functions
- **configuration.md** - dxapp.json specification and dependency management
Load these references when you need detailed information about specific operations or when working on complex tasks.
## Getting Help
- Official documentation: https://documentation.dnanexus.com/
- API reference: http://autodoc.dnanexus.com/
- GitHub repository: https://github.com/dnanexus/dx-toolkit
- Support: support@dnanexus.com

View File

@@ -0,0 +1,247 @@
# DNAnexus App Development
## Overview
Apps and applets are executable programs that run on the DNAnexus platform. They can be written in Python or Bash and are deployed with all necessary dependencies and configuration.
## Applets vs Apps
- **Applets**: Data objects that live inside projects. Good for development and testing.
- **Apps**: Versioned, shareable executables that don't live inside projects. Can be published for others to use.
Both are created identically until the final build step. Applets can be converted to apps later.
## Creating an App/Applet
### Using dx-app-wizard
Generate a skeleton app directory structure:
```bash
dx-app-wizard
```
This creates:
- `dxapp.json` - Configuration file
- `src/` - Source code directory
- `resources/` - Bundled dependencies
- `test/` - Test files
### Building and Deploying
Build an applet:
```bash
dx build
```
Build an app:
```bash
dx build --app
```
The build process:
1. Validates dxapp.json configuration
2. Bundles source code and resources
3. Deploys to the platform
4. Returns the applet/app ID
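A typical development loop built on these steps might look like the following sketch (the applet name and input file ID are illustrative):
```bash
# Build (or rebuild) the applet from the current app directory
dx build -f

# Run it against a test input, skip the confirmation prompt, and stream the logs
dx run my-app -i input1=file-yyyy -y --watch
```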
## App Directory Structure
```
my-app/
├── dxapp.json # Metadata and configuration
├── src/
│ └── my-app.py # Main executable (Python)
│ └── my-app.sh # Or Bash script
├── resources/ # Bundled files and dependencies
│ └── tools/
│ └── data/
└── test/ # Test data and scripts
└── test.json
```
## Python App Structure
### Entry Points
Python apps use the `@dxpy.entry_point()` decorator to define functions:
```python
import dxpy
@dxpy.entry_point('main')
def main(input1, input2):
# Process inputs
# Return outputs
return {
"output1": result1,
"output2": result2
}
dxpy.run()
```
### Input/Output Handling
**Inputs**: DNAnexus data objects are represented as dicts containing links:
```python
@dxpy.entry_point('main')
def main(reads_file):
# Convert link to handler
reads_dxfile = dxpy.DXFile(reads_file)
# Download to local filesystem
dxpy.download_dxfile(reads_dxfile.get_id(), "reads.fastq")
# Process file...
```
**Outputs**: Return primitive types directly, convert file outputs to links:
```python
# Upload result file
output_file = dxpy.upload_local_file("output.fastq")
return {
"trimmed_reads": dxpy.dxlink(output_file)
}
```
## Bash App Structure
Bash apps use a simpler shell script approach:
```bash
#!/bin/bash
set -e -x -o pipefail
main() {
# Download inputs
dx download "$reads_file" -o reads.fastq
# Process
process_reads reads.fastq > output.fastq
# Upload outputs
trimmed_reads=$(dx upload output.fastq --brief)
# Set job output
dx-jobutil-add-output trimmed_reads "$trimmed_reads" --class=file
}
```
## Common Development Patterns
### 1. Bioinformatics Pipeline
Download → Process → Upload pattern:
```python
# Download input
dxpy.download_dxfile(input_file_id, "input.fastq")
# Run analysis
subprocess.check_call(["tool", "input.fastq", "output.bam"])
# Upload result
output = dxpy.upload_local_file("output.bam")
return {"aligned_reads": dxpy.dxlink(output)}
```
### 2. Multi-file Processing
```python
# Process multiple inputs
for file_link in input_files:
file_handler = dxpy.DXFile(file_link)
local_path = file_handler.describe()["name"]
dxpy.download_dxfile(file_handler.get_id(), local_path)
# Process each file...
```
### 3. Parallel Processing
Apps can spawn subjobs for parallel execution:
```python
# Create subjobs
subjobs = []
for item in input_list:
subjob = dxpy.new_dxjob(
fn_input={"input": item},
fn_name="process_item"
)
subjobs.append(subjob)
# Collect results
results = [job.get_output_ref("result") for job in subjobs]
```
## Execution Environment
Apps run in isolated Linux VMs (Ubuntu 24.04) with the following, illustrated by the sketch after this list:
- Internet access
- DNAnexus API access
- Temporary scratch space in `/home/dnanexus`
- Input files downloaded to job workspace
- Root access for installing dependencies
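A small sketch of what a job can inspect at runtime; the environment variables shown (`DX_JOB_ID`, `DX_PROJECT_CONTEXT_ID`) are ones the execution environment typically sets, so treat them as an assumption rather than a complete list:
```python
import os

# Identity and context provided to the job (assumed variable names)
print("Job ID:         ", os.environ.get("DX_JOB_ID"))
print("Project context:", os.environ.get("DX_PROJECT_CONTEXT_ID"))

# Jobs start in their scratch space under /home/dnanexus
print("Working dir:    ", os.getcwd())
```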
## Testing Apps
### Local Testing
Test app logic locally before deploying:
```bash
cd my-app
python src/my-app.py
```
### Platform Testing
Run the applet on the platform:
```bash
dx run applet-xxxx -i input1=file-yyyy
```
Monitor job execution:
```bash
dx watch job-zzzz
```
View job logs:
```bash
dx watch job-zzzz --get-streams
```
## Best Practices
1. **Error Handling**: Use try-except blocks and provide informative error messages
2. **Logging**: Print progress and debug information to stdout/stderr
3. **Validation**: Validate inputs before processing
4. **Cleanup**: Remove temporary files when done
5. **Documentation**: Include clear descriptions in dxapp.json
6. **Testing**: Test with various input types and edge cases
7. **Versioning**: Use semantic versioning for apps
## Common Issues
### File Not Found
Ensure files are properly downloaded before accessing:
```python
dxpy.download_dxfile(file_id, local_path)
# Now safe to open local_path
```
### Out of Memory
Specify larger instance type in dxapp.json systemRequirements
### Timeout
Increase timeout in dxapp.json or split into smaller jobs
### Permission Errors
Ensure app has necessary permissions in dxapp.json
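The last three issues are all addressed in dxapp.json; an illustrative fragment combining the relevant settings (values are examples only, see configuration.md for the full specification):
```json
{
  "runSpec": {
    "systemRequirements": {
      "*": {"instanceType": "mem3_ssd1_v2_x8"}
    },
    "timeoutPolicy": {
      "*": {"hours": 12}
    }
  },
  "access": {
    "network": ["*"],
    "allProjects": "VIEW"
  }
}
```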

View File

@@ -0,0 +1,646 @@
# DNAnexus App Configuration and Dependencies
## Overview
This guide covers configuring apps through dxapp.json metadata and managing dependencies including system packages, Python libraries, and Docker containers.
## dxapp.json Structure
The `dxapp.json` file is the configuration file for DNAnexus apps and applets. It defines metadata, inputs, outputs, execution requirements, and dependencies.
### Minimal Example
```json
{
"name": "my-app",
"title": "My Analysis App",
"summary": "Performs analysis on input files",
"dxapi": "1.0.0",
"version": "1.0.0",
"inputSpec": [],
"outputSpec": [],
"runSpec": {
"interpreter": "python3",
"file": "src/my-app.py",
"distribution": "Ubuntu",
"release": "24.04"
}
}
```
## Metadata Fields
### Required Fields
```json
{
"name": "my-app", // Unique identifier (lowercase, numbers, hyphens, underscores)
"title": "My App", // Human-readable name
"summary": "One line description",
"dxapi": "1.0.0" // API version
}
```
### Optional Metadata
```json
{
"version": "1.0.0", // Semantic version (required for apps)
"description": "Extended description...",
"developerNotes": "Implementation notes...",
"categories": [ // For app discovery
"Read Mapping",
"Variation Calling"
],
"details": { // Arbitrary metadata
"contactEmail": "dev@example.com",
"upstreamVersion": "2.1.0",
"citations": ["doi:10.1000/example"],
"changelog": {
"1.0.0": "Initial release"
}
}
}
```
## Input Specification
Define input parameters:
```json
{
"inputSpec": [
{
"name": "reads",
"label": "Input reads",
"class": "file",
"patterns": ["*.fastq", "*.fastq.gz"],
"optional": false,
"help": "FASTQ file containing sequencing reads"
},
{
"name": "quality_threshold",
"label": "Quality threshold",
"class": "int",
"default": 30,
"optional": true,
"help": "Minimum base quality score"
},
{
"name": "reference",
"label": "Reference genome",
"class": "file",
"patterns": ["*.fa", "*.fasta"],
"suggestions": [
{
"name": "Human GRCh38",
"project": "project-xxxx",
"path": "/references/human_g1k_v37.fasta"
}
]
}
]
}
```
### Input Classes
- `file` - File object
- `record` - Record object
- `applet` - Applet reference
- `string` - Text string
- `int` - Integer number
- `float` - Floating point number
- `boolean` - True/false
- `hash` - Key-value mapping
- `array:class` - Array of specified class
### Input Options
- `name` - Parameter name (required)
- `class` - Data type (required)
- `optional` - Whether parameter is optional (default: false)
- `default` - Default value for optional parameters
- `label` - Display name in UI
- `help` - Description text
- `patterns` - File name patterns (for files)
- `suggestions` - Pre-defined reference data
- `choices` - Allowed values (for strings/numbers)
- `group` - UI grouping
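An illustrative fragment exercising a few of the options above (`choices`, `group`, and an `array:file` input); the parameter names and values are examples only:
```json
{
  "inputSpec": [
    {
      "name": "aligner",
      "label": "Aligner",
      "class": "string",
      "choices": ["bwa-mem", "bowtie2"],
      "default": "bwa-mem",
      "optional": true,
      "group": "Advanced options"
    },
    {
      "name": "fastq_files",
      "label": "FASTQ files",
      "class": "array:file",
      "patterns": ["*.fastq.gz"],
      "help": "One or more gzipped FASTQ files"
    }
  ]
}
```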
## Output Specification
Define output parameters:
```json
{
"outputSpec": [
{
"name": "aligned_reads",
"label": "Aligned reads",
"class": "file",
"patterns": ["*.bam"],
"help": "BAM file with aligned reads"
},
{
"name": "mapping_stats",
"label": "Mapping statistics",
"class": "record",
"help": "Record containing alignment statistics"
}
]
}
```
## Run Specification
Define how the app executes:
```json
{
"runSpec": {
"interpreter": "python3", // or "bash"
"file": "src/my-app.py", // Entry point script
"distribution": "Ubuntu",
"release": "24.04",
"version": "0", // Distribution version
"execDepends": [ // System packages
{"name": "samtools"},
{"name": "bwa"}
],
"bundledDepends": [ // Bundled resources
{"name": "scripts.tar.gz", "id": {"$dnanexus_link": "file-xxxx"}}
],
"assetDepends": [ // Asset dependencies
{"name": "asset-name", "id": {"$dnanexus_link": "record-xxxx"}}
],
"systemRequirements": {
"*": {
"instanceType": "mem2_ssd1_v2_x4"
}
},
"headJobOnDemand": true,
"restartableEntryPoints": ["main"]
}
}
```
## System Requirements
### Instance Type Selection
```json
{
"systemRequirements": {
"main": {
"instanceType": "mem2_ssd1_v2_x8"
},
"process": {
"instanceType": "mem3_ssd1_v2_x16"
}
}
}
```
**Common instance types**:
- `mem1_ssd1_v2_x2` - 2 cores, 3.9 GB RAM
- `mem1_ssd1_v2_x4` - 4 cores, 7.8 GB RAM
- `mem2_ssd1_v2_x4` - 4 cores, 15.6 GB RAM
- `mem2_ssd1_v2_x8` - 8 cores, 31.2 GB RAM
- `mem3_ssd1_v2_x8` - 8 cores, 62.5 GB RAM
- `mem3_ssd1_v2_x16` - 16 cores, 125 GB RAM
### Cluster Specifications
For distributed computing:
```json
{
"systemRequirements": {
"main": {
"clusterSpec": {
"type": "spark",
"version": "3.1.2",
"initialInstanceCount": 3,
"instanceType": "mem1_ssd1_v2_x4",
"bootstrapScript": "bootstrap.sh"
}
}
}
}
```
## Regional Options
Deploy apps across regions:
```json
{
"regionalOptions": {
"aws:us-east-1": {
"systemRequirements": {
"*": {"instanceType": "mem2_ssd1_v2_x4"}
},
"assetDepends": [
{"id": "record-xxxx"}
]
},
"azure:westus": {
"systemRequirements": {
"*": {"instanceType": "azure:mem2_ssd1_x4"}
}
}
}
}
```
## Dependency Management
### System Packages (execDepends)
Install Ubuntu packages at runtime:
```json
{
"runSpec": {
"execDepends": [
{"name": "samtools"},
{"name": "bwa"},
{"name": "python3-pip"},
{"name": "r-base", "version": "4.0.0"}
]
}
}
```
Packages are installed using `apt-get` from Ubuntu repositories.
### Python Dependencies
#### Option 1: Install via pip in execDepends
```json
{
"runSpec": {
"execDepends": [
{"name": "python3-pip"}
]
}
}
```
Then in your app script:
```python
import subprocess
subprocess.check_call(["pip", "install", "numpy==1.24.0", "pandas==2.0.0"])
```
#### Option 2: Requirements file
Create `resources/requirements.txt`:
```
numpy==1.24.0
pandas==2.0.0
scikit-learn==1.3.0
```
In your app:
```python
subprocess.check_call(["pip", "install", "-r", "requirements.txt"])
```
### Bundled Dependencies
Include custom tools or libraries in the app:
**File structure**:
```
my-app/
├── dxapp.json
├── src/
│ └── my-app.py
└── resources/
├── tools/
│ └── custom_tool
└── scripts/
└── helper.py
```
Access resources in app:
```python
import subprocess
# At job start, everything under resources/ is unpacked into the root of the
# execution environment, so resources/tools/custom_tool appears at /tools/custom_tool
tool_path = "/tools/custom_tool"
# Run the bundled tool
subprocess.check_call([tool_path, "arg1", "arg2"])
```
### Asset Dependencies
Assets are pre-built bundles of dependencies that can be shared across apps.
#### Using Assets
```json
{
"runSpec": {
"assetDepends": [
{
"name": "bwa-asset",
"id": {"$dnanexus_link": "record-xxxx"}
}
]
}
}
```
Asset contents are fetched and unpacked into the root filesystem of the execution environment at job start, so tools installed by the asset are available at their installed paths (for example, a binary placed under `usr/local/bin` inside the asset appears at `/usr/local/bin`):
```python
import subprocess
# bwa from the asset bundle is on the standard path once the asset is unpacked
subprocess.check_call(["bwa", "index", "reference.fa"])
```
#### Creating Assets
Create asset directory:
```bash
mkdir bwa-asset
cd bwa-asset
# Install software
./configure --prefix=$PWD/usr/local
make && make install
```
Build asset:
```bash
dx build_asset bwa-asset --destination=project-xxxx:/assets/
```
## Docker Integration
### Using Docker Images
```json
{
"runSpec": {
"interpreter": "python3",
"file": "src/my-app.py",
"distribution": "Ubuntu",
"release": "24.04",
"systemRequirements": {
"*": {
"instanceType": "mem2_ssd1_v2_x4"
}
},
"execDepends": [
{"name": "docker.io"}
]
}
}
```
Use Docker in app:
```python
import subprocess
# Pull Docker image
subprocess.check_call(["docker", "pull", "biocontainers/samtools:v1.9"])
# Run command in container
subprocess.check_call([
"docker", "run",
"-v", f"{os.getcwd()}:/data",
"biocontainers/samtools:v1.9",
"samtools", "view", "/data/input.bam"
])
```
### Docker as Base Image
For apps that run entirely in Docker:
```json
{
"runSpec": {
"interpreter": "bash",
"file": "src/wrapper.sh",
"distribution": "Ubuntu",
"release": "24.04",
"execDepends": [
{"name": "docker.io"}
]
}
}
```
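A minimal `src/wrapper.sh` for this pattern might look like the following sketch, reusing the samtools container from the earlier example (input/output names are illustrative):
```bash
#!/bin/bash
set -e -x -o pipefail

main() {
    # Download the job input declared in dxapp.json
    dx download "$input_bam" -o input.bam

    # Run the analysis entirely inside the container, mounting the job workspace
    docker pull biocontainers/samtools:v1.9
    docker run -v "$PWD":/data biocontainers/samtools:v1.9 \
        samtools flagstat /data/input.bam > flagstat.txt

    # Upload the result and register it as the job output
    stats=$(dx upload flagstat.txt --brief)
    dx-jobutil-add-output stats "$stats" --class=file
}
```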
## Access Requirements
Request special permissions:
```json
{
"access": {
"network": ["*"], // Internet access
"project": "CONTRIBUTE", // Project write access
"allProjects": "VIEW", // Read other projects
"developer": true // Advanced permissions
}
}
```
**Network access**:
- `["*"]` - Full internet
- `["github.com", "pypi.org"]` - Specific domains
## Timeout Configuration
```json
{
"runSpec": {
"timeoutPolicy": {
"*": {
"days": 1,
"hours": 12,
"minutes": 30
}
}
}
}
```
## Example: Complete dxapp.json
```json
{
"name": "rna-seq-pipeline",
"title": "RNA-Seq Analysis Pipeline",
"summary": "Aligns RNA-seq reads and quantifies gene expression",
"description": "Comprehensive RNA-seq pipeline using STAR aligner and featureCounts",
"version": "1.0.0",
"dxapi": "1.0.0",
"categories": ["Read Mapping", "RNA-Seq"],
"inputSpec": [
{
"name": "reads",
"label": "FASTQ reads",
"class": "array:file",
"patterns": ["*.fastq.gz", "*.fq.gz"],
"help": "Single-end or paired-end RNA-seq reads"
},
{
"name": "reference_genome",
"label": "Reference genome",
"class": "file",
"patterns": ["*.fa", "*.fasta"],
"suggestions": [
{
"name": "Human GRCh38",
"project": "project-reference",
"path": "/genomes/GRCh38.fa"
}
]
},
{
"name": "gtf_file",
"label": "Gene annotation (GTF)",
"class": "file",
"patterns": ["*.gtf", "*.gtf.gz"]
}
],
"outputSpec": [
{
"name": "aligned_bam",
"label": "Aligned reads (BAM)",
"class": "file",
"patterns": ["*.bam"]
},
{
"name": "counts",
"label": "Gene counts",
"class": "file",
"patterns": ["*.counts.txt"]
},
{
"name": "qc_report",
"label": "QC report",
"class": "file",
"patterns": ["*.html"]
}
],
"runSpec": {
"interpreter": "python3",
"file": "src/rna-seq-pipeline.py",
"distribution": "Ubuntu",
"release": "24.04",
"execDepends": [
{"name": "python3-pip"},
{"name": "samtools"},
{"name": "subread"}
],
"assetDepends": [
{
"name": "star-aligner",
"id": {"$dnanexus_link": "record-star-asset"}
}
],
"systemRequirements": {
"main": {
"instanceType": "mem3_ssd1_v2_x16"
}
},
"timeoutPolicy": {
"*": {"hours": 8}
}
},
"access": {
"network": ["*"]
},
"details": {
"contactEmail": "support@example.com",
"upstreamVersion": "STAR 2.7.10a, Subread 2.0.3",
"citations": ["doi:10.1093/bioinformatics/bts635"]
}
}
```
## Best Practices
1. **Version Management**: Use semantic versioning for apps
2. **Instance Type**: Start with smaller instances, scale up as needed
3. **Dependencies**: Document all dependencies clearly
4. **Error Messages**: Provide helpful error messages for invalid inputs
5. **Testing**: Test with various input types and sizes
6. **Documentation**: Write clear descriptions and help text
7. **Resources**: Bundle frequently-used tools to avoid repeated downloads
8. **Docker**: Use Docker for complex dependency chains
9. **Assets**: Create assets for heavy dependencies shared across apps
10. **Timeouts**: Set reasonable timeouts based on expected runtime
11. **Network Access**: Request only necessary network permissions
12. **Region Support**: Use regionalOptions for multi-region apps
## Common Patterns
### Bioinformatics Tool
```json
{
"inputSpec": [
{"name": "input_file", "class": "file", "patterns": ["*.bam"]},
{"name": "threads", "class": "int", "default": 4, "optional": true}
],
"runSpec": {
"execDepends": [{"name": "tool-name"}],
"systemRequirements": {
"main": {"instanceType": "mem2_ssd1_v2_x8"}
}
}
}
```
### Python Data Analysis
```json
{
"runSpec": {
"interpreter": "python3",
"execDepends": [
{"name": "python3-pip"}
],
"systemRequirements": {
"main": {"instanceType": "mem2_ssd1_v2_x4"}
}
}
}
```
### Docker-based App
```json
{
"runSpec": {
"interpreter": "bash",
"execDepends": [
{"name": "docker.io"}
],
"systemRequirements": {
"main": {"instanceType": "mem2_ssd1_v2_x8"}
}
},
"access": {
"network": ["*"]
}
}
```

View File

@@ -0,0 +1,400 @@
# DNAnexus Data Operations
## Overview
DNAnexus provides comprehensive data management capabilities for files, records, databases, and other data objects. All data operations can be performed via the Python SDK (dxpy) or command-line interface (dx).
## Data Object Types
### Files
Binary or text data stored on the platform.
### Records
Structured data objects with arbitrary JSON details and metadata.
### Databases
Structured database objects for relational data.
### Applets and Apps
Executable programs (covered in app-development.md).
### Workflows
Multi-step analysis pipelines.
## Data Object Lifecycle
### States
**Open State**: Data can be modified
- Files: Contents can be written
- Records: Details can be updated
- Applets: Created in closed state by default
**Closed State**: Data becomes immutable
- File contents are fixed
- Metadata fields are locked (types, details, links, visibility)
- Objects are ready for sharing and analysis
### Transitions
```
Create (open) → Modify → Close (immutable)
```
Most objects start open and require explicit closure:
```python
# Close a file
file_obj.close()
```
Some objects can be created and closed in one operation:
```python
# Create closed record
record = dxpy.new_dxrecord(details={...}, close=True)
```
## File Operations
### Uploading Files
**From local file**:
```python
import dxpy
# Upload a file
file_obj = dxpy.upload_local_file("data.txt", project="project-xxxx")
print(f"Uploaded: {file_obj.get_id()}")
```
**With metadata**:
```python
file_obj = dxpy.upload_local_file(
"data.txt",
name="my_data",
project="project-xxxx",
folder="/results",
properties={"sample": "sample1", "type": "raw"},
tags=["experiment1", "batch2"]
)
```
**Streaming upload**:
```python
# For large files or generated data
file_obj = dxpy.new_dxfile(project="project-xxxx", name="output.txt")
file_obj.write("Line 1\n")
file_obj.write("Line 2\n")
file_obj.close()
```
### Downloading Files
**To local file**:
```python
# Download by ID
dxpy.download_dxfile("file-xxxx", "local_output.txt")
# Download using handler
file_obj = dxpy.DXFile("file-xxxx")
dxpy.download_dxfile(file_obj.get_id(), "local_output.txt")
```
**Read file contents**:
```python
with dxpy.open_dxfile("file-xxxx") as f:
    contents = f.read()
```
**Download to specific directory**:
```python
dxpy.download_dxfile("file-xxxx", "/path/to/directory/filename.txt")
```
### File Metadata
**Get file information**:
```python
file_obj = dxpy.DXFile("file-xxxx")
describe = file_obj.describe()
print(f"Name: {describe['name']}")
print(f"Size: {describe['size']} bytes")
print(f"State: {describe['state']}")
print(f"Created: {describe['created']}")
```
**Update file metadata**:
```python
file_obj.set_properties({"experiment": "exp1", "version": "v2"})
file_obj.add_tags(["validated", "published"])
file_obj.rename("new_name.txt")
```
## Record Operations
Records store structured metadata with arbitrary JSON.
### Creating Records
```python
# Create a record
record = dxpy.new_dxrecord(
name="sample_metadata",
types=["SampleMetadata"],
details={
"sample_id": "S001",
"tissue": "blood",
"age": 45,
"conditions": ["diabetes"]
},
project="project-xxxx",
close=True
)
```
### Reading Records
```python
record = dxpy.DXRecord("record-xxxx")
describe = record.describe()
# Access details
details = record.get_details()
sample_id = details["sample_id"]
tissue = details["tissue"]
```
### Updating Records
```python
# Record must be open to update
record = dxpy.DXRecord("record-xxxx")
details = record.get_details()
details["processed"] = True
record.set_details(details)
record.close()
```
## Search and Discovery
### Finding Data Objects
**Search by name**:
```python
results = dxpy.find_data_objects(
name="*.fastq",
project="project-xxxx",
folder="/raw_data"
)
for result in results:
print(f"{result['describe']['name']}: {result['id']}")
```
**Search by properties**:
```python
results = dxpy.find_data_objects(
classname="file",
properties={"sample": "sample1", "type": "processed"},
project="project-xxxx"
)
```
**Search by type**:
```python
# Find all records of specific type
results = dxpy.find_data_objects(
classname="record",
typename="SampleMetadata",
project="project-xxxx"
)
```
**Search with state filter**:
```python
# Find only closed files
results = dxpy.find_data_objects(
classname="file",
state="closed",
project="project-xxxx"
)
```
### System-wide Search
```python
# Search across all accessible projects
results = dxpy.find_data_objects(
name="important_data.txt",
describe=True # Include full descriptions
)
```
## Cloning and Copying
### Clone Data Between Projects
```python
# Clone file to another project
new_file = dxpy.DXFile("file-xxxx").clone(
project="project-yyyy",
folder="/imported_data"
)
```
### Clone Multiple Objects
```python
# Clone folder contents
files = dxpy.find_data_objects(
classname="file",
project="project-xxxx",
folder="/results"
)
for file in files:
file_obj = dxpy.DXFile(file['id'])
file_obj.clone(project="project-yyyy", folder="/backup")
```
## Project Management
### Creating Projects
```python
# Create a new project
project = dxpy.api.project_new({
"name": "My Analysis Project",
"description": "RNA-seq analysis for experiment X"
})
project_id = project['id']
```
### Project Permissions
```python
# Invite user to project
dxpy.api.project_invite(
project_id,
{
"invitee": "user-xxxx",
"level": "CONTRIBUTE" # VIEW, UPLOAD, CONTRIBUTE, ADMINISTER
}
)
```
### List Projects
```python
# List accessible projects
projects = dxpy.find_projects(describe=True)
for proj in projects:
desc = proj['describe']
print(f"{desc['name']}: {proj['id']}")
```
## Folder Operations
### Creating Folders
```python
# Create nested folders
dxpy.api.project_new_folder(
"project-xxxx",
{"folder": "/analysis/batch1/results", "parents": True}
)
```
### Moving Objects
```python
# Move file to different folder
file_obj = dxpy.DXFile("file-xxxx", project="project-xxxx")
file_obj.move("/new_location")
```
### Removing Objects
```python
# Remove file from project (not permanent deletion)
dxpy.api.project_remove_objects(
"project-xxxx",
{"objects": ["file-xxxx"]}
)
# Permanent deletion
file_obj = dxpy.DXFile("file-xxxx")
file_obj.remove()
```
## Archival
### Archive Data
Archived data is moved to cheaper long-term storage:
```python
# Archive a file
dxpy.api.project_archive(
"project-xxxx",
{"files": ["file-xxxx"]}
)
```
### Unarchive Data
```python
# Unarchive when needed
dxpy.api.project_unarchive(
"project-xxxx",
{"files": ["file-xxxx"]}
)
```
## Batch Operations
### Upload Multiple Files
```python
import os
# Upload all files in directory
for filename in os.listdir("./data"):
filepath = os.path.join("./data", filename)
if os.path.isfile(filepath):
dxpy.upload_local_file(
filepath,
project="project-xxxx",
folder="/batch_upload"
)
```
### Download Multiple Files
```python
# Download all files from folder
files = dxpy.find_data_objects(
classname="file",
project="project-xxxx",
folder="/results"
)
for file in files:
file_obj = dxpy.DXFile(file['id'])
filename = file_obj.describe()['name']
dxpy.download_dxfile(file['id'], f"./downloads/{filename}")
```
## Best Practices
1. **Close Files**: Always close files after writing to make them accessible
2. **Use Properties**: Tag data with meaningful properties for easier discovery
3. **Organize Folders**: Use logical folder structures
4. **Clean Up**: Remove temporary or obsolete data
5. **Batch Operations**: Group operations when processing many objects
6. **Error Handling**: Check object states before operations (see the sketch after this list)
7. **Permissions**: Verify project permissions before data operations
8. **Archive Old Data**: Use archival for long-term storage cost savings
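As a sketch of practice 6, checking an object's state before acting on it (the file ID is a placeholder):
```python
import dxpy

file_obj = dxpy.DXFile("file-xxxx")
desc = file_obj.describe()

# Only closed files have fixed contents that can be downloaded
if desc["state"] == "closed":
    dxpy.download_dxfile(file_obj.get_id(), desc["name"])
else:
    print(f"{desc['name']} is still '{desc['state']}'; try again once it has closed")
```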

View File

@@ -0,0 +1,412 @@
# DNAnexus Job Execution and Workflows
## Overview
Jobs are the fundamental execution units on DNAnexus. When an applet or app runs, a job is created and executed on a worker node in an isolated Linux environment with constant API access.
## Job Types
### Origin Jobs
Initially created by users or automated systems.
### Master Jobs
Result from directly launching an executable (app/applet).
### Child Jobs
Spawned by parent jobs for parallel processing or sub-workflows.
## Running Jobs
### Running an Applet
**Basic execution**:
```python
import dxpy
# Run an applet
job = dxpy.DXApplet("applet-xxxx").run({
"input1": {"$dnanexus_link": "file-yyyy"},
"input2": "parameter_value"
})
print(f"Job ID: {job.get_id()}")
```
**Using command line**:
```bash
dx run applet-xxxx -i input1=file-yyyy -i input2="value"
```
### Running an App
```python
# Run an app by name
job = dxpy.DXApp(name="my-app").run({
"reads": {"$dnanexus_link": "file-xxxx"},
"quality_threshold": 30
})
```
### Specifying Execution Parameters
```python
job = dxpy.DXApplet("applet-xxxx").run(
applet_input={
"input_file": {"$dnanexus_link": "file-yyyy"}
},
project="project-zzzz", # Output project
folder="/results", # Output folder
name="My Analysis Job", # Job name
instance_type="mem2_hdd2_x4", # Override instance type
priority="high" # Job priority
)
```
## Job Monitoring
### Checking Job Status
```python
job = dxpy.DXJob("job-xxxx")
state = job.describe()["state"]
# States: idle, waiting_on_input, runnable, running, done, failed, terminated
print(f"Job state: {state}")
```
**Using command line**:
```bash
dx watch job-xxxx
```
### Waiting for Job Completion
```python
# Block until job completes
job.wait_on_done()
# Check if successful
if job.describe()["state"] == "done":
output = job.describe()["output"]
print(f"Job completed: {output}")
else:
print("Job failed")
```
### Getting Job Output
```python
job = dxpy.DXJob("job-xxxx")
# Wait for completion
job.wait_on_done()
# Get outputs
output = job.describe()["output"]
output_file_id = output["result_file"]["$dnanexus_link"]
# Download result
dxpy.download_dxfile(output_file_id, "result.txt")
```
### Job Output References
Create references to job outputs before they complete:
```python
# Launch first job
job1 = dxpy.DXApplet("applet-1").run({"input": "..."})
# Launch second job using output reference
job2 = dxpy.DXApplet("applet-2").run({
"input": dxpy.dxlink(job1.get_output_ref("output_name"))
})
```
## Job Logs
### Viewing Logs
**Command line**:
```bash
dx watch job-xxxx --get-streams
```
**Programmatically**:
```python
import subprocess

# Job logs are streamed by the platform; the simplest programmatic route is to
# capture the output of the dx CLI
log_text = subprocess.check_output(
    ["dx", "watch", "job-xxxx", "--get-streams"],
    text=True
)
print(log_text)
```
## Parallel Execution
### Creating Subjobs
```python
@dxpy.entry_point('main')
def main(input_files):
# Create subjobs for parallel processing
subjobs = []
for input_file in input_files:
subjob = dxpy.new_dxjob(
fn_input={"file": input_file},
fn_name="process_file"
)
subjobs.append(subjob)
# Collect results
results = []
for subjob in subjobs:
result = subjob.get_output_ref("processed_file")
results.append(result)
return {"all_results": results}
@dxpy.entry_point('process_file')
def process_file(file):
# Process single file
# ...
return {"processed_file": output_file}
```
### Scatter-Gather Pattern
```python
# Scatter: Process items in parallel
scatter_jobs = []
for item in items:
job = dxpy.new_dxjob(
fn_input={"item": item},
fn_name="process_item"
)
scatter_jobs.append(job)
# Gather: Combine results
gather_job = dxpy.new_dxjob(
fn_input={
"results": [job.get_output_ref("result") for job in scatter_jobs]
},
fn_name="combine_results"
)
```
## Workflows
Workflows combine multiple apps/applets into multi-step pipelines.
### Creating a Workflow
```python
# Create workflow
workflow = dxpy.new_dxworkflow(
name="My Analysis Pipeline",
project="project-xxxx"
)
# Add stages; add_stage() returns the new stage's ID
stage1_id = workflow.add_stage(
    dxpy.DXApplet("applet-1"),
    name="Quality Control",
    folder="/qc"
)
# Connect stages by referencing stage 1's output in stage 2's input
stage2_id = workflow.add_stage(
    dxpy.DXApplet("applet-2"),
    name="Alignment",
    folder="/alignment",
    stage_input={
        "reads": {"$dnanexus_link": {"stage": stage1_id, "outputField": "filtered_reads"}}
    }
)
# Close workflow
workflow.close()
```
### Running a Workflow
```python
# Run workflow
analysis = workflow.run({
"stage-xxxx.input1": {"$dnanexus_link": "file-yyyy"}
})
# Monitor analysis (collection of jobs)
analysis.wait_on_done()
# Get workflow outputs
outputs = analysis.describe()["output"]
```
**Using command line**:
```bash
dx run workflow-xxxx -i stage-1.input=file-yyyy
```
## Job Permissions and Context
### Workspace Context
Jobs run in a workspace project with cloned input data:
- Jobs require `CONTRIBUTE` permission to workspace
- Jobs need `VIEW` access to source projects
- All charges accumulate to the originating project
### Data Requirements
Jobs cannot start until:
1. All input data objects are in `closed` state
2. Required permissions are available
3. Resources are allocated
Output objects must reach `closed` state before workspace cleanup.
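A small sketch of respecting these requirements before launching a job (IDs are placeholders; `wait_on_close()` blocks until the file reaches the `closed` state):
```python
import dxpy

# Make sure the input has finished closing before the job that consumes it starts
input_file = dxpy.DXFile("file-xxxx")
input_file.wait_on_close()

job = dxpy.DXApplet("applet-xxxx").run({
    "reads": dxpy.dxlink(input_file.get_id())
})
```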
## Job Lifecycle
```
Created → Waiting on Input → Runnable → Running → Done/Failed
```
**States**:
- `idle`: Job created but not yet queued
- `waiting_on_input`: Waiting for input data objects to close
- `runnable`: Ready to run, waiting for resources
- `running`: Currently executing
- `done`: Completed successfully
- `failed`: Execution failed
- `terminated`: Manually stopped
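For long-running analyses, a polling loop over these states is an alternative to blocking on `wait_on_done()` (the interval and job ID are illustrative):
```python
import time
import dxpy

job = dxpy.DXJob("job-xxxx")
terminal_states = {"done", "failed", "terminated"}

while True:
    state = job.describe()["state"]
    if state in terminal_states:
        break
    time.sleep(30)  # poll every 30 seconds

print(f"Job finished in state: {state}")
```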
## Error Handling
### Job Failure
```python
job = dxpy.DXJob("job-xxxx")
job.wait_on_done()
desc = job.describe()
if desc["state"] == "failed":
print(f"Job failed: {desc.get('failureReason', 'Unknown')}")
print(f"Failure message: {desc.get('failureMessage', '')}")
```
### Retry Failed Jobs
```python
# Rerun failed job
new_job = dxpy.DXApplet(desc["applet"]).run(
desc["originalInput"],
project=desc["project"]
)
```
### Terminating Jobs
```python
# Stop a running job
job = dxpy.DXJob("job-xxxx")
job.terminate()
```
**Using command line**:
```bash
dx terminate job-xxxx
```
## Resource Management
### Instance Types
Specify computational resources:
```python
# Run with specific instance type
job = dxpy.DXApplet("applet-xxxx").run(
{"input": "..."},
instance_type="mem3_ssd1_v2_x8" # 8 cores, high memory, SSD
)
```
Common instance types:
- `mem1_ssd1_v2_x4` - 4 cores, standard memory
- `mem2_ssd1_v2_x8` - 8 cores, high memory
- `mem3_ssd1_v2_x16` - 16 cores, very high memory
- `mem1_ssd1_v2_x36` - 36 cores for parallel workloads
### Timeout Settings
Maximum runtime is configured through `timeoutPolicy` in dxapp.json rather than as a `run()` argument (see configuration.md). To stop a runaway job manually, terminate it with `job.terminate()` or `dx terminate job-xxxx`.
## Job Tagging and Metadata
### Add Job Tags
```python
job = dxpy.DXApplet("applet-xxxx").run(
{"input": "..."},
tags=["experiment1", "batch2", "production"]
)
```
### Add Job Properties
```python
job = dxpy.DXApplet("applet-xxxx").run(
{"input": "..."},
properties={
"experiment": "exp001",
"sample": "sample1",
"batch": "batch2"
}
)
```
### Finding Jobs
```python
# Find jobs by tag
jobs = dxpy.find_jobs(
project="project-xxxx",
tags=["experiment1"],
describe=True
)
for job in jobs:
print(f"{job['describe']['name']}: {job['id']}")
```
## Best Practices
1. **Job Naming**: Use descriptive names for easier tracking
2. **Tags and Properties**: Tag jobs for organization and searchability
3. **Resource Selection**: Choose appropriate instance types for workload
4. **Error Handling**: Check job state and handle failures gracefully
5. **Parallel Processing**: Use subjobs for independent parallel tasks
6. **Workflows**: Use workflows for complex multi-step analyses
7. **Monitoring**: Monitor long-running jobs and check logs for issues
8. **Cost Management**: Use appropriate instance types to balance cost/performance
9. **Timeouts**: Set reasonable timeouts to prevent runaway jobs
10. **Cleanup**: Remove failed or obsolete jobs
## Debugging Tips
1. **Check Logs**: Always review job logs for error messages
2. **Verify Inputs**: Ensure input files are closed and accessible
3. **Test Locally**: Test logic locally before deploying to platform
4. **Start Small**: Test with small datasets before scaling up
5. **Monitor Resources**: Check if job is running out of memory or disk space
6. **Instance Type**: Try larger instance if job fails due to resources

View File

@@ -0,0 +1,523 @@
# DNAnexus Python SDK (dxpy)
## Overview
The dxpy library provides Python bindings to interact with the DNAnexus Platform. It's available both within the DNAnexus Execution Environment (for apps running on the platform) and for external scripts accessing the API.
## Installation
```bash
# Install dxpy
pip install dxpy
# Or using conda
conda install -c bioconda dxpy
```
**Requirements**: Python 3.8 or higher
## Authentication
### Login
```bash
# Login via command line
dx login
```
### API Token
```python
import dxpy
# Set authentication token
dxpy.set_security_context({
"auth_token_type": "Bearer",
"auth_token": "YOUR_API_TOKEN"
})
```
### Environment Variables
```bash
# Set token via environment
export DX_SECURITY_CONTEXT='{"auth_token": "YOUR_TOKEN", "auth_token_type": "Bearer"}'
```
## Core Classes
### DXFile
Handler for file objects.
```python
import dxpy
# Get file handler
file_obj = dxpy.DXFile("file-xxxx")
# Get file info
desc = file_obj.describe()
print(f"Name: {desc['name']}")
print(f"Size: {desc['size']} bytes")
# Download file
dxpy.download_dxfile(file_obj.get_id(), "local_file.txt")
# Read file contents
with dxpy.open_dxfile(file_obj.get_id()) as f:
    contents = f.read()
# Update metadata
file_obj.set_properties({"key": "value"})
file_obj.add_tags(["tag1", "tag2"])
file_obj.rename("new_name.txt")
# Close file
file_obj.close()
```
### DXRecord
Handler for record objects.
```python
# Create record
record = dxpy.new_dxrecord(
name="metadata",
types=["Metadata"],
details={"key": "value"},
project="project-xxxx",
close=True
)
# Get record handler
record = dxpy.DXRecord("record-xxxx")
# Get details
details = record.get_details()
# Update details (must be open)
record.set_details({"updated": True})
record.close()
```
### DXApplet
Handler for applet objects.
```python
# Get applet
applet = dxpy.DXApplet("applet-xxxx")
# Get applet info
desc = applet.describe()
print(f"Name: {desc['name']}")
print(f"Version: {desc.get('version', 'N/A')}")
# Run applet
job = applet.run({
"input1": {"$dnanexus_link": "file-yyyy"},
"param1": "value"
})
```
### DXApp
Handler for app objects.
```python
# Get app by name
app = dxpy.DXApp(name="my-app")
# Or by ID
app = dxpy.DXApp("app-xxxx")
# Run app
job = app.run({
"input": {"$dnanexus_link": "file-yyyy"}
})
```
### DXWorkflow
Handler for workflow objects.
```python
# Create workflow
workflow = dxpy.new_dxworkflow(
name="My Pipeline",
project="project-xxxx"
)
# Add stage; add_stage() returns the stage ID and accepts the stage's input directly
stage_id = workflow.add_stage(
    dxpy.DXApplet("applet-xxxx"),
    name="Step 1",
    stage_input={"input1": {"$dnanexus_link": "file-yyyy"}}
)
# Close workflow
workflow.close()
# Run workflow
analysis = workflow.run({})
```
### DXJob
Handler for job objects.
```python
# Get job
job = dxpy.DXJob("job-xxxx")
# Get job info
desc = job.describe()
print(f"State: {desc['state']}")
print(f"Name: {desc['name']}")
# Wait for completion
job.wait_on_done()
# Get output
output = desc.get("output", {})
# Terminate job
job.terminate()
```
### DXProject
Handler for project objects.
```python
# Get project
project = dxpy.DXProject("project-xxxx")
# Get project info
desc = project.describe()
print(f"Name: {desc['name']}")
print(f"Region: {desc.get('region', 'N/A')}")
# List folder contents
contents = project.list_folder("/data")
print(f"Objects: {contents['objects']}")
print(f"Folders: {contents['folders']}")
```
## High-Level Functions
### File Operations
```python
# Upload file
file_obj = dxpy.upload_local_file(
"local_file.txt",
project="project-xxxx",
folder="/data",
name="uploaded_file.txt"
)
# Download file
dxpy.download_dxfile("file-xxxx", "downloaded.txt")
# Upload string as file
file_obj = dxpy.upload_string("Hello World", project="project-xxxx")
```
### Creating Data Objects
```python
# New file
file_obj = dxpy.new_dxfile(
project="project-xxxx",
name="output.txt"
)
file_obj.write("content")
file_obj.close()
# New record
record = dxpy.new_dxrecord(
name="metadata",
details={"key": "value"},
project="project-xxxx"
)
```
### Search Functions
```python
# Find data objects
results = dxpy.find_data_objects(
classname="file",
name="*.fastq",
project="project-xxxx",
folder="/raw_data",
describe=True
)
for result in results:
print(f"{result['describe']['name']}: {result['id']}")
# Find projects
projects = dxpy.find_projects(
name="*analysis*",
describe=True
)
# Find jobs
jobs = dxpy.find_jobs(
project="project-xxxx",
created_after="2025-01-01",
state="failed"
)
# Find apps
apps = dxpy.find_apps(
category="Read Mapping"
)
```
### Links and References
```python
# Create link to data object
link = dxpy.dxlink("file-xxxx")
# Returns: {"$dnanexus_link": "file-xxxx"}
# Create link with project
link = dxpy.dxlink("file-xxxx", "project-yyyy")
# Get job output reference (for chaining jobs)
output_ref = job.get_output_ref("output_name")
```
## API Methods
### Direct API Calls
For operations not covered by high-level functions:
```python
# Call API method directly
result = dxpy.api.project_new({
"name": "New Project",
"description": "Created via API"
})
project_id = result["id"]
# File describe
file_desc = dxpy.api.file_describe("file-xxxx")
# System find data objects
results = dxpy.api.system_find_data_objects({
"class": "file",
"project": "project-xxxx",
"name": {"regexp": ".*\\.bam$"}
})
```
### Common API Methods
```python
# Project operations
dxpy.api.project_invite("project-xxxx", {"invitee": "user-yyyy", "level": "VIEW"})
dxpy.api.project_new_folder("project-xxxx", {"folder": "/new_folder"})
# File operations
dxpy.api.file_close("file-xxxx")
dxpy.api.file_remove("file-xxxx")
# Job operations
dxpy.api.job_terminate("job-xxxx")
dxpy.api.job_get_log("job-xxxx")
```
## App Development Functions
### Entry Points
```python
import dxpy
@dxpy.entry_point('main')
def main(input1, input2):
"""Main entry point for app"""
# Process inputs
result = process(input1, input2)
# Return outputs
return {
"output1": result
}
# Required at end of app code
dxpy.run()
```
### Creating Subjobs
```python
# Spawn subjob within app
subjob = dxpy.new_dxjob(
fn_input={"input": value},
fn_name="helper_function"
)
# Get output reference
output_ref = subjob.get_output_ref("result")
@dxpy.entry_point('helper_function')
def helper_function(input):
    # Process the subjob input
    result = process(input)
    return {"result": result}
```
## Error Handling
### Exception Types
```python
import dxpy
from dxpy.exceptions import DXError, DXAPIError
try:
file_obj = dxpy.DXFile("file-xxxx")
desc = file_obj.describe()
except DXAPIError as e:
print(f"API Error: {e}")
print(f"Status Code: {e.code}")
except DXError as e:
print(f"General Error: {e}")
```
### Common Exceptions
- `DXAPIError`: API request failed
- `DXError`: General DNAnexus error
- `ResourceNotFound`: Object doesn't exist
- `PermissionDenied`: Insufficient permissions
- `InvalidInput`: Invalid input parameters
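For instance, a minimal sketch that distinguishes a missing object from a permissions problem (assuming these exception classes are importable from `dxpy.exceptions`, as listed above):
```python
import dxpy
from dxpy.exceptions import ResourceNotFound, PermissionDenied, DXAPIError

def safe_describe(object_id):
    """Describe an object, reporting the most specific failure cause."""
    try:
        return dxpy.describe(object_id)
    except ResourceNotFound:
        print(f"{object_id} does not exist or is not visible here")
    except PermissionDenied:
        print(f"Insufficient permissions to describe {object_id}")
    except DXAPIError as e:
        print(f"API error while describing {object_id}: {e}")
    return None
```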
## Utility Functions
### Getting Handlers
```python
# Get handler from ID/link
handler = dxpy.get_handler("file-xxxx")
# Returns DXFile, DXRecord, etc. based on object class
# Bind handler to project
handler = dxpy.DXFile("file-xxxx", project="project-yyyy")
```
### Describe Methods
```python
# Describe any object
desc = dxpy.describe("file-xxxx")
print(desc)
# Describe with fields
desc = dxpy.describe("file-xxxx", fields={"name": True, "size": True})
```
## Configuration
### Setting Project Context
```python
# Set default project
dxpy.set_workspace_id("project-xxxx")
# Get current project
project_id = dxpy.WORKSPACE_ID
```
### Setting Region
```python
# Set API server
dxpy.set_api_server_info(host="api.dnanexus.com", port=443)
```
## Best Practices
1. **Use High-Level Functions**: Prefer `upload_local_file()` over manual file creation
2. **Handler Reuse**: Create handlers once and reuse them
3. **Batch Operations**: Use find functions to process multiple objects
4. **Error Handling**: Always wrap API calls in try-except blocks
5. **Close Objects**: Remember to close files and records after modifications
6. **Project Context**: Set workspace context for apps
7. **API Token Security**: Never hardcode tokens in source code
8. **Describe Fields**: Request only needed fields to reduce latency
9. **Search Filters**: Use specific filters to narrow search results
10. **Link Format**: Use `dxpy.dxlink()` for consistent link creation
## Common Patterns
### Upload and Process Pattern
```python
# Upload input
input_file = dxpy.upload_local_file("data.txt", project="project-xxxx")
# Run analysis
job = dxpy.DXApplet("applet-xxxx").run({
"input": dxpy.dxlink(input_file.get_id())
})
# Wait and download result
job.wait_on_done()
output_id = job.describe()["output"]["result"]["$dnanexus_link"]
dxpy.download_dxfile(output_id, "result.txt")
```
### Batch File Processing
```python
# Find all FASTQ files
files = dxpy.find_data_objects(
classname="file",
name="*.fastq",
project="project-xxxx"
)
# Process each file
jobs = []
for file_result in files:
job = dxpy.DXApplet("applet-xxxx").run({
"input": dxpy.dxlink(file_result["id"])
})
jobs.append(job)
# Wait for all jobs
for job in jobs:
job.wait_on_done()
print(f"Job {job.get_id()} completed")
```
### Workflow with Dependencies
```python
# Job 1
job1 = applet1.run({"input": data})
# Job 2 depends on job1 output
job2 = applet2.run({
"input": job1.get_output_ref("result")
})
# Job 3 depends on job2
job3 = applet3.run({
"input": job2.get_output_ref("processed")
})
# Wait for final result
job3.wait_on_done()
```

View File

@@ -0,0 +1,253 @@
---
name: labarchive-integration
description: Toolkit for interacting with LabArchives Electronic Lab Notebook (ELN) API. This skill should be used when working with LabArchives notebooks, including authentication setup, retrieving user and notebook information, backing up notebooks, managing entries and attachments, generating reports, or integrating LabArchives with other scientific tools (Protocols.io, GraphPad Prism, SnapGene, Geneious, Jupyter, REDCap). Use this skill for any task involving programmatic access to LabArchives data or automating LabArchives workflows.
---
# LabArchives Integration
## Overview
Provide comprehensive tools and workflows for interacting with the LabArchives Electronic Lab Notebook (ELN) REST API. LabArchives is a widely-used electronic lab notebook platform for research documentation, data management, and collaboration in academic and industrial laboratories.
This skill enables programmatic access to LabArchives notebooks, including user authentication, notebook operations, entry management, report generation, and third-party integrations.
## Core Capabilities
### 1. Authentication and Configuration
Set up API access credentials and regional endpoints for LabArchives API integration.
**Prerequisites:**
- Enterprise LabArchives license with API access enabled
- API access key ID and password from LabArchives administrator
- User authentication credentials (email and external applications password)
**Configuration setup:**
Use the `scripts/setup_config.py` script to create a configuration file:
```bash
python3 scripts/setup_config.py
```
This creates a `config.yaml` file with the following structure:
```yaml
api_url: https://api.labarchives.com/api # or regional endpoint
access_key_id: YOUR_ACCESS_KEY_ID
access_password: YOUR_ACCESS_PASSWORD
```
**Regional API endpoints:**
- US/International: `https://api.labarchives.com/api`
- Australia: `https://auapi.labarchives.com/api`
- UK: `https://ukapi.labarchives.com/api`
For detailed authentication instructions and troubleshooting, refer to `references/authentication_guide.md`.
### 2. User Information Retrieval
Obtain user ID (UID) and access information required for subsequent API operations.
**Workflow:**
1. Call the `users/user_access_info` API method with login credentials
2. Parse the XML/JSON response to extract the user ID (UID)
3. Use the UID to retrieve detailed user information via `users/user_info_via_id`
**Example using Python wrapper:**
```python
from labarchivespy.client import Client
# Initialize client
client = Client(api_url, access_key_id, access_password)
# Get user access info
login_params = {'login_or_email': user_email, 'password': auth_token}
response = client.make_call('users', 'user_access_info', params=login_params)
# Extract UID from response
import xml.etree.ElementTree as ET
uid = ET.fromstring(response.content)[0].text
# Get detailed user info
params = {'uid': uid}
user_info = client.make_call('users', 'user_info_via_id', params=params)
```
### 3. Notebook Operations
Manage notebook access, backup, and metadata retrieval.
**Key operations:**
- **List notebooks:** Retrieve all notebooks accessible to a user
- **Backup notebooks:** Download complete notebook data with optional attachment inclusion
- **Get notebook IDs:** Retrieve institution-defined notebook identifiers for integration with grants/project management systems
- **Get notebook members:** List all users with access to a specific notebook
- **Get notebook settings:** Retrieve configuration and permissions for notebooks
**Notebook backup example:**
Use the `scripts/notebook_operations.py` script (the user ID is resolved automatically from the credentials in `config.yaml`, so no `--uid` flag is needed):
```bash
# Backup with attachments (default, creates 7z archive)
python3 scripts/notebook_operations.py backup --nbid NOTEBOOK_ID
# Backup without attachments, JSON format
python3 scripts/notebook_operations.py backup --nbid NOTEBOOK_ID --json --no-attachments
```
**API endpoint format:**
```
<api_url>/notebooks/notebook_backup?uid=<UID>&nbid=<NOTEBOOK_ID>&json=true&no_attachments=false
```
For comprehensive API method documentation, refer to `references/api_reference.md`.
### 4. Entry and Attachment Management
Create, modify, and manage notebook entries and file attachments.
**Entry operations:**
- Create new entries in notebooks
- Add comments to existing entries
- Create entry parts/components
- Upload file attachments to entries
**Attachment workflow:**
Use the `scripts/entry_operations.py` script (the user ID is resolved automatically from the credentials in `config.yaml`):
```bash
# Upload attachment to an entry
python3 scripts/entry_operations.py upload --nbid NOTEBOOK_ID --entry-id ENTRY_ID --file /path/to/file.pdf
# Create a new entry with text content
python3 scripts/entry_operations.py create --nbid NOTEBOOK_ID --title "Experiment Results" --content "Results from today's experiment..."
```
**Supported file types:**
- Documents (PDF, DOCX, TXT)
- Images (PNG, JPG, TIFF)
- Data files (CSV, XLSX, HDF5)
- Scientific formats (CIF, MOL, PDB)
- Archives (ZIP, 7Z)
### 5. Site Reports and Analytics
Generate institutional reports on notebook usage, activity, and compliance (Enterprise feature).
**Available reports:**
- Detailed Usage Report: User activity metrics and engagement statistics
- Detailed Notebook Report: Notebook metadata, member lists, and settings
- PDF/Offline Notebook Generation Report: Export tracking for compliance
- Notebook Members Report: Access control and collaboration analytics
- Notebook Settings Report: Configuration and permission auditing
**Report generation:**
```python
# Generate detailed usage report
response = client.make_call('site_reports', 'detailed_usage_report',
params={'start_date': '2025-01-01', 'end_date': '2025-10-20'})
```
### 6. Third-Party Integrations
LabArchives integrates with numerous scientific software platforms. This skill provides guidance on leveraging these integrations programmatically.
**Supported integrations:**
- **Protocols.io:** Export protocols directly to LabArchives notebooks
- **GraphPad Prism:** Export analyses and figures (Version 8+)
- **SnapGene:** Direct molecular biology workflow integration
- **Geneious:** Bioinformatics analysis export
- **Jupyter:** Embed Jupyter notebooks as entries
- **REDCap:** Clinical data capture integration
- **Qeios:** Research publishing platform
- **SciSpace:** Literature management
**OAuth authentication:**
LabArchives now uses OAuth for all new integrations. Legacy integrations may use API key authentication.
For detailed integration setup instructions and use cases, refer to `references/integrations.md`.
## Common Workflows
### Complete notebook backup workflow
1. Authenticate and obtain user ID
2. List all accessible notebooks
3. Iterate through notebooks and backup each one
4. Store backups with timestamp metadata
```bash
# Back up every accessible notebook (credentials are read from config.yaml)
python3 scripts/notebook_operations.py backup-all
```
### Automated data upload workflow
1. Authenticate with LabArchives API
2. Identify target notebook and entry
3. Upload experimental data files
4. Add metadata comments to entries
5. Generate activity report
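A minimal sketch of steps 1-4, reusing the authentication and entry-creation patterns shown earlier (the notebook ID, file paths, and entry title are placeholders; step 5 uses the Site Reports methods described above):
```python
import xml.etree.ElementTree as ET
from datetime import datetime
import requests
import yaml
from labarchivespy.client import Client

# 1. Authenticate with the LabArchives API
with open('config.yaml') as f:
    config = yaml.safe_load(f)
client = Client(config['api_url'], config['access_key_id'], config['access_password'])
login_params = {'login_or_email': config['user_email'],
                'password': config['user_external_password']}
uid = ET.fromstring(client.make_call('users', 'user_access_info',
                                     params=login_params).content)[0].text

# 2. Identify the target notebook and create an entry for today's upload
nbid = '67890'  # placeholder notebook ID
entry_resp = client.make_call('entries', 'create_entry', params={
    'uid': uid, 'nbid': nbid,
    'title': f"Automated upload {datetime.now():%Y-%m-%d}",
    'content': '<p>Instrument export uploaded by pipeline.</p>'})
entry_id = ET.fromstring(entry_resp.content).findtext('.//entry_id')

# 3. Upload experimental data files
for path in ['results/plate1.csv', 'results/plate2.csv']:  # placeholder paths
    with open(path, 'rb') as fh:
        requests.post(f"{config['api_url']}/entries/upload_attachment",
                      files={'file': fh},
                      data={'uid': uid, 'nbid': nbid, 'entry_id': entry_id,
                            'filename': path.split('/')[-1],
                            'access_key_id': config['access_key_id'],
                            'access_password': config['access_password']})

# 4. Add a metadata comment for traceability
client.make_call('entries', 'create_comment', params={
    'uid': uid, 'nbid': nbid, 'entry_id': entry_id,
    'comment': 'Uploaded automatically by the data pipeline.'})
```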
### Integration workflow example (Jupyter → LabArchives)
1. Export Jupyter notebook to HTML or PDF
2. Use entry_operations.py to upload to LabArchives
3. Add comment with execution timestamp and environment info
4. Tag entry for easy retrieval
## Python Package Installation
Install the `labarchives-py` wrapper for simplified API access:
```bash
git clone https://github.com/mcmero/labarchives-py
cd labarchives-py
pip install .
```
Alternatively, use direct HTTP requests via Python's `requests` library for custom implementations.
## Best Practices
1. **Rate limiting:** Implement appropriate delays between API calls to avoid throttling
2. **Error handling:** Always wrap API calls in try-except blocks with appropriate logging
3. **Authentication security:** Store credentials in environment variables or secure config files (never in code)
4. **Backup verification:** After notebook backup, verify file integrity and completeness
5. **Incremental operations:** For large notebooks, use pagination and batch processing
6. **Regional endpoints:** Use the correct regional API endpoint for optimal performance
## Troubleshooting
**Common issues:**
- **401 Unauthorized:** Verify access key ID and password are correct; check API access is enabled for your account
- **404 Not Found:** Confirm notebook ID (nbid) exists and user has access permissions
- **403 Forbidden:** Check user permissions for the requested operation
- **Empty response:** Ensure required parameters (uid, nbid) are provided correctly
- **Attachment upload failures:** Verify file size limits and format compatibility
For additional support, contact LabArchives at support@labarchives.com.
## Resources
This skill includes bundled resources to support LabArchives API integration:
### scripts/
- `setup_config.py`: Interactive configuration file generator for API credentials
- `notebook_operations.py`: Utilities for listing, backing up, and managing notebooks
- `entry_operations.py`: Tools for creating entries and uploading attachments
### references/
- `api_reference.md`: Comprehensive API endpoint documentation with parameters and examples
- `authentication_guide.md`: Detailed authentication setup and configuration instructions
- `integrations.md`: Third-party integration setup guides and use cases

View File

@@ -0,0 +1,342 @@
# LabArchives API Reference
## API Structure
All LabArchives API calls follow this URL pattern:
```
https://<base_url>/api/<api_class>/<api_method>?<authentication_parameters>&<method_parameters>
```
## Regional API Endpoints
| Region | Base URL |
|--------|----------|
| US/International | `https://api.labarchives.com/api` |
| Australia | `https://auapi.labarchives.com/api` |
| UK | `https://ukapi.labarchives.com/api` |
## Authentication
All API calls require authentication parameters:
- `access_key_id`: Provided by LabArchives administrator
- `access_password`: Provided by LabArchives administrator
- Additional user-specific credentials may be required for certain operations
## API Classes and Methods
### Users API Class
#### `users/user_access_info`
Retrieve user ID and notebook access information.
**Parameters:**
- `login_or_email` (required): User's email address or login username
- `password` (required): User's external applications password (not regular login password)
**Returns:** XML or JSON response containing:
- User ID (uid)
- List of accessible notebooks with IDs (nbid)
- Account status and permissions
**Example:**
```python
params = {
'login_or_email': 'researcher@university.edu',
'password': 'external_app_password'
}
response = client.make_call('users', 'user_access_info', params=params)
```
#### `users/user_info_via_id`
Retrieve detailed user information by user ID.
**Parameters:**
- `uid` (required): User ID obtained from user_access_info
**Returns:** User profile information including:
- Name and email
- Account creation date
- Institution affiliation
- Role and permissions
- Storage quota and usage
**Example:**
```python
params = {'uid': '12345'}
response = client.make_call('users', 'user_info_via_id', params=params)
```
### Notebooks API Class
#### `notebooks/notebook_backup`
Download complete notebook data including entries, attachments, and metadata.
**Parameters:**
- `uid` (required): User ID
- `nbid` (required): Notebook ID
- `json` (optional, default: false): Return data in JSON format instead of XML
- `no_attachments` (optional, default: false): Exclude attachments from backup
**Returns:**
- When `no_attachments=false`: 7z compressed archive containing all notebook data
- When `no_attachments=true`: XML or JSON structured data with entry content
**File format:**
The returned archive includes:
- Entry text content in HTML format
- File attachments in original formats
- Metadata XML files with timestamps, authors, and version history
- Comment threads and annotations
**Example:**
```python
# Full backup with attachments
params = {
'uid': '12345',
'nbid': '67890',
'json': 'false',
'no_attachments': 'false'
}
response = client.make_call('notebooks', 'notebook_backup', params=params)
# Write to file
with open('notebook_backup.7z', 'wb') as f:
f.write(response.content)
```
```python
# Metadata only backup (JSON format, no attachments)
params = {
'uid': '12345',
'nbid': '67890',
'json': 'true',
'no_attachments': 'true'
}
response = client.make_call('notebooks', 'notebook_backup', params=params)
import json
notebook_data = json.loads(response.content)
```
#### `notebooks/list_notebooks`
Retrieve all notebooks accessible to a user (method name may vary by API version).
**Parameters:**
- `uid` (required): User ID
**Returns:** List of notebooks with:
- Notebook ID (nbid)
- Notebook name
- Creation and modification dates
- Access level (owner, editor, viewer)
- Member count
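The bundled `notebook_operations.py` script takes the portable route of parsing the notebook list out of the `users/user_access_info` response instead; a minimal sketch (element names follow the XML example in the Response Formats section below):
```python
import xml.etree.ElementTree as ET

def list_notebooks(client, login_or_email, external_password):
    """Return [(nbid, name, role), ...] for all notebooks visible to the user."""
    response = client.make_call('users', 'user_access_info', params={
        'login_or_email': login_or_email,
        'password': external_password
    })
    root = ET.fromstring(response.content)
    return [(nb.findtext('nbid'), nb.findtext('name'), nb.findtext('role'))
            for nb in root.findall('.//notebook')]
```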
### Entries API Class
#### `entries/create_entry`
Create a new entry in a notebook.
**Parameters:**
- `uid` (required): User ID
- `nbid` (required): Notebook ID
- `title` (required): Entry title
- `content` (optional): HTML-formatted entry content
- `date` (optional): Entry date (defaults to current date)
**Returns:** Entry ID and creation confirmation
**Example:**
```python
params = {
'uid': '12345',
'nbid': '67890',
'title': 'Experiment 2025-10-20',
'content': '<p>Conducted PCR amplification of target gene...</p>',
'date': '2025-10-20'
}
response = client.make_call('entries', 'create_entry', params=params)
```
#### `entries/create_comment`
Add a comment to an existing entry.
**Parameters:**
- `uid` (required): User ID
- `nbid` (required): Notebook ID
- `entry_id` (required): Target entry ID
- `comment` (required): Comment text (HTML supported)
**Returns:** Comment ID and timestamp
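**Example** (illustrative values, using the same initialized `client` as the other examples):
```python
params = {
    'uid': '12345',
    'nbid': '67890',
    'entry_id': '11111',
    'comment': 'Repeated the assay with fresh reagents; see attached data.'
}
response = client.make_call('entries', 'create_comment', params=params)
```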
#### `entries/create_part`
Add a component/part to an entry (e.g., text section, table, image).
**Parameters:**
- `uid` (required): User ID
- `nbid` (required): Notebook ID
- `entry_id` (required): Target entry ID
- `part_type` (required): Type of part (text, table, image, etc.)
- `content` (required): Part content in appropriate format
**Returns:** Part ID and creation confirmation
#### `entries/upload_attachment`
Upload a file attachment to an entry.
**Parameters:**
- `uid` (required): User ID
- `nbid` (required): Notebook ID
- `entry_id` (required): Target entry ID
- `file` (required): File data (multipart/form-data)
- `filename` (required): Original filename
**Returns:** Attachment ID and upload confirmation
**Example using requests library:**
```python
import requests
url = f'{api_url}/entries/upload_attachment'
files = {'file': open('/path/to/data.csv', 'rb')}
params = {
'uid': '12345',
'nbid': '67890',
'entry_id': '11111',
'filename': 'data.csv',
'access_key_id': access_key_id,
'access_password': access_password
}
response = requests.post(url, files=files, data=params)
```
### Site Reports API Class
Enterprise-only features for institutional reporting and analytics.
#### `site_reports/detailed_usage_report`
Generate comprehensive usage statistics for the institution.
**Parameters:**
- `start_date` (required): Report start date (YYYY-MM-DD)
- `end_date` (required): Report end date (YYYY-MM-DD)
- `format` (optional): Output format (csv, json, xml)
**Returns:** Usage metrics including:
- User login frequency
- Entry creation counts
- Storage utilization
- Collaboration statistics
- Time-based activity patterns
#### `site_reports/detailed_notebook_report`
Generate detailed report on all notebooks in the institution.
**Parameters:**
- `include_settings` (optional, default: false): Include notebook settings
- `include_members` (optional, default: false): Include member lists
**Returns:** Notebook inventory with:
- Notebook names and IDs
- Owner information
- Creation and last modified dates
- Member count and access levels
- Storage size
- Settings (if requested)
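**Example** (illustrative, assuming the same initialized `client` as above):
```python
response = client.make_call('site_reports', 'detailed_notebook_report', params={
    'include_settings': 'true',
    'include_members': 'true'
})
with open('notebook_inventory.xml', 'wb') as f:
    f.write(response.content)
```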
#### `site_reports/pdf_offline_generation_report`
Track PDF exports for compliance and auditing purposes.
**Parameters:**
- `start_date` (required): Report start date
- `end_date` (required): Report end date
**Returns:** Export activity log with:
- User who generated PDF
- Notebook and entry exported
- Export timestamp
- IP address
### Utilities API Class
#### `utilities/institutional_login_urls`
Retrieve institutional login URLs for SSO integration.
**Parameters:** None required (uses access key authentication)
**Returns:** List of institutional login endpoints
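**Example** (illustrative, assuming the client wrapper accepts an empty parameter set):
```python
response = client.make_call('utilities', 'institutional_login_urls', params={})
print(response.content.decode('utf-8'))
```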
## Response Formats
### XML Response Example
```xml
<?xml version="1.0" encoding="UTF-8"?>
<response>
<uid>12345</uid>
<email>researcher@university.edu</email>
<notebooks>
<notebook>
<nbid>67890</nbid>
<name>Lab Notebook 2025</name>
<role>owner</role>
</notebook>
</notebooks>
</response>
```
### JSON Response Example
```json
{
"uid": "12345",
"email": "researcher@university.edu",
"notebooks": [
{
"nbid": "67890",
"name": "Lab Notebook 2025",
"role": "owner"
}
]
}
```
## Error Codes
| Code | Message | Meaning | Solution |
|------|---------|---------|----------|
| 401 | Unauthorized | Invalid credentials | Verify access_key_id and access_password |
| 403 | Forbidden | Insufficient permissions | Check user role and notebook access |
| 404 | Not Found | Resource doesn't exist | Verify uid, nbid, or entry_id are correct |
| 429 | Too Many Requests | Rate limit exceeded | Implement exponential backoff |
| 500 | Internal Server Error | Server-side issue | Retry request or contact support |
## Rate Limiting
LabArchives implements rate limiting to ensure service stability:
- **Recommended:** Maximum 60 requests per minute per API key
- **Burst allowance:** Short bursts up to 100 requests may be tolerated
- **Best practice:** Implement 1-2 second delays between requests for batch operations
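A minimal client-side throttle with exponential backoff on HTTP 429 (the one-second pacing delay is an assumption consistent with the guidance above; `client` is the same wrapper used elsewhere in this reference):
```python
import time

def throttled_call(client, api_class, api_method, params, max_retries=5, base_delay=1.0):
    """Call the API with fixed pacing and exponential backoff on HTTP 429."""
    for attempt in range(max_retries):
        response = client.make_call(api_class, api_method, params=params)
        if response.status_code != 429:
            time.sleep(base_delay)  # pace batch operations (~60 requests/minute)
            return response
        time.sleep(base_delay * (2 ** attempt))  # back off, then retry
    raise RuntimeError(f"{api_class}/{api_method} still rate-limited after {max_retries} attempts")
```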
## API Versioning
The LabArchives API is backward compatible: new methods are added without breaking existing implementations. Monitor LabArchives announcements for new capabilities.
## Support and Documentation
For API access requests, technical questions, or feature requests:
- Email: support@labarchives.com
- Include your institution name and specific use case for faster assistance

View File

@@ -0,0 +1,357 @@
# LabArchives Authentication Guide
## Prerequisites
### 1. Enterprise License
API access requires an Enterprise LabArchives license. Contact your LabArchives administrator or sales@labarchives.com to:
- Verify your institution has Enterprise access
- Request API access enablement for your account
- Obtain institutional API credentials
### 2. API Credentials
You need two sets of credentials:
#### Institutional API Credentials (from LabArchives administrator)
- **Access Key ID**: Institution-level identifier
- **Access Password**: Institution-level secret
#### User Authentication Credentials (self-configured)
- **Email**: Your LabArchives account email (e.g., researcher@university.edu)
- **External Applications Password**: Set in your LabArchives account settings
## Setting Up External Applications Password
The external applications password is different from your regular LabArchives login password. It provides API access without exposing your primary credentials.
**Steps to create external applications password:**
1. Log into your LabArchives account at mynotebook.labarchives.com (or your institutional URL)
2. Navigate to **Account Settings** (click your name in top-right corner)
3. Select **Security & Privacy** tab
4. Find **External Applications** section
5. Click **Generate New Password** or **Reset Password**
6. Copy and securely store this password (you won't see it again)
7. Use this password for all API authentication
**Security note:** Treat this password like an API token. If compromised, regenerate it immediately from account settings.
## Configuration File Setup
Create a `config.yaml` file to store your credentials securely:
```yaml
# Regional API endpoint
api_url: https://api.labarchives.com/api
# Institutional credentials (from administrator)
access_key_id: YOUR_ACCESS_KEY_ID_HERE
access_password: YOUR_ACCESS_PASSWORD_HERE
# User credentials (for user-specific operations)
user_email: researcher@university.edu
user_external_password: YOUR_EXTERNAL_APP_PASSWORD_HERE
```
**Alternative: Environment variables**
For enhanced security, use environment variables instead of config file:
```bash
export LABARCHIVES_API_URL="https://api.labarchives.com/api"
export LABARCHIVES_ACCESS_KEY_ID="your_key_id"
export LABARCHIVES_ACCESS_PASSWORD="your_access_password"
export LABARCHIVES_USER_EMAIL="researcher@university.edu"
export LABARCHIVES_USER_PASSWORD="your_external_app_password"
```
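A minimal sketch of initializing the client from these variables instead of `config.yaml` (variable names as defined above):
```python
import os
from labarchivespy.client import Client

client = Client(
    os.environ['LABARCHIVES_API_URL'],
    os.environ['LABARCHIVES_ACCESS_KEY_ID'],
    os.environ['LABARCHIVES_ACCESS_PASSWORD']
)
login_params = {
    'login_or_email': os.environ['LABARCHIVES_USER_EMAIL'],
    'password': os.environ['LABARCHIVES_USER_PASSWORD']
}
```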
## Regional Endpoints
Select the correct regional API endpoint for your institution:
| Region | Endpoint | Use if your LabArchives URL is |
|--------|----------|--------------------------------|
| US/International | `https://api.labarchives.com/api` | `mynotebook.labarchives.com` |
| Australia | `https://auapi.labarchives.com/api` | `aunotebook.labarchives.com` |
| UK | `https://ukapi.labarchives.com/api` | `uknotebook.labarchives.com` |
Using the wrong regional endpoint will result in authentication failures even with correct credentials.
## Authentication Flow
### Option 1: Using labarchives-py Python Wrapper
```python
from labarchivespy.client import Client
import yaml
# Load configuration
with open('config.yaml', 'r') as f:
config = yaml.safe_load(f)
# Initialize client with institutional credentials
client = Client(
config['api_url'],
config['access_key_id'],
config['access_password']
)
# Authenticate as specific user to get UID
login_params = {
'login_or_email': config['user_email'],
'password': config['user_external_password']
}
response = client.make_call('users', 'user_access_info', params=login_params)
# Parse response to extract UID
import xml.etree.ElementTree as ET
uid = ET.fromstring(response.content)[0].text
print(f"Authenticated as user ID: {uid}")
```
### Option 2: Direct HTTP Requests with Python requests
```python
import requests
import yaml
# Load configuration
with open('config.yaml', 'r') as f:
config = yaml.safe_load(f)
# Construct API call
url = f"{config['api_url']}/users/user_access_info"
params = {
'access_key_id': config['access_key_id'],
'access_password': config['access_password'],
'login_or_email': config['user_email'],
'password': config['user_external_password']
}
# Make authenticated request
response = requests.get(url, params=params)
if response.status_code == 200:
print("Authentication successful!")
print(response.content.decode('utf-8'))
else:
print(f"Authentication failed: {response.status_code}")
print(response.content.decode('utf-8'))
```
### Option 3: Using R
```r
library(httr)
library(xml2)
# Configuration
api_url <- "https://api.labarchives.com/api"
access_key_id <- "YOUR_ACCESS_KEY_ID"
access_password <- "YOUR_ACCESS_PASSWORD"
user_email <- "researcher@university.edu"
user_external_password <- "YOUR_EXTERNAL_APP_PASSWORD"
# Make authenticated request
response <- GET(
paste0(api_url, "/users/user_access_info"),
query = list(
access_key_id = access_key_id,
access_password = access_password,
login_or_email = user_email,
password = user_external_password
)
)
# Parse response
if (status_code(response) == 200) {
content <- content(response, as = "text", encoding = "UTF-8")
xml_data <- read_xml(content)
uid <- xml_text(xml_find_first(xml_data, "//uid"))
print(paste("Authenticated as user ID:", uid))
} else {
print(paste("Authentication failed:", status_code(response)))
}
```
## OAuth Authentication (New Integrations)
LabArchives now uses OAuth 2.0 for new third-party integrations. Legacy API key authentication (described above) continues to work for direct API access.
**OAuth flow (for app developers):**
1. Register your application with LabArchives
2. Obtain client ID and client secret
3. Implement OAuth 2.0 authorization code flow
4. Exchange authorization code for access token
5. Use access token for API requests
Contact LabArchives developer support for OAuth integration documentation.
## Troubleshooting Authentication Issues
### 401 Unauthorized Error
**Possible causes and solutions:**
1. **Incorrect access_key_id or access_password**
- Verify credentials with your LabArchives administrator
- Check for typos or extra whitespace in config file
2. **Wrong external applications password**
- Confirm you're using the external applications password, not your regular login password
- Regenerate external applications password in account settings
3. **API access not enabled**
- Contact your LabArchives administrator to enable API access for your account
- Verify that your institution has an Enterprise license
4. **Wrong regional endpoint**
- Confirm your api_url matches your institution's LabArchives instance
- Check if you're using .com, .auapi, or .ukapi domain
### 403 Forbidden Error
**Possible causes and solutions:**
1. **Insufficient permissions**
- Verify your account role has necessary permissions
- Check if you have access to the specific notebook (nbid)
2. **Account suspended or expired**
- Contact your LabArchives administrator to check account status
### Network and Connection Issues
**Firewall/proxy configuration:**
If your institution uses a firewall or proxy:
```python
import requests
# Configure proxy
proxies = {
'http': 'http://proxy.university.edu:8080',
'https': 'http://proxy.university.edu:8080'
}
# Make request with proxy
response = requests.get(url, params=params, proxies=proxies)
```
**SSL certificate verification:**
For self-signed certificates (not recommended for production):
```python
# Disable SSL verification (use only for testing)
response = requests.get(url, params=params, verify=False)
```
## Security Best Practices
1. **Never commit credentials to version control**
- Add `config.yaml` to `.gitignore`
- Use environment variables or secret management systems
2. **Rotate credentials regularly**
- Change external applications password every 90 days
- Regenerate API keys annually
3. **Use least privilege principle**
- Request only necessary API permissions
- Create separate API credentials for different applications
4. **Monitor API usage**
- Regularly review API access logs
- Set up alerts for unusual activity
5. **Secure storage**
- Encrypt configuration files at rest
- Use system keychain or secret management tools (e.g., AWS Secrets Manager, Azure Key Vault)
## Testing Authentication
Use this script to verify your authentication setup:
```python
#!/usr/bin/env python3
"""Test LabArchives API authentication"""
from labarchivespy.client import Client
import yaml
import sys
def test_authentication():
try:
# Load config
with open('config.yaml', 'r') as f:
config = yaml.safe_load(f)
print("Configuration loaded successfully")
print(f"API URL: {config['api_url']}")
# Initialize client
client = Client(
config['api_url'],
config['access_key_id'],
config['access_password']
)
print("Client initialized")
# Test authentication
login_params = {
'login_or_email': config['user_email'],
'password': config['user_external_password']
}
response = client.make_call('users', 'user_access_info', params=login_params)
if response.status_code == 200:
print("✅ Authentication successful!")
# Extract UID
import xml.etree.ElementTree as ET
uid = ET.fromstring(response.content)[0].text
print(f"User ID: {uid}")
# Get user info
user_response = client.make_call('users', 'user_info_via_id', params={'uid': uid})
print("✅ User information retrieved successfully")
return True
else:
print(f"❌ Authentication failed: {response.status_code}")
print(response.content.decode('utf-8'))
return False
except Exception as e:
print(f"❌ Error: {str(e)}")
import traceback
traceback.print_exc()
return False
if __name__ == '__main__':
success = test_authentication()
sys.exit(0 if success else 1)
```
Run this script to confirm everything is configured correctly:
```bash
python3 test_auth.py
```
## Getting Help
If authentication continues to fail after troubleshooting:
1. Contact your institutional LabArchives administrator
2. Email LabArchives support: support@labarchives.com
3. Include:
- Your institution name
- Your LabArchives account email
- Error messages and response codes
- Regional endpoint you're using
- Programming language and library versions

View File

@@ -0,0 +1,425 @@
# LabArchives Third-Party Integrations
## Overview
LabArchives integrates with numerous scientific software platforms to streamline research workflows. This document covers programmatic integration approaches, automation strategies, and best practices for each supported platform.
## Integration Categories
### 1. Protocol Management
#### Protocols.io Integration
Export protocols directly from Protocols.io to LabArchives notebooks.
**Use cases:**
- Standardize experimental procedures across lab notebooks
- Maintain version control for protocols
- Link protocols to experimental results
**Setup:**
1. Enable Protocols.io integration in LabArchives settings
2. Authenticate with Protocols.io account
3. Browse and select protocols to export
**Programmatic approach:**
```python
# Export Protocols.io protocol as HTML/PDF
# Then upload to LabArchives via API
def import_protocol_to_labarchives(client, uid, nbid, protocol_id):
"""Import Protocols.io protocol to LabArchives entry"""
# 1. Fetch protocol from Protocols.io API
protocol_data = fetch_protocol_from_protocolsio(protocol_id)
# 2. Create new entry in LabArchives
entry_params = {
'uid': uid,
'nbid': nbid,
'title': f"Protocol: {protocol_data['title']}",
'content': protocol_data['html_content']
}
response = client.make_call('entries', 'create_entry', params=entry_params)
# 3. Add protocol metadata as comment
entry_id = extract_entry_id(response)
comment_params = {
'uid': uid,
'nbid': nbid,
'entry_id': entry_id,
'comment': f"Protocols.io ID: {protocol_id}<br>Version: {protocol_data['version']}"
}
client.make_call('entries', 'create_comment', params=comment_params)
return entry_id
```
**Updated:** September 22, 2025
### 2. Data Analysis Tools
#### GraphPad Prism Integration (Version 8+)
Export analyses, graphs, and figures directly from Prism to LabArchives.
**Use cases:**
- Archive statistical analyses with raw data
- Document figure generation for publications
- Maintain analysis audit trail for compliance
**Setup:**
1. Install GraphPad Prism 8 or higher
2. Configure LabArchives connection in Prism preferences
3. Use "Export to LabArchives" option from File menu
**Programmatic approach:**
```python
# Upload Prism files to LabArchives via API
def upload_prism_analysis(client, uid, nbid, entry_id, prism_file_path):
"""Upload GraphPad Prism file to LabArchives entry"""
    import os
    import requests
url = f'{client.api_url}/entries/upload_attachment'
files = {'file': open(prism_file_path, 'rb')}
params = {
'uid': uid,
'nbid': nbid,
'entry_id': entry_id,
'filename': os.path.basename(prism_file_path),
'access_key_id': client.access_key_id,
'access_password': client.access_password
}
response = requests.post(url, files=files, data=params)
return response
```
**Supported file types:**
- .pzfx (Prism project files)
- .png, .jpg, .pdf (exported graphs)
- .xlsx (exported data tables)
**Updated:** September 8, 2025
### 3. Molecular Biology & Bioinformatics
#### SnapGene Integration
Direct integration for molecular biology workflows, plasmid maps, and sequence analysis.
**Use cases:**
- Document cloning strategies
- Archive plasmid maps with experimental records
- Link sequences to experimental results
**Setup:**
1. Install SnapGene software
2. Enable LabArchives export in SnapGene preferences
3. Use "Send to LabArchives" feature
**File format support:**
- .dna (SnapGene files)
- .gb, .gbk (GenBank format)
- .fasta (sequence files)
- .png, .pdf (plasmid map exports)
**Programmatic workflow:**
```python
def upload_snapgene_file(client, uid, nbid, entry_id, snapgene_file):
"""Upload SnapGene file with preview image"""
# Upload main SnapGene file
upload_attachment(client, uid, nbid, entry_id, snapgene_file)
# Generate and upload preview image (requires SnapGene CLI)
preview_png = generate_snapgene_preview(snapgene_file)
upload_attachment(client, uid, nbid, entry_id, preview_png)
```
#### Geneious Integration
Bioinformatics analysis export from Geneious to LabArchives.
**Use cases:**
- Archive sequence alignments and phylogenetic trees
- Document NGS analysis pipelines
- Link bioinformatics workflows to wet-lab experiments
**Supported exports:**
- Sequence alignments
- Phylogenetic trees
- Assembly reports
- Variant calling results
**File formats:**
- .geneious (Geneious documents)
- .fasta, .fastq (sequence data)
- .bam, .sam (alignment files)
- .vcf (variant files)
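**Programmatic workflow:**
A sketch that attaches an exported Geneious analysis directory to an existing entry, using the same `upload_attachment` helper assumed in the SnapGene example above:
```python
from pathlib import Path

def archive_geneious_export(client, uid, nbid, entry_id, export_dir):
    """Attach every supported Geneious export in a directory to one entry."""
    supported = {'.geneious', '.fasta', '.fastq', '.bam', '.sam', '.vcf'}
    for path in sorted(Path(export_dir).iterdir()):
        if path.suffix.lower() in supported:
            upload_attachment(client, uid, nbid, entry_id, str(path))
```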
### 4. Computational Notebooks
#### Jupyter Integration
Embed Jupyter notebooks as LabArchives entries for reproducible computational research.
**Use cases:**
- Document data analysis workflows
- Archive computational experiments
- Link code, results, and narrative
**Workflow:**
```python
def export_jupyter_to_labarchives(notebook_path, client, uid, nbid):
"""Export Jupyter notebook to LabArchives"""
    import os
    import nbformat
from nbconvert import HTMLExporter
# Load notebook
with open(notebook_path, 'r') as f:
nb = nbformat.read(f, as_version=4)
# Convert to HTML
html_exporter = HTMLExporter()
html_exporter.template_name = 'classic'
(body, resources) = html_exporter.from_notebook_node(nb)
# Create entry in LabArchives
entry_params = {
'uid': uid,
'nbid': nbid,
'title': f"Jupyter Notebook: {os.path.basename(notebook_path)}",
'content': body
}
response = client.make_call('entries', 'create_entry', params=entry_params)
# Upload original .ipynb file as attachment
entry_id = extract_entry_id(response)
upload_attachment(client, uid, nbid, entry_id, notebook_path)
return entry_id
```
**Best practices:**
- Export with outputs included (Run All Cells before export)
- Include environment.yml or requirements.txt as attachment
- Add execution timestamp and system info in comments
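A sketch of the last two practices, recording the execution environment as a comment on the entry created by `export_jupyter_to_labarchives` (assumes the same `client`, `uid`, `nbid`, and returned `entry_id`):
```python
import platform
import sys
from datetime import datetime

env_comment = (
    f"Executed: {datetime.now().isoformat()}<br>"
    f"Python: {sys.version.split()[0]}<br>"
    f"Platform: {platform.platform()}"
)
client.make_call('entries', 'create_comment', params={
    'uid': uid, 'nbid': nbid, 'entry_id': entry_id, 'comment': env_comment
})
```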
### 5. Clinical Research
#### REDCap Integration
Clinical data capture integration with LabArchives for research compliance and audit trails.
**Use cases:**
- Link clinical data collection to research notebooks
- Maintain audit trails for regulatory compliance
- Document clinical trial protocols and amendments
**Integration approach:**
- REDCap API exports data to LabArchives entries
- Automated data synchronization for longitudinal studies
- HIPAA-compliant data handling
**Example workflow:**
```python
def sync_redcap_to_labarchives(redcap_api_token, client, uid, nbid):
"""Sync REDCap data to LabArchives"""
# Fetch REDCap data
redcap_data = fetch_redcap_data(redcap_api_token)
# Create LabArchives entry
entry_params = {
'uid': uid,
'nbid': nbid,
'title': f"REDCap Data Export {datetime.now().strftime('%Y-%m-%d')}",
'content': format_redcap_data_html(redcap_data)
}
response = client.make_call('entries', 'create_entry', params=entry_params)
return response
```
**Compliance features:**
- 21 CFR Part 11 compliance
- Audit trail maintenance
- Data integrity verification
### 6. Research Publishing
#### Qeios Integration
Research publishing platform integration for preprints and peer review.
**Use cases:**
- Export research findings to preprint servers
- Document publication workflows
- Link published articles to lab notebooks
**Workflow:**
- Export formatted entries from LabArchives
- Submit to Qeios platform
- Maintain bidirectional links between notebook and publication
#### SciSpace Integration
Literature management and citation integration.
**Use cases:**
- Link references to experimental procedures
- Maintain literature review in notebooks
- Generate bibliographies for reports
**Features:**
- Citation import from SciSpace to LabArchives
- PDF annotation synchronization
- Reference management
## OAuth Authentication for Integrations
LabArchives now uses OAuth 2.0 for new third-party integrations.
**OAuth flow for app developers:**
```python
def labarchives_oauth_flow(client_id, client_secret, redirect_uri):
"""Implement OAuth 2.0 flow for LabArchives integration"""
import requests
# Step 1: Get authorization code
auth_url = "https://mynotebook.labarchives.com/oauth/authorize"
auth_params = {
'client_id': client_id,
'redirect_uri': redirect_uri,
'response_type': 'code',
'scope': 'read write'
}
# User visits auth_url and grants permission
# Step 2: Exchange code for access token
token_url = "https://mynotebook.labarchives.com/oauth/token"
token_params = {
'client_id': client_id,
'client_secret': client_secret,
'redirect_uri': redirect_uri,
'grant_type': 'authorization_code',
'code': authorization_code # From redirect
}
response = requests.post(token_url, data=token_params)
tokens = response.json()
return tokens['access_token'], tokens['refresh_token']
```
**OAuth advantages:**
- More secure than API keys
- Fine-grained permission control
- Token refresh for long-running integrations
- Revocable access
## Custom Integration Development
### General Workflow
For tools not officially supported, develop custom integrations:
1. **Export data** from source application (API or file export)
2. **Transform format** to HTML or supported file type
3. **Authenticate** with LabArchives API
4. **Create entry** or upload attachment
5. **Add metadata** via comments for traceability
### Example: Custom Integration Template
```python
class LabArchivesIntegration:
"""Template for custom LabArchives integrations"""
def __init__(self, config_path):
self.client = self._init_client(config_path)
self.uid = self._authenticate()
def _init_client(self, config_path):
"""Initialize LabArchives client"""
with open(config_path) as f:
config = yaml.safe_load(f)
return Client(config['api_url'],
config['access_key_id'],
config['access_password'])
def _authenticate(self):
"""Get user ID"""
# Implementation from authentication_guide.md
pass
def export_data(self, source_data, nbid, title):
"""Export data to LabArchives"""
# Transform data to HTML
html_content = self._transform_to_html(source_data)
# Create entry
params = {
'uid': self.uid,
'nbid': nbid,
'title': title,
'content': html_content
}
response = self.client.make_call('entries', 'create_entry', params=params)
return extract_entry_id(response)
def _transform_to_html(self, data):
"""Transform data to HTML format"""
# Custom transformation logic
pass
```
## Integration Best Practices
1. **Version control:** Track which software version generated the data
2. **Metadata preservation:** Include timestamps, user info, and processing parameters
3. **File format standards:** Use open formats when possible (CSV, JSON, HTML)
4. **Batch operations:** Implement rate limiting for bulk uploads
5. **Error handling:** Implement retry logic with exponential backoff
6. **Audit trails:** Log all API operations for compliance
7. **Testing:** Validate integrations in test notebooks before production use
## Troubleshooting Integrations
### Common Issues
**Integration not appearing in LabArchives:**
- Verify integration is enabled by administrator
- Check OAuth permissions if using OAuth
- Ensure compatible software version
**File upload failures:**
- Verify file size limits (typically 2GB per file)
- Check file format compatibility
- Ensure sufficient storage quota
**Authentication errors:**
- Verify API credentials are current
- Check if integration-specific tokens have expired
- Confirm user has necessary permissions
### Integration Support
For integration-specific issues:
- Check software vendor documentation (e.g., GraphPad, Protocols.io)
- Contact LabArchives support: support@labarchives.com
- Review LabArchives knowledge base: help.labarchives.com
## Future Integration Opportunities
Potential integrations for custom development:
- Electronic data capture (EDC) systems
- Laboratory information management systems (LIMS)
- Instrument data systems (chromatography, spectroscopy)
- Cloud storage platforms (Box, Dropbox, Google Drive)
- Project management tools (Asana, Monday.com)
- Grant management systems
For custom integration development, contact LabArchives for API partnership opportunities.

View File

@@ -0,0 +1,334 @@
#!/usr/bin/env python3
"""
LabArchives Entry Operations
Utilities for creating entries, uploading attachments, and managing notebook content.
"""
import argparse
import sys
import yaml
import os
from pathlib import Path
from datetime import datetime
def load_config(config_path='config.yaml'):
"""Load configuration from YAML file"""
try:
with open(config_path, 'r') as f:
return yaml.safe_load(f)
except FileNotFoundError:
print(f"❌ Configuration file not found: {config_path}")
print(" Run setup_config.py first to create configuration")
sys.exit(1)
except Exception as e:
print(f"❌ Error loading configuration: {e}")
sys.exit(1)
def init_client(config):
"""Initialize LabArchives API client"""
try:
from labarchivespy.client import Client
return Client(
config['api_url'],
config['access_key_id'],
config['access_password']
)
except ImportError:
print("❌ labarchives-py package not installed")
print(" Install with: pip install git+https://github.com/mcmero/labarchives-py")
sys.exit(1)
def get_user_id(client, config):
"""Get user ID via authentication"""
import xml.etree.ElementTree as ET
login_params = {
'login_or_email': config['user_email'],
'password': config['user_external_password']
}
try:
response = client.make_call('users', 'user_access_info', params=login_params)
if response.status_code == 200:
uid = ET.fromstring(response.content)[0].text
return uid
else:
print(f"❌ Authentication failed: HTTP {response.status_code}")
print(f" Response: {response.content.decode('utf-8')[:200]}")
sys.exit(1)
except Exception as e:
print(f"❌ Error during authentication: {e}")
sys.exit(1)
def create_entry(client, uid, nbid, title, content=None, date=None):
"""Create a new entry in a notebook"""
print(f"\n📝 Creating entry: {title}")
# Prepare parameters
params = {
'uid': uid,
'nbid': nbid,
'title': title
}
if content:
# Ensure content is HTML formatted
if not content.startswith('<'):
content = f'<p>{content}</p>'
params['content'] = content
if date:
params['date'] = date
try:
response = client.make_call('entries', 'create_entry', params=params)
if response.status_code == 200:
print("✅ Entry created successfully")
# Try to extract entry ID from response
try:
import xml.etree.ElementTree as ET
root = ET.fromstring(response.content)
entry_id = root.find('.//entry_id')
if entry_id is not None:
print(f" Entry ID: {entry_id.text}")
return entry_id.text
except:
pass
return True
else:
print(f"❌ Entry creation failed: HTTP {response.status_code}")
print(f" Response: {response.content.decode('utf-8')[:200]}")
return None
except Exception as e:
print(f"❌ Error creating entry: {e}")
return None
def create_comment(client, uid, nbid, entry_id, comment):
"""Add a comment to an existing entry"""
print(f"\n💬 Adding comment to entry {entry_id}")
params = {
'uid': uid,
'nbid': nbid,
'entry_id': entry_id,
'comment': comment
}
try:
response = client.make_call('entries', 'create_comment', params=params)
if response.status_code == 200:
print("✅ Comment added successfully")
return True
else:
print(f"❌ Comment creation failed: HTTP {response.status_code}")
return False
except Exception as e:
print(f"❌ Error creating comment: {e}")
return False
def upload_attachment(client, config, uid, nbid, entry_id, file_path):
"""Upload a file attachment to an entry"""
import requests
file_path = Path(file_path)
if not file_path.exists():
print(f"❌ File not found: {file_path}")
return False
print(f"\n📎 Uploading attachment: {file_path.name}")
print(f" Size: {file_path.stat().st_size / 1024:.2f} KB")
url = f"{config['api_url']}/entries/upload_attachment"
try:
with open(file_path, 'rb') as f:
files = {'file': f}
data = {
'uid': uid,
'nbid': nbid,
'entry_id': entry_id,
'filename': file_path.name,
'access_key_id': config['access_key_id'],
'access_password': config['access_password']
}
response = requests.post(url, files=files, data=data)
if response.status_code == 200:
print("✅ Attachment uploaded successfully")
return True
else:
print(f"❌ Upload failed: HTTP {response.status_code}")
print(f" Response: {response.content.decode('utf-8')[:200]}")
return False
except Exception as e:
print(f"❌ Error uploading attachment: {e}")
return False
def batch_upload(client, config, uid, nbid, entry_id, directory):
"""Upload all files from a directory as attachments"""
directory = Path(directory)
if not directory.is_dir():
print(f"❌ Directory not found: {directory}")
return
files = list(directory.glob('*'))
files = [f for f in files if f.is_file()]
if not files:
print(f"❌ No files found in {directory}")
return
print(f"\n📦 Batch uploading {len(files)} files from {directory}")
successful = 0
failed = 0
for file_path in files:
if upload_attachment(client, config, uid, nbid, entry_id, file_path):
successful += 1
else:
failed += 1
print("\n" + "="*60)
print(f"Batch upload complete: {successful} successful, {failed} failed")
print("="*60)
def create_entry_with_attachments(client, config, uid, nbid, title, content,
attachments):
"""Create entry and upload multiple attachments"""
# Create entry
entry_id = create_entry(client, uid, nbid, title, content)
if not entry_id:
print("❌ Cannot upload attachments without entry ID")
return False
# Upload attachments
for attachment_path in attachments:
upload_attachment(client, config, uid, nbid, entry_id, attachment_path)
return True
def main():
"""Main command-line interface"""
parser = argparse.ArgumentParser(
description='LabArchives Entry Operations',
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog="""
Examples:
# Create simple entry
python3 entry_operations.py create --nbid 12345 --title "Experiment Results"
# Create entry with content
python3 entry_operations.py create --nbid 12345 --title "Results" \\
--content "PCR amplification successful"
# Create entry with HTML content
python3 entry_operations.py create --nbid 12345 --title "Results" \\
--content "<p>Results:</p><ul><li>Sample A: Positive</li></ul>"
# Upload attachment to existing entry
python3 entry_operations.py upload --nbid 12345 --entry-id 67890 \\
--file data.csv
# Batch upload multiple files
python3 entry_operations.py batch-upload --nbid 12345 --entry-id 67890 \\
--directory ./experiment_data/
# Add comment to entry
python3 entry_operations.py comment --nbid 12345 --entry-id 67890 \\
--text "Follow-up analysis needed"
"""
)
parser.add_argument('--config', default='config.yaml',
help='Path to configuration file (default: config.yaml)')
    subparsers = parser.add_subparsers(dest='command', help='Command to execute')
    # --nbid is defined on each subcommand so it can follow the subcommand name,
    # as shown in the usage examples above
    # Create entry command
    create_parser = subparsers.add_parser('create', help='Create new entry')
    create_parser.add_argument('--nbid', required=True, help='Notebook ID')
    create_parser.add_argument('--title', required=True, help='Entry title')
    create_parser.add_argument('--content', help='Entry content (HTML supported)')
    create_parser.add_argument('--date', help='Entry date (YYYY-MM-DD)')
    create_parser.add_argument('--attachments', nargs='+',
                               help='Files to attach to the new entry')
    # Upload attachment command
    upload_parser = subparsers.add_parser('upload', help='Upload attachment to entry')
    upload_parser.add_argument('--nbid', required=True, help='Notebook ID')
    upload_parser.add_argument('--entry-id', required=True, help='Entry ID')
    upload_parser.add_argument('--file', required=True, help='File to upload')
    # Batch upload command
    batch_parser = subparsers.add_parser('batch-upload',
                                         help='Upload all files from directory')
    batch_parser.add_argument('--nbid', required=True, help='Notebook ID')
    batch_parser.add_argument('--entry-id', required=True, help='Entry ID')
    batch_parser.add_argument('--directory', required=True,
                              help='Directory containing files to upload')
    # Comment command
    comment_parser = subparsers.add_parser('comment', help='Add comment to entry')
    comment_parser.add_argument('--nbid', required=True, help='Notebook ID')
    comment_parser.add_argument('--entry-id', required=True, help='Entry ID')
    comment_parser.add_argument('--text', required=True, help='Comment text')
args = parser.parse_args()
if not args.command:
parser.print_help()
sys.exit(1)
# Load configuration and initialize
config = load_config(args.config)
client = init_client(config)
uid = get_user_id(client, config)
# Execute command
if args.command == 'create':
if args.attachments:
create_entry_with_attachments(
client, config, uid, args.nbid, args.title,
args.content, args.attachments
)
else:
create_entry(client, uid, args.nbid, args.title,
args.content, args.date)
elif args.command == 'upload':
upload_attachment(client, config, uid, args.nbid,
args.entry_id, args.file)
elif args.command == 'batch-upload':
batch_upload(client, config, uid, args.nbid,
args.entry_id, args.directory)
elif args.command == 'comment':
create_comment(client, uid, args.nbid, args.entry_id, args.text)
if __name__ == '__main__':
main()

View File

@@ -0,0 +1,269 @@
#!/usr/bin/env python3
"""
LabArchives Notebook Operations
Utilities for listing, backing up, and managing LabArchives notebooks.
"""
import argparse
import sys
import yaml
from datetime import datetime
from pathlib import Path
def load_config(config_path='config.yaml'):
"""Load configuration from YAML file"""
try:
with open(config_path, 'r') as f:
return yaml.safe_load(f)
except FileNotFoundError:
print(f"❌ Configuration file not found: {config_path}")
print(" Run setup_config.py first to create configuration")
sys.exit(1)
except Exception as e:
print(f"❌ Error loading configuration: {e}")
sys.exit(1)
def init_client(config):
"""Initialize LabArchives API client"""
try:
from labarchivespy.client import Client
return Client(
config['api_url'],
config['access_key_id'],
config['access_password']
)
except ImportError:
print("❌ labarchives-py package not installed")
print(" Install with: pip install git+https://github.com/mcmero/labarchives-py")
sys.exit(1)
def get_user_id(client, config):
"""Get user ID via authentication"""
import xml.etree.ElementTree as ET
login_params = {
'login_or_email': config['user_email'],
'password': config['user_external_password']
}
try:
response = client.make_call('users', 'user_access_info', params=login_params)
if response.status_code == 200:
uid = ET.fromstring(response.content)[0].text
return uid
else:
print(f"❌ Authentication failed: HTTP {response.status_code}")
print(f" Response: {response.content.decode('utf-8')[:200]}")
sys.exit(1)
except Exception as e:
print(f"❌ Error during authentication: {e}")
sys.exit(1)
def list_notebooks(client, uid):
"""List all accessible notebooks for a user"""
import xml.etree.ElementTree as ET
print(f"\n📚 Listing notebooks for user ID: {uid}\n")
# Get user access info which includes notebook list
login_params = {'uid': uid}
try:
response = client.make_call('users', 'user_access_info', params=login_params)
if response.status_code == 200:
root = ET.fromstring(response.content)
notebooks = root.findall('.//notebook')
if not notebooks:
print("No notebooks found")
return []
notebook_list = []
print(f"{'Notebook ID':<15} {'Name':<40} {'Role':<10}")
print("-" * 70)
for nb in notebooks:
nbid = nb.find('nbid').text if nb.find('nbid') is not None else 'N/A'
name = nb.find('name').text if nb.find('name') is not None else 'Unnamed'
role = nb.find('role').text if nb.find('role') is not None else 'N/A'
notebook_list.append({'nbid': nbid, 'name': name, 'role': role})
print(f"{nbid:<15} {name:<40} {role:<10}")
print(f"\nTotal notebooks: {len(notebooks)}")
return notebook_list
else:
print(f"❌ Failed to list notebooks: HTTP {response.status_code}")
return []
except Exception as e:
print(f"❌ Error listing notebooks: {e}")
return []
def backup_notebook(client, uid, nbid, output_dir='backups', json_format=False,
no_attachments=False):
"""Backup a notebook"""
print(f"\n💾 Backing up notebook {nbid}...")
# Create output directory
output_path = Path(output_dir)
output_path.mkdir(exist_ok=True)
# Prepare parameters
params = {
'uid': uid,
'nbid': nbid,
'json': 'true' if json_format else 'false',
'no_attachments': 'true' if no_attachments else 'false'
}
try:
response = client.make_call('notebooks', 'notebook_backup', params=params)
if response.status_code == 200:
# Determine file extension
timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
if no_attachments:
ext = 'json' if json_format else 'xml'
filename = f"notebook_{nbid}_{timestamp}.{ext}"
else:
filename = f"notebook_{nbid}_{timestamp}.7z"
output_file = output_path / filename
# Write to file
with open(output_file, 'wb') as f:
f.write(response.content)
file_size = output_file.stat().st_size / (1024 * 1024) # MB
print(f"✅ Backup saved: {output_file}")
print(f" File size: {file_size:.2f} MB")
return str(output_file)
else:
print(f"❌ Backup failed: HTTP {response.status_code}")
print(f" Response: {response.content.decode('utf-8')[:200]}")
return None
except Exception as e:
print(f"❌ Error during backup: {e}")
return None
def backup_all_notebooks(client, uid, output_dir='backups', json_format=False,
no_attachments=False):
"""Backup all accessible notebooks"""
print("\n📦 Backing up all notebooks...\n")
notebooks = list_notebooks(client, uid)
if not notebooks:
print("No notebooks to backup")
return
successful = 0
failed = 0
for nb in notebooks:
nbid = nb['nbid']
name = nb['name']
print(f"\n--- Backing up: {name} (ID: {nbid}) ---")
result = backup_notebook(client, uid, nbid, output_dir, json_format, no_attachments)
if result:
successful += 1
else:
failed += 1
print("\n" + "="*60)
print(f"Backup complete: {successful} successful, {failed} failed")
print("="*60)
def main():
"""Main command-line interface"""
parser = argparse.ArgumentParser(
description='LabArchives Notebook Operations',
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog="""
Examples:
# List all notebooks
python3 notebook_operations.py list
# Backup specific notebook
python3 notebook_operations.py backup --nbid 12345
# Backup all notebooks (JSON format, no attachments)
python3 notebook_operations.py backup-all --json --no-attachments
# Backup to custom directory
python3 notebook_operations.py backup --nbid 12345 --output my_backups/
"""
)
parser.add_argument('--config', default='config.yaml',
help='Path to configuration file (default: config.yaml)')
subparsers = parser.add_subparsers(dest='command', help='Command to execute')
# List command
subparsers.add_parser('list', help='List all accessible notebooks')
# Backup command
backup_parser = subparsers.add_parser('backup', help='Backup a specific notebook')
backup_parser.add_argument('--nbid', required=True, help='Notebook ID to backup')
backup_parser.add_argument('--output', default='backups',
help='Output directory (default: backups)')
backup_parser.add_argument('--json', action='store_true',
help='Return data in JSON format instead of XML')
backup_parser.add_argument('--no-attachments', action='store_true',
help='Exclude attachments from backup')
# Backup all command
backup_all_parser = subparsers.add_parser('backup-all',
help='Backup all accessible notebooks')
backup_all_parser.add_argument('--output', default='backups',
help='Output directory (default: backups)')
backup_all_parser.add_argument('--json', action='store_true',
help='Return data in JSON format instead of XML')
backup_all_parser.add_argument('--no-attachments', action='store_true',
help='Exclude attachments from backup')
args = parser.parse_args()
if not args.command:
parser.print_help()
sys.exit(1)
# Load configuration and initialize
config = load_config(args.config)
client = init_client(config)
uid = get_user_id(client, config)
# Execute command
if args.command == 'list':
list_notebooks(client, uid)
elif args.command == 'backup':
backup_notebook(client, uid, args.nbid, args.output, args.json, args.no_attachments)
elif args.command == 'backup-all':
backup_all_notebooks(client, uid, args.output, args.json, args.no_attachments)
if __name__ == '__main__':
main()

View File

@@ -0,0 +1,205 @@
#!/usr/bin/env python3
"""
LabArchives Configuration Setup Script
This script helps create a config.yaml file with necessary credentials
for LabArchives API access.
"""
import yaml
import os
from pathlib import Path
def get_regional_endpoint():
"""Prompt user to select regional API endpoint"""
print("\nSelect your regional API endpoint:")
print("1. US/International (mynotebook.labarchives.com)")
print("2. Australia (aunotebook.labarchives.com)")
print("3. UK (uknotebook.labarchives.com)")
print("4. Custom endpoint")
choice = input("\nEnter choice (1-4): ").strip()
endpoints = {
'1': 'https://api.labarchives.com/api',
'2': 'https://auapi.labarchives.com/api',
'3': 'https://ukapi.labarchives.com/api'
}
if choice in endpoints:
return endpoints[choice]
elif choice == '4':
return input("Enter custom API endpoint URL: ").strip()
else:
print("Invalid choice, defaulting to US/International")
return endpoints['1']
def get_credentials():
"""Prompt user for API credentials"""
print("\n" + "="*60)
print("LabArchives API Credentials")
print("="*60)
print("\nYou need two sets of credentials:")
print("1. Institutional API credentials (from LabArchives administrator)")
print("2. User authentication credentials (from your account settings)")
print()
# Institutional credentials
print("Institutional Credentials:")
access_key_id = input(" Access Key ID: ").strip()
access_password = input(" Access Password: ").strip()
# User credentials
print("\nUser Credentials:")
user_email = input(" Your LabArchives email: ").strip()
print("\nExternal Applications Password:")
print("(Set this in your LabArchives Account Settings → Security & Privacy)")
user_password = input(" External Applications Password: ").strip()
return {
'access_key_id': access_key_id,
'access_password': access_password,
'user_email': user_email,
'user_external_password': user_password
}
def create_config_file(config_data, output_path='config.yaml'):
"""Create YAML configuration file"""
with open(output_path, 'w') as f:
yaml.dump(config_data, f, default_flow_style=False, sort_keys=False)
# Set file permissions to user read/write only for security
os.chmod(output_path, 0o600)
print(f"\n✅ Configuration saved to: {os.path.abspath(output_path)}")
print(" File permissions set to 600 (user read/write only)")
def verify_config(config_path='config.yaml'):
"""Verify configuration file can be loaded"""
try:
with open(config_path, 'r') as f:
config = yaml.safe_load(f)
required_keys = ['api_url', 'access_key_id', 'access_password',
'user_email', 'user_external_password']
missing = [key for key in required_keys if key not in config or not config[key]]
if missing:
print(f"\n⚠️ Warning: Missing required fields: {', '.join(missing)}")
return False
print("\n✅ Configuration file verified successfully")
return True
except Exception as e:
print(f"\n❌ Error verifying configuration: {e}")
return False
def test_authentication(config_path='config.yaml'):
"""Test authentication with LabArchives API"""
print("\nWould you like to test the connection? (requires labarchives-py package)")
test = input("Test connection? (y/n): ").strip().lower()
if test != 'y':
return
try:
# Try to import labarchives-py
from labarchivespy.client import Client
import xml.etree.ElementTree as ET
# Load config
with open(config_path, 'r') as f:
config = yaml.safe_load(f)
# Initialize client
print("\nInitializing client...")
client = Client(
config['api_url'],
config['access_key_id'],
config['access_password']
)
# Test authentication
print("Testing authentication...")
login_params = {
'login_or_email': config['user_email'],
'password': config['user_external_password']
}
response = client.make_call('users', 'user_access_info', params=login_params)
if response.status_code == 200:
# Extract UID
uid = ET.fromstring(response.content)[0].text
print(f"\n✅ Authentication successful!")
print(f" User ID: {uid}")
# Get notebook count
root = ET.fromstring(response.content)
notebooks = root.findall('.//notebook')
print(f" Accessible notebooks: {len(notebooks)}")
else:
print(f"\n❌ Authentication failed: HTTP {response.status_code}")
print(f" Response: {response.content.decode('utf-8')[:200]}")
except ImportError:
print("\n⚠️ labarchives-py package not installed")
print(" Install with: pip install git+https://github.com/mcmero/labarchives-py")
except Exception as e:
print(f"\n❌ Connection test failed: {e}")
def main():
"""Main setup workflow"""
print("="*60)
print("LabArchives API Configuration Setup")
print("="*60)
# Check if config already exists
if os.path.exists('config.yaml'):
print("\n⚠️ config.yaml already exists")
overwrite = input("Overwrite existing configuration? (y/n): ").strip().lower()
if overwrite != 'y':
print("Setup cancelled")
return
# Get configuration
api_url = get_regional_endpoint()
credentials = get_credentials()
# Combine configuration
config_data = {
'api_url': api_url,
**credentials
}
# Create config file
create_config_file(config_data)
# Verify
verify_config()
# Test connection
test_authentication()
print("\n" + "="*60)
print("Setup complete!")
print("="*60)
print("\nNext steps:")
print("1. Add config.yaml to .gitignore if using version control")
print("2. Use notebook_operations.py to list and backup notebooks")
print("3. Use entry_operations.py to create entries and upload files")
print("\nFor more information, see references/authentication_guide.md")
if __name__ == '__main__':
main()

View File

@@ -0,0 +1,241 @@
---
name: omero-integration
description: Toolkit for interacting with OMERO microscopy data management systems using Python. Use this skill when working with microscopy images stored in OMERO servers, retrieving datasets and screening data, analyzing pixel data from scientific images, creating or managing annotations and metadata, working with regions of interest (ROIs), batch processing images, creating OMERO scripts, or integrating OMERO data into computational workflows. Essential for researchers working with high-content screening data, multi-dimensional microscopy datasets, or collaborative image repositories.
---
# OMERO Integration
## Overview
OMERO is an open-source client-server platform for managing, visualizing, and analyzing microscopy images and associated metadata. This skill provides comprehensive guidance for using OMERO's Python API (omero-py) to programmatically interact with OMERO servers for data retrieval, analysis, and management.
## Core Capabilities
This skill covers eight major capability areas. Each is documented in detail in the references/ directory:
### 1. Connection & Session Management
**File**: `references/connection.md`
Establish secure connections to OMERO servers, manage sessions, handle authentication, and work with group contexts. Use this for initial setup and connection patterns.
**Common scenarios:**
- Connect to OMERO server with credentials
- Use existing session IDs
- Switch between group contexts
- Manage connection lifecycle with context managers
### 2. Data Access & Retrieval
**File**: `references/data_access.md`
Navigate OMERO's hierarchical data structure (Projects → Datasets → Images) and screening data (Screens → Plates → Wells). Retrieve objects, query by attributes, and access metadata.
**Common scenarios:**
- List all projects and datasets for a user
- Retrieve images by ID or dataset
- Access screening plate data
- Query objects with filters
### 3. Metadata & Annotations
**File**: `references/metadata.md`
Create and manage annotations including tags, key-value pairs, file attachments, and comments. Link annotations to images, datasets, or other objects.
**Common scenarios:**
- Add tags to images
- Attach analysis results as files
- Create custom key-value metadata
- Query annotations by namespace
### 4. Image Processing & Rendering
**File**: `references/image_processing.md`
Access raw pixel data as NumPy arrays, manipulate rendering settings, create derived images, and manage physical dimensions.
**Common scenarios:**
- Extract pixel data for computational analysis
- Generate thumbnail images
- Create maximum intensity projections
- Modify channel rendering settings
### 5. Regions of Interest (ROIs)
**File**: `references/rois.md`
Create, retrieve, and analyze ROIs with various shapes (rectangles, ellipses, polygons, masks, points, lines). Extract intensity statistics from ROI regions.
**Common scenarios:**
- Draw rectangular ROIs on images
- Create polygon masks for segmentation
- Analyze pixel intensities within ROIs
- Export ROI coordinates
### 6. OMERO Tables
**File**: `references/tables.md`
Store and query structured tabular data associated with OMERO objects. Useful for analysis results, measurements, and metadata.
**Common scenarios:**
- Store quantitative measurements for images
- Create tables with multiple column types
- Query table data with conditions
- Link tables to specific images or datasets
### 7. Scripts & Batch Operations
**File**: `references/scripts.md`
Create OMERO.scripts that run server-side for batch processing, automated workflows, and integration with OMERO clients.
**Common scenarios:**
- Process multiple images in batch
- Create automated analysis pipelines
- Generate summary statistics across datasets
- Export data in custom formats
### 8. Advanced Features
**File**: `references/advanced.md`
Covers permissions, filesets, cross-group queries, delete operations, and other advanced functionality.
**Common scenarios:**
- Handle group permissions
- Access original imported files
- Perform cross-group queries
- Delete objects with callbacks
## Installation
Install the OMERO Python bindings using pip or conda:
```bash
# Using pip
pip install omero-py
# Using conda
conda install -c conda-forge omero-py
```
**Requirements:**
- Python 3.7+
- ZeroC Ice 3.6+
- Access to an OMERO server (host, port, credentials)
**Best practice:** Use a Python virtual environment (venv, conda, or mamba) to isolate dependencies.
## Quick Start
Basic connection pattern:
```python
from omero.gateway import BlitzGateway
# Connect to OMERO server
conn = BlitzGateway(username, password, host=host, port=port)
connected = conn.connect()
if connected:
# Perform operations
for project in conn.listProjects():
print(project.getName())
# Always close connection
conn.close()
else:
print("Connection failed")
```
**Recommended pattern with context manager:**
```python
from omero.gateway import BlitzGateway
with BlitzGateway(username, password, host=host, port=port) as conn:
# Connection automatically managed
for project in conn.listProjects():
print(project.getName())
# Automatically closed on exit
```
## Selecting the Right Capability
**For data exploration:**
- Start with `references/connection.md` to establish connection
- Use `references/data_access.md` to navigate hierarchy
- Check `references/metadata.md` for annotation details
**For image analysis:**
- Use `references/image_processing.md` for pixel data access
- Use `references/rois.md` for region-based analysis
- Use `references/tables.md` to store results
**For automation:**
- Use `references/scripts.md` for server-side processing
- Use `references/data_access.md` for batch data retrieval
**For advanced operations:**
- Use `references/advanced.md` for permissions and deletion
- Check `references/connection.md` for cross-group queries
## Common Workflows
### Workflow 1: Retrieve and Analyze Images
1. Connect to OMERO server (`references/connection.md`)
2. Navigate to dataset (`references/data_access.md`)
3. Retrieve images from dataset (`references/data_access.md`)
4. Access pixel data as NumPy array (`references/image_processing.md`)
5. Perform analysis
6. Store results as a table or file annotation (`references/tables.md` or `references/metadata.md`); a connecting sketch follows below
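A minimal sketch that strings these steps together. The server details and `DATASET_ID` are placeholders, and the "analysis" is just a per-plane mean intensity; adapt both to the task at hand:
```python
import numpy as np
from omero.gateway import BlitzGateway

HOST, PORT = 'omero.example.com', 4064      # placeholder server details
USERNAME, PASSWORD = 'user', 'pass'         # placeholder credentials
DATASET_ID = 123                            # placeholder dataset ID

with BlitzGateway(USERNAME, PASSWORD, host=HOST, port=PORT) as conn:
    dataset = conn.getObject("Dataset", DATASET_ID)
    results = []
    for image in dataset.listChildren():
        pixels = image.getPrimaryPixels()
        # Middle Z-section, first channel, first timepoint as a NumPy array
        plane = pixels.getPlane(image.getSizeZ() // 2, 0, 0)
        results.append((image.getId(), image.getName(), float(np.mean(plane))))
    # Step 6: persist `results` as an OMERO table or file annotation
    # (see references/tables.md and references/metadata.md)
    for image_id, name, mean_intensity in results:
        print(f"{image_id}\t{name}\tmean={mean_intensity:.2f}")
```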
### Workflow 2: Batch ROI Analysis
1. Connect to OMERO server
2. Retrieve images with existing ROIs (`references/rois.md`)
3. For each image, get ROI shapes
4. Extract pixel intensities within ROIs (`references/rois.md`)
5. Store measurements in an OMERO table (`references/tables.md`); a connecting sketch follows below
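A minimal sketch of the ROI half of this workflow, using the ROI service calls shown in `references/advanced.md`. The image IDs are placeholders, and the `shapeId` and per-channel `mean` fields on the returned stats objects are assumptions based on the ShapeStats structure, so verify them against your server version; writing the measurements to an OMERO table is covered in `references/tables.md`:
```python
from omero.gateway import BlitzGateway

HOST, PORT = 'omero.example.com', 4064      # placeholder server details
USERNAME, PASSWORD = 'user', 'pass'         # placeholder credentials
IMAGE_IDS = [101, 102, 103]                 # placeholder images with existing ROIs

with BlitzGateway(USERNAME, PASSWORD, host=HOST, port=PORT) as conn:
    roi_service = conn.getRoiService()
    for image_id in IMAGE_IDS:
        result = roi_service.findByImage(image_id, None)
        shape_ids = [shape.id.val
                     for roi in result.rois
                     for shape in roi.copyShapes()]
        if not shape_ids:
            continue
        # Intensity statistics restricted to Z=0, T=0, channel 0
        stats = roi_service.getShapeStatsRestricted(shape_ids, 0, 0, [0])
        for s in stats:
            # Assumed ShapeStats fields: shapeId and per-channel mean array
            print(f"Image {image_id}, shape {s.shapeId}: mean={s.mean[0]:.2f}")
```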
### Workflow 3: Create Analysis Script
1. Design analysis workflow
2. Use OMERO.scripts framework (`references/scripts.md`)
3. Access data through script parameters
4. Process images in batch
5. Generate outputs (new images, tables, files); a skeletal script follows below
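A skeletal OMERO.script showing the overall shape of this workflow. The script name, parameter set, and per-image processing are placeholders, and the full framework is described in `references/scripts.md`:
```python
import omero.scripts as scripts
from omero.gateway import BlitzGateway
from omero.rtypes import rlong, rstring

# Declare the script and the parameters shown in the OMERO client UI
client = scripts.client(
    'Batch_Example.py',                      # placeholder script name
    'Placeholder batch-processing script',
    scripts.String("Data_Type", optional=False, grouping="1",
                   values=[rstring('Dataset')], default='Dataset'),
    scripts.List("IDs", optional=False, grouping="2").ofType(rlong(0)),
)

try:
    conn = BlitzGateway(client_obj=client)
    params = client.getInputs(unwrap=True)
    count = 0
    for dataset in conn.getObjects("Dataset", params["IDs"]):
        for image in dataset.listChildren():
            # Placeholder per-image processing goes here
            count += 1
    client.setOutput("Message", rstring(f"Processed {count} images"))
finally:
    client.closeSession()
```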
## Error Handling
Always wrap OMERO operations in try-except blocks and ensure connections are properly closed:
```python
from omero.gateway import BlitzGateway
import traceback

conn = None  # ensure the name exists even if construction fails
try:
    conn = BlitzGateway(username, password, host=host, port=port)
    if not conn.connect():
        raise Exception("Connection failed")
    # Perform operations
except Exception as e:
    print(f"Error: {e}")
    traceback.print_exc()
finally:
    if conn:
        conn.close()
```
## Additional Resources
- **Official Documentation**: https://omero.readthedocs.io/en/stable/developers/Python.html
- **BlitzGateway API**: https://omero.readthedocs.io/en/stable/developers/Python.html#omero-blitzgateway
- **OMERO Model**: https://omero.readthedocs.io/en/stable/developers/Model.html
- **Community Forum**: https://forum.image.sc/tag/omero
## Notes
- OMERO uses group-based permissions (READ-ONLY, READ-ANNOTATE, READ-WRITE)
- Images in OMERO are organized hierarchically: Project > Dataset > Image
- Screening data uses: Screen > Plate > Well > WellSample > Image
- Always close connections to free server resources
- Use context managers for automatic resource management
- Pixel data is returned as NumPy arrays for analysis

View File

@@ -0,0 +1,631 @@
# Advanced Features
This reference covers advanced OMERO operations including permissions, deletion, filesets, and administrative tasks.
## Deleting Objects
### Delete with Wait
```python
# Delete objects and wait for completion
project_ids = [1, 2, 3]
conn.deleteObjects("Project", project_ids, wait=True)
print("Deletion complete")
# Delete without waiting (asynchronous)
conn.deleteObjects("Dataset", [dataset_id], wait=False)
```
### Delete with Callback Monitoring
```python
import omero
from omero.callbacks import CmdCallbackI
# Start delete operation
handle = conn.deleteObjects("Project", [project_id])
# Create callback to monitor progress
cb = CmdCallbackI(conn.c, handle)
print("Deleting, please wait...")
# Poll for completion
while not cb.block(500): # Check every 500ms
print(".", end="", flush=True)
print("\nDeletion finished")
# Check for errors
response = cb.getResponse()
if isinstance(response, omero.cmd.ERR):
print("Error occurred:")
print(response)
else:
print("Deletion successful")
# Clean up
cb.close(True) # Also closes handle
```
### Delete Different Object Types
```python
# Delete images
image_ids = [101, 102, 103]
conn.deleteObjects("Image", image_ids, wait=True)
# Delete datasets
dataset_ids = [10, 11]
conn.deleteObjects("Dataset", dataset_ids, wait=True)
# Delete ROIs
roi_ids = [201, 202]
conn.deleteObjects("Roi", roi_ids, wait=True)
# Delete annotations
annotation_ids = [301, 302]
conn.deleteObjects("Annotation", annotation_ids, wait=True)
```
### Delete with Cascade
```python
# Deleting a project will cascade to contained datasets
# This behavior depends on server configuration
project_id = 123
conn.deleteObjects("Project", [project_id], wait=True)
# Datasets and images may be deleted or orphaned
# depending on delete specifications
```
## Filesets
Filesets represent collections of original imported files. They were introduced in OMERO 5.0.
### Check if Image Has Fileset
```python
image = conn.getObject("Image", image_id)
fileset = image.getFileset()
if fileset:
print(f"Image is part of fileset {fileset.getId()}")
else:
print("Image has no fileset (pre-OMERO 5.0)")
```
### Access Fileset Information
```python
image = conn.getObject("Image", image_id)
fileset = image.getFileset()
if fileset:
fs_id = fileset.getId()
print(f"Fileset ID: {fs_id}")
# List all images in this fileset
print("Images in fileset:")
for fs_image in fileset.copyImages():
print(f" {fs_image.getId()}: {fs_image.getName()}")
# List original imported files
print("\nOriginal files:")
for orig_file in fileset.listFiles():
print(f" {orig_file.getPath()}/{orig_file.getName()}")
print(f" Size: {orig_file.getSize()} bytes")
```
### Get Fileset Directly
```python
# Get fileset object
fileset = conn.getObject("Fileset", fileset_id)
if fileset:
# Access images
for image in fileset.copyImages():
print(f"Image: {image.getName()}")
# Access files
for orig_file in fileset.listFiles():
print(f"File: {orig_file.getName()}")
```
### Download Original Files
```python
import os
fileset = image.getFileset()
if fileset:
download_dir = "./original_files"
os.makedirs(download_dir, exist_ok=True)
for orig_file in fileset.listFiles():
file_name = orig_file.getName()
file_path = os.path.join(download_dir, file_name)
print(f"Downloading: {file_name}")
# Get file as RawFileStore
raw_file_store = conn.createRawFileStore()
raw_file_store.setFileId(orig_file.getId())
# Download in chunks
with open(file_path, 'wb') as f:
offset = 0
chunk_size = 1024 * 1024 # 1MB chunks
size = orig_file.getSize()
while offset < size:
chunk = raw_file_store.read(offset, chunk_size)
f.write(chunk)
offset += len(chunk)
raw_file_store.close()
print(f"Saved to: {file_path}")
```
## Group Permissions
OMERO uses group-based permissions to control data access.
### Permission Levels
- **PRIVATE** (`rw----`): Only owner can read/write
- **READ-ONLY** (`rwr---`): Group members can read, only owner can write
- **READ-ANNOTATE** (`rwra--`): Group members can read and annotate
- **READ-WRITE** (`rwrw--`): Group members can read and write
### Check Current Group Permissions
```python
# Get current group
group = conn.getGroupFromContext()
# Get permissions
permissions = group.getDetails().getPermissions()
perm_string = str(permissions)
# Map to readable names
permission_names = {
'rw----': 'PRIVATE',
'rwr---': 'READ-ONLY',
'rwra--': 'READ-ANNOTATE',
'rwrw--': 'READ-WRITE'
}
perm_name = permission_names.get(perm_string, 'UNKNOWN')
print(f"Group: {group.getName()}")
print(f"Permissions: {perm_name} ({perm_string})")
```
### List User's Groups
```python
# Get all groups for current user
print("User's groups:")
for group in conn.getGroupsMemberOf():
print(f" {group.getName()} (ID: {group.getId()})")
# Get group permissions
perms = group.getDetails().getPermissions()
print(f" Permissions: {perms}")
```
### Get Group Members
```python
# Get group
group = conn.getObject("ExperimenterGroup", group_id)
# List members
print(f"Members of {group.getName()}:")
for member in group.getMembers():
print(f" {member.getFullName()} ({member.getOmeName()})")
```
## Cross-Group Queries
### Query Across All Groups
```python
# Set context to query all accessible groups
conn.SERVICE_OPTS.setOmeroGroup('-1')
# Now queries span all groups
image = conn.getObject("Image", image_id)
if image:
group = image.getDetails().getGroup()
print(f"Image found in group: {group.getName()}")
# List projects across all groups
for project in conn.getObjects("Project"):
group = project.getDetails().getGroup()
print(f"Project: {project.getName()} (Group: {group.getName()})")
```
### Switch to Specific Group
```python
# Get image's group
image = conn.getObject("Image", image_id)
group_id = image.getDetails().getGroup().getId()
# Switch to that group's context
conn.SERVICE_OPTS.setOmeroGroup(group_id)
# Subsequent operations use this group
projects = conn.listProjects() # Only from this group
```
### Reset to Default Group
```python
# Get default group
default_group_id = conn.getEventContext().groupId
# Switch back to default
conn.SERVICE_OPTS.setOmeroGroup(default_group_id)
```
## Administrative Operations
### Check Admin Status
```python
# Check if current user is admin
if conn.isAdmin():
print("User has admin privileges")
# Check if full admin
if conn.isFullAdmin():
print("User is full administrator")
else:
# Check specific privileges
privileges = conn.getCurrentAdminPrivileges()
print(f"Admin privileges: {privileges}")
```
### List Administrators
```python
# Get all administrators
print("Administrators:")
for admin in conn.getAdministrators():
print(f" ID: {admin.getId()}")
print(f" Username: {admin.getOmeName()}")
print(f" Full Name: {admin.getFullName()}")
```
### Set Object Owner (Admin Only)
```python
import omero.gateway
import omero.model
# Create annotation with specific owner (requires admin)
tag_ann = omero.gateway.TagAnnotationWrapper(conn)
tag_ann.setValue("Admin-created tag")
# Set owner
user_id = 5
tag_ann._obj.details.owner = omero.model.ExperimenterI(user_id, False)
tag_ann.save()
print(f"Created annotation owned by user {user_id}")
```
### Substitute User Connection (Admin Only)
```python
# Connect as admin
admin_conn = BlitzGateway(admin_user, admin_pass, host=host, port=4064)
admin_conn.connect()
# Get target user
target_user_id = 10
user = admin_conn.getObject("Experimenter", target_user_id)
username = user.getOmeName()
# Create connection as that user
user_conn = admin_conn.suConn(username)
print(f"Connected as {username}")
# Perform operations as that user
for project in user_conn.listProjects():
print(f" {project.getName()}")
# Close connections
user_conn.close()
admin_conn.close()
```
### List All Users
```python
# Get all users (admin operation)
print("All users:")
for user in conn.getObjects("Experimenter"):
print(f" ID: {user.getId()}")
print(f" Username: {user.getOmeName()}")
print(f" Full Name: {user.getFullName()}")
print(f" Email: {user.getEmail()}")
print()
```
## Service Access
OMERO provides various services for specific operations.
### Update Service
```python
# Get update service
updateService = conn.getUpdateService()
# Save and return object
roi = omero.model.RoiI()
roi.setImage(image._obj)
saved_roi = updateService.saveAndReturnObject(roi)
# Save multiple objects
objects = [obj1, obj2, obj3]
saved_objects = updateService.saveAndReturnArray(objects)
```
### ROI Service
```python
# Get ROI service
roi_service = conn.getRoiService()
# Find ROIs for image
result = roi_service.findByImage(image_id, None)
# Get shape statistics
shape_ids = [shape.id.val for roi in result.rois
for shape in roi.copyShapes()]
stats = roi_service.getShapeStatsRestricted(shape_ids, 0, 0, [0])
```
### Metadata Service
```python
# Get metadata service
metadataService = conn.getMetadataService()
# Load annotations by type and namespace
ns_to_include = ["mylab.analysis"]
ns_to_exclude = []
annotations = metadataService.loadSpecifiedAnnotations(
'omero.model.FileAnnotation',
ns_to_include,
ns_to_exclude,
None
)
for ann in annotations:
print(f"Annotation: {ann.getId().getValue()}")
```
### Query Service
```python
# Get query service
queryService = conn.getQueryService()
# Build query (more complex queries)
params = omero.sys.ParametersI()
params.addLong("image_id", image_id)
query = "select i from Image i where i.id = :image_id"
image = queryService.findByQuery(query, params)
```
### Thumbnail Service
```python
# Get thumbnail service
thumbnailService = conn.createThumbnailStore()
# Set current image
thumbnailService.setPixelsId(image.getPrimaryPixels().getId())
# Get thumbnail
thumbnail = thumbnailService.getThumbnail(96, 96)
# Close service
thumbnailService.close()
```
### Raw File Store
```python
# Get raw file store
rawFileStore = conn.createRawFileStore()
# Set file ID
rawFileStore.setFileId(orig_file_id)
# Read file
data = rawFileStore.read(0, rawFileStore.size())
# Close
rawFileStore.close()
```
## Object Ownership and Details
### Get Object Details
```python
image = conn.getObject("Image", image_id)
# Get details
details = image.getDetails()
# Owner information
owner = details.getOwner()
print(f"Owner ID: {owner.getId()}")
print(f"Username: {owner.getOmeName()}")
print(f"Full Name: {owner.getFullName()}")
# Group information
group = details.getGroup()
print(f"Group: {group.getName()} (ID: {group.getId()})")
# Creation information
creation_event = details.getCreationEvent()
print(f"Created: {creation_event.getTime()}")
# Update information
update_event = details.getUpdateEvent()
print(f"Updated: {update_event.getTime()}")
```
### Get Permissions
```python
# Get object permissions
details = image.getDetails()
permissions = details.getPermissions()
# Check specific permissions
can_edit = permissions.canEdit()
can_annotate = permissions.canAnnotate()
can_link = permissions.canLink()
can_delete = permissions.canDelete()
print(f"Can edit: {can_edit}")
print(f"Can annotate: {can_annotate}")
print(f"Can link: {can_link}")
print(f"Can delete: {can_delete}")
```
## Event Context
### Get Current Event Context
```python
# Get event context (current session info)
ctx = conn.getEventContext()
print(f"User ID: {ctx.userId}")
print(f"Username: {ctx.userName}")
print(f"Group ID: {ctx.groupId}")
print(f"Group Name: {ctx.groupName}")
print(f"Session ID: {ctx.sessionId}")
print(f"Is Admin: {ctx.isAdmin}")
```
## Complete Admin Example
```python
from omero.gateway import BlitzGateway
# Connect as admin
ADMIN_USER = 'root'
ADMIN_PASS = 'password'
HOST = 'omero.example.com'
PORT = 4064
with BlitzGateway(ADMIN_USER, ADMIN_PASS, host=HOST, port=PORT) as admin_conn:
print("=== Administrator Operations ===\n")
# List all users
print("All Users:")
for user in admin_conn.getObjects("Experimenter"):
print(f" {user.getOmeName()}: {user.getFullName()}")
# List all groups
print("\nAll Groups:")
for group in admin_conn.getObjects("ExperimenterGroup"):
perms = group.getDetails().getPermissions()
print(f" {group.getName()}: {perms}")
# List members
for member in group.getMembers():
print(f" - {member.getOmeName()}")
# Query across all groups
print("\nAll Projects (all groups):")
admin_conn.SERVICE_OPTS.setOmeroGroup('-1')
for project in admin_conn.getObjects("Project"):
owner = project.getDetails().getOwner()
group = project.getDetails().getGroup()
print(f" {project.getName()}")
print(f" Owner: {owner.getOmeName()}")
print(f" Group: {group.getName()}")
# Connect as another user
target_user_id = 5
user = admin_conn.getObject("Experimenter", target_user_id)
if user:
print(f"\n=== Operating as {user.getOmeName()} ===\n")
user_conn = admin_conn.suConn(user.getOmeName())
# List that user's projects
for project in user_conn.listProjects():
print(f" {project.getName()}")
user_conn.close()
```
## Best Practices
1. **Permissions**: Always check permissions before operations
2. **Group Context**: Set appropriate group context for queries
3. **Admin Operations**: Use admin privileges sparingly and carefully
4. **Delete Confirmation**: Always confirm before deleting objects
5. **Callback Monitoring**: Monitor long delete operations with callbacks
6. **Fileset Awareness**: Check for filesets when working with images
7. **Service Cleanup**: Close services when done (thumbnailStore, rawFileStore)
8. **Cross-Group Queries**: Use `-1` group ID for cross-group access
9. **Error Handling**: Always handle permission and access errors
10. **Documentation**: Document administrative operations clearly
## Troubleshooting
### Permission Denied
```python
try:
conn.deleteObjects("Project", [project_id], wait=True)
except Exception as e:
if "SecurityViolation" in str(e):
print("Permission denied: You don't own this object")
else:
raise
```
### Object Not Found
```python
# Check if object exists before accessing
obj = conn.getObject("Image", image_id)
if obj is None:
print(f"Image {image_id} not found or not accessible")
else:
# Process object
pass
```
### Group Context Issues
```python
# If object not found, try cross-group query
conn.SERVICE_OPTS.setOmeroGroup('-1')
obj = conn.getObject("Image", image_id)
if obj:
# Switch to object's group for further operations
group_id = obj.getDetails().getGroup().getId()
conn.SERVICE_OPTS.setOmeroGroup(group_id)
```

View File

@@ -0,0 +1,369 @@
# Connection & Session Management
This reference covers establishing and managing connections to OMERO servers using BlitzGateway.
## Basic Connection
### Standard Connection Pattern
```python
from omero.gateway import BlitzGateway
# Create connection
conn = BlitzGateway(username, password, host=host, port=4064)
# Connect to server
if conn.connect():
print("Connected successfully")
# Perform operations
conn.close()
else:
print("Failed to connect")
```
### Connection Parameters
- **username** (str): OMERO user account name
- **password** (str): User password
- **host** (str): OMERO server hostname or IP address
- **port** (int): Server port (default: 4064)
- **secure** (bool): Force encrypted connection (default: False)
### Secure Connection
To ensure all data transfers are encrypted:
```python
conn = BlitzGateway(username, password, host=host, port=4064, secure=True)
conn.connect()
```
## Context Manager Pattern (Recommended)
Use context managers for automatic connection management and cleanup:
```python
from omero.gateway import BlitzGateway
with BlitzGateway(username, password, host=host, port=4064) as conn:
# Connection automatically established
for project in conn.getObjects('Project'):
print(project.getName())
# Connection automatically closed on exit
```
**Benefits:**
- Automatic `connect()` call
- Automatic `close()` call on exit
- Exception-safe resource cleanup
- Cleaner code
## Session Management
### Connection from Existing Client
Create BlitzGateway from an existing `omero.client` session:
```python
import omero.clients
from omero.gateway import BlitzGateway
# Create client and session
client = omero.client(host, port)
session = client.createSession(username, password)
# Create BlitzGateway from existing client
conn = BlitzGateway(client_obj=client)
# Use connection
# ...
# Close when done
conn.close()
```
### Retrieve Session Information
```python
# Get current user information
user = conn.getUser()
print(f"User ID: {user.getId()}")
print(f"Username: {user.getName()}")
print(f"Full Name: {user.getFullName()}")
print(f"Is Admin: {conn.isAdmin()}")
# Get current group
group = conn.getGroupFromContext()
print(f"Current Group: {group.getName()}")
print(f"Group ID: {group.getId()}")
```
### Check Admin Privileges
```python
if conn.isAdmin():
print("User has admin privileges")
if conn.isFullAdmin():
print("User is full administrator")
else:
# Check specific admin privileges
privileges = conn.getCurrentAdminPrivileges()
print(f"Admin privileges: {privileges}")
```
## Group Context Management
OMERO uses groups to manage data access permissions. Users can belong to multiple groups.
### Get Current Group Context
```python
# Get the current group context
group = conn.getGroupFromContext()
print(f"Current group: {group.getName()}")
print(f"Group ID: {group.getId()}")
```
### Query Across All Groups
Use group ID `-1` to query across all accessible groups:
```python
# Set context to query all groups
conn.SERVICE_OPTS.setOmeroGroup('-1')
# Now queries span all accessible groups
image = conn.getObject("Image", image_id)
projects = conn.listProjects()
```
### Switch to Specific Group
Switch context to work within a specific group:
```python
# Get group ID from an object
image = conn.getObject("Image", image_id)
group_id = image.getDetails().getGroup().getId()
# Switch to that group's context
conn.SERVICE_OPTS.setOmeroGroup(group_id)
# Subsequent operations use this group context
projects = conn.listProjects()
```
### List Available Groups
```python
# Get all groups for current user
for group in conn.getGroupsMemberOf():
print(f"Group: {group.getName()} (ID: {group.getId()})")
```
## Advanced Connection Features
### Substitute User Connection (Admin Only)
Administrators can create connections acting as other users:
```python
# Connect as admin
admin_conn = BlitzGateway(admin_user, admin_pass, host=host, port=4064)
admin_conn.connect()
# Get target user
target_user = admin_conn.getObject("Experimenter", user_id).getName()
# Create connection as that user
user_conn = admin_conn.suConn(target_user)
# Operations performed as target user
for project in user_conn.listProjects():
print(project.getName())
# Close substitute connection
user_conn.close()
admin_conn.close()
```
### List Administrators
```python
# Get all administrators
for admin in conn.getAdministrators():
print(f"ID: {admin.getId()}, Name: {admin.getFullName()}, "
f"Username: {admin.getOmeName()}")
```
## Connection Lifecycle
### Closing Connections
Always close connections to free server resources:
```python
conn = None  # ensure the name exists even if construction fails
try:
    conn = BlitzGateway(username, password, host=host, port=4064)
    conn.connect()
    # Perform operations
except Exception as e:
    print(f"Error: {e}")
finally:
    if conn:
        conn.close()
```
### Check Connection Status
```python
if conn.isConnected():
print("Connection is active")
else:
print("Connection is closed")
```
## Error Handling
### Robust Connection Pattern
```python
from omero.gateway import BlitzGateway
import traceback
def connect_to_omero(username, password, host, port=4064):
"""
Establish connection to OMERO server with error handling.
Returns:
BlitzGateway connection object or None if failed
"""
try:
conn = BlitzGateway(username, password, host=host, port=port, secure=True)
if conn.connect():
print(f"Connected to {host}:{port} as {username}")
return conn
else:
print("Failed to establish connection")
return None
except Exception as e:
print(f"Connection error: {e}")
traceback.print_exc()
return None
# Usage
conn = connect_to_omero(username, password, host)
if conn:
try:
# Perform operations
pass
finally:
conn.close()
```
## Common Connection Patterns
### Pattern 1: Simple Script
```python
from omero.gateway import BlitzGateway
# Connection parameters
HOST = 'omero.example.com'
PORT = 4064
USERNAME = 'user'
PASSWORD = 'pass'
# Connect
with BlitzGateway(USERNAME, PASSWORD, host=HOST, port=PORT) as conn:
print(f"Connected as {conn.getUser().getName()}")
# Perform operations
```
### Pattern 2: Configuration-Based Connection
```python
import yaml
from omero.gateway import BlitzGateway
# Load configuration
with open('omero_config.yaml', 'r') as f:
config = yaml.safe_load(f)
# Connect using config
with BlitzGateway(
config['username'],
config['password'],
host=config['host'],
port=config.get('port', 4064),
secure=config.get('secure', True)
) as conn:
# Perform operations
pass
```
### Pattern 3: Environment Variables
```python
import os
from omero.gateway import BlitzGateway
# Get credentials from environment
USERNAME = os.environ.get('OMERO_USER')
PASSWORD = os.environ.get('OMERO_PASSWORD')
HOST = os.environ.get('OMERO_HOST', 'localhost')
PORT = int(os.environ.get('OMERO_PORT', 4064))
# Connect
with BlitzGateway(USERNAME, PASSWORD, host=HOST, port=PORT) as conn:
# Perform operations
pass
```
## Best Practices
1. **Use Context Managers**: Always prefer context managers for automatic cleanup
2. **Secure Connections**: Use `secure=True` for production environments
3. **Error Handling**: Wrap connection code in try-except blocks
4. **Close Connections**: Always close connections when done
5. **Group Context**: Set appropriate group context before queries
6. **Credential Security**: Never hardcode credentials; use environment variables or config files
7. **Connection Pooling**: For web applications, implement connection pooling
8. **Timeouts**: Consider implementing connection timeouts for long-running operations
## Troubleshooting
### Connection Refused
```
Unable to contact ORB
```
**Solutions:**
- Verify host and port are correct
- Check firewall settings
- Ensure OMERO server is running
- Verify network connectivity
### Authentication Failed
```
Cannot connect to server
```
**Solutions:**
- Verify username and password
- Check user account is active
- Verify group membership
- Check server logs for details
### Session Timeout
**Solutions:**
- Increase session timeout on server
- Implement session keepalive (see the sketch below)
- Reconnect on timeout
- Use connection pools for long-running applications
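For long-running work, a periodic keepalive between batches can prevent the server from expiring the session. A minimal sketch, assuming the `keepAlive()` helper on `BlitzGateway` (confirm it is available in your omero-py version); the connection parameters and batch loop are placeholders:
```python
import time
from omero.gateway import BlitzGateway

HOST, PORT = 'omero.example.com', 4064      # placeholder server details
USERNAME, PASSWORD = 'user', 'pass'         # placeholder credentials

conn = BlitzGateway(USERNAME, PASSWORD, host=HOST, port=PORT, secure=True)
if not conn.connect():
    raise RuntimeError("Failed to connect")
try:
    for batch in range(10):                  # placeholder long-running work
        # ... process a batch of images ...
        time.sleep(60)
        # Ping the session between batches (assumed keepAlive() helper)
        if not conn.keepAlive():
            raise RuntimeError("OMERO session expired; reconnect required")
finally:
    conn.close()
```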

View File

@@ -0,0 +1,544 @@
# Data Access & Retrieval
This reference covers navigating OMERO's hierarchical data structure and retrieving objects.
## OMERO Data Hierarchy
### Standard Hierarchy
```
Project
└─ Dataset
└─ Image
```
### Screening Hierarchy
```
Screen
└─ Plate
└─ Well
└─ WellSample
└─ Image
```
## Listing Objects
### List Projects
```python
# List all projects for current user
for project in conn.listProjects():
print(f"Project: {project.getName()} (ID: {project.getId()})")
```
### List Projects with Filtering
```python
# Get current user and group
my_exp_id = conn.getUser().getId()
default_group_id = conn.getEventContext().groupId
# List projects with filters
for project in conn.getObjects("Project", opts={
'owner': my_exp_id, # Filter by owner
'group': default_group_id, # Filter by group
'order_by': 'lower(obj.name)', # Sort alphabetically
'limit': 10, # Limit results
'offset': 0 # Pagination offset
}):
print(f"Project: {project.getName()}")
```
### List Datasets
```python
# List all datasets
for dataset in conn.getObjects("Dataset"):
print(f"Dataset: {dataset.getName()} (ID: {dataset.getId()})")
# List orphaned datasets (not in any project)
for dataset in conn.getObjects("Dataset", opts={'orphaned': True}):
print(f"Orphaned Dataset: {dataset.getName()}")
```
### List Images
```python
# List all images
for image in conn.getObjects("Image"):
print(f"Image: {image.getName()} (ID: {image.getId()})")
# List images in specific dataset
dataset_id = 123
for image in conn.getObjects("Image", opts={'dataset': dataset_id}):
print(f"Image: {image.getName()}")
# List orphaned images
for image in conn.getObjects("Image", opts={'orphaned': True}):
print(f"Orphaned Image: {image.getName()}")
```
## Retrieving Objects by ID
### Get Single Object
```python
# Get project by ID
project = conn.getObject("Project", project_id)
if project:
print(f"Project: {project.getName()}")
else:
print("Project not found")
# Get dataset by ID
dataset = conn.getObject("Dataset", dataset_id)
# Get image by ID
image = conn.getObject("Image", image_id)
```
### Get Multiple Objects by ID
```python
# Get multiple projects at once
project_ids = [1, 2, 3, 4, 5]
projects = conn.getObjects("Project", project_ids)
for project in projects:
print(f"Project: {project.getName()}")
```
### Supported Object Types
The `getObject()` and `getObjects()` methods support:
- `"Project"`
- `"Dataset"`
- `"Image"`
- `"Screen"`
- `"Plate"`
- `"Well"`
- `"Roi"`
- `"Annotation"` (and specific types: `"TagAnnotation"`, `"FileAnnotation"`, etc.)
- `"Experimenter"`
- `"ExperimenterGroup"`
- `"Fileset"`
## Query by Attributes
### Query Objects by Name
```python
# Find images with specific name
images = conn.getObjects("Image", attributes={"name": "sample_001.tif"})
for image in images:
print(f"Found image: {image.getName()} (ID: {image.getId()})")
# Find datasets with specific name
datasets = conn.getObjects("Dataset", attributes={"name": "Control Group"})
```
### Query Annotations by Value
```python
# Find tags with specific text value
tags = conn.getObjects("TagAnnotation",
attributes={"textValue": "experiment_tag"})
for tag in tags:
print(f"Tag: {tag.getValue()}")
# Find map annotations
map_anns = conn.getObjects("MapAnnotation",
attributes={"ns": "custom.namespace"})
```
## Navigating Hierarchies
### Navigate Down (Parent to Children)
```python
# Project → Datasets → Images
project = conn.getObject("Project", project_id)
for dataset in project.listChildren():
print(f"Dataset: {dataset.getName()}")
for image in dataset.listChildren():
print(f" Image: {image.getName()}")
```
### Navigate Up (Child to Parent)
```python
# Image → Dataset → Project
image = conn.getObject("Image", image_id)
# Get parent dataset
dataset = image.getParent()
if dataset:
print(f"Dataset: {dataset.getName()}")
# Get parent project
project = dataset.getParent()
if project:
print(f"Project: {project.getName()}")
```
### Complete Hierarchy Traversal
```python
# Traverse complete project hierarchy
for project in conn.getObjects("Project", opts={'order_by': 'lower(obj.name)'}):
print(f"Project: {project.getName()} (ID: {project.getId()})")
for dataset in project.listChildren():
image_count = dataset.countChildren()
print(f" Dataset: {dataset.getName()} ({image_count} images)")
for image in dataset.listChildren():
print(f" Image: {image.getName()}")
print(f" Size: {image.getSizeX()} x {image.getSizeY()}")
print(f" Channels: {image.getSizeC()}")
```
## Screening Data Access
### List Screens and Plates
```python
# List all screens
for screen in conn.getObjects("Screen"):
print(f"Screen: {screen.getName()} (ID: {screen.getId()})")
# List plates in screen
for plate in screen.listChildren():
print(f" Plate: {plate.getName()} (ID: {plate.getId()})")
```
### Access Plate Wells
```python
# Get plate
plate = conn.getObject("Plate", plate_id)
# Plate metadata
print(f"Plate: {plate.getName()}")
print(f"Grid size: {plate.getGridSize()}") # e.g., (8, 12) for 96-well
print(f"Number of fields: {plate.getNumberOfFields()}")
# Iterate through wells
for well in plate.listChildren():
print(f"Well at row {well.row}, column {well.column}")
# Count images in well (fields)
field_count = well.countWellSample()
print(f" Number of fields: {field_count}")
# Access images in well
for index in range(field_count):
image = well.getImage(index)
print(f" Field {index}: {image.getName()}")
```
### Direct Well Access
```python
# Get specific well by row and column
well = plate.getWell(row=0, column=0) # Top-left well
# Get image from well
if well.countWellSample() > 0:
image = well.getImage(0) # First field
print(f"Image: {image.getName()}")
```
### Well Sample Access
```python
# Access well samples directly
for well in plate.listChildren():
for ws in well.listChildren(): # ws = WellSample
image = ws.getImage()
print(f"WellSample {ws.getId()}: {image.getName()}")
```
## Image Properties
### Basic Dimensions
```python
image = conn.getObject("Image", image_id)
# Pixel dimensions
print(f"X: {image.getSizeX()}")
print(f"Y: {image.getSizeY()}")
print(f"Z: {image.getSizeZ()} (Z-sections)")
print(f"C: {image.getSizeC()} (Channels)")
print(f"T: {image.getSizeT()} (Time points)")
# Image type
print(f"Type: {image.getPixelsType()}") # e.g., 'uint16', 'uint8'
```
### Physical Dimensions
```python
# Get pixel sizes with units (OMERO 5.1.0+)
size_x_obj = image.getPixelSizeX(units=True)
size_y_obj = image.getPixelSizeY(units=True)
size_z_obj = image.getPixelSizeZ(units=True)
print(f"Pixel Size X: {size_x_obj.getValue()} {size_x_obj.getSymbol()}")
print(f"Pixel Size Y: {size_y_obj.getValue()} {size_y_obj.getSymbol()}")
print(f"Pixel Size Z: {size_z_obj.getValue()} {size_z_obj.getSymbol()}")
# Get as floats (micrometers)
size_x = image.getPixelSizeX() # Returns float in µm
size_y = image.getPixelSizeY()
size_z = image.getPixelSizeZ()
```
### Channel Information
```python
# Iterate through channels
for channel in image.getChannels():
print(f"Channel {channel.getLabel()}:")
print(f" Color: {channel.getColor().getRGB()}")
print(f" Lookup Table: {channel.getLut()}")
print(f" Wavelength: {channel.getEmissionWave()}")
```
### Image Metadata
```python
# Acquisition date
acquired = image.getAcquisitionDate()
if acquired:
print(f"Acquired: {acquired}")
# Description
description = image.getDescription()
if description:
print(f"Description: {description}")
# Owner and group
details = image.getDetails()
print(f"Owner: {details.getOwner().getFullName()}")
print(f"Username: {details.getOwner().getOmeName()}")
print(f"Group: {details.getGroup().getName()}")
print(f"Created: {details.getCreationEvent().getTime()}")
```
## Object Ownership and Permissions
### Get Owner Information
```python
# Get object owner
obj = conn.getObject("Dataset", dataset_id)
owner = obj.getDetails().getOwner()
print(f"Owner ID: {owner.getId()}")
print(f"Username: {owner.getOmeName()}")
print(f"Full Name: {owner.getFullName()}")
print(f"Email: {owner.getEmail()}")
```
### Get Group Information
```python
# Get object's group
obj = conn.getObject("Image", image_id)
group = obj.getDetails().getGroup()
print(f"Group: {group.getName()} (ID: {group.getId()})")
```
### Filter by Owner
```python
# Get objects for specific user
user_id = 5
datasets = conn.getObjects("Dataset", opts={'owner': user_id})
for dataset in datasets:
print(f"Dataset: {dataset.getName()}")
```
## Advanced Queries
### Pagination
```python
# Paginate through large result sets
page_size = 50
offset = 0
while True:
images = list(conn.getObjects("Image", opts={
'limit': page_size,
'offset': offset,
'order_by': 'obj.id'
}))
if not images:
break
for image in images:
print(f"Image: {image.getName()}")
offset += page_size
```
### Sorting Results
```python
# Sort by name (case-insensitive)
projects = conn.getObjects("Project", opts={
'order_by': 'lower(obj.name)'
})
# Sort by ID (ascending)
datasets = conn.getObjects("Dataset", opts={
'order_by': 'obj.id'
})
# Sort by name (descending)
images = conn.getObjects("Image", opts={
'order_by': 'lower(obj.name) desc'
})
```
### Combining Filters
```python
# Complex query with multiple filters
my_exp_id = conn.getUser().getId()
default_group_id = conn.getEventContext().groupId
images = conn.getObjects("Image", opts={
'owner': my_exp_id,
'group': default_group_id,
'dataset': dataset_id,
'order_by': 'lower(obj.name)',
'limit': 100,
'offset': 0
})
```
## Counting Objects
### Count Children
```python
# Count images in dataset
dataset = conn.getObject("Dataset", dataset_id)
image_count = dataset.countChildren()
print(f"Dataset contains {image_count} images")
# Count datasets in project
project = conn.getObject("Project", project_id)
dataset_count = project.countChildren()
print(f"Project contains {dataset_count} datasets")
```
### Count Annotations
```python
# Count annotations on object
image = conn.getObject("Image", image_id)
annotation_count = image.countAnnotations()
print(f"Image has {annotation_count} annotations")
```
## Orphaned Objects
### Find Orphaned Datasets
```python
# Datasets not linked to any project
orphaned_datasets = conn.getObjects("Dataset", opts={'orphaned': True})
print("Orphaned Datasets:")
for dataset in orphaned_datasets:
print(f" {dataset.getName()} (ID: {dataset.getId()})")
print(f" Owner: {dataset.getDetails().getOwner().getOmeName()}")
print(f" Images: {dataset.countChildren()}")
```
### Find Orphaned Images
```python
# Images not in any dataset
orphaned_images = conn.getObjects("Image", opts={'orphaned': True})
print("Orphaned Images:")
for image in orphaned_images:
print(f" {image.getName()} (ID: {image.getId()})")
```
### Find Orphaned Plates
```python
# Plates not in any screen
orphaned_plates = conn.getObjects("Plate", opts={'orphaned': True})
for plate in orphaned_plates:
print(f"Orphaned Plate: {plate.getName()}")
```
## Complete Example
```python
from omero.gateway import BlitzGateway
# Connection details
HOST = 'omero.example.com'
PORT = 4064
USERNAME = 'user'
PASSWORD = 'pass'
# Connect and query data
with BlitzGateway(USERNAME, PASSWORD, host=HOST, port=PORT) as conn:
# Get user context
user = conn.getUser()
group = conn.getGroupFromContext()
print(f"Connected as {user.getName()} in group {group.getName()}")
print()
# List projects with datasets and images
for project in conn.getObjects("Project", opts={'limit': 5}):
print(f"Project: {project.getName()} (ID: {project.getId()})")
for dataset in project.listChildren():
image_count = dataset.countChildren()
print(f" Dataset: {dataset.getName()} ({image_count} images)")
# Show first 3 images
for idx, image in enumerate(dataset.listChildren()):
if idx >= 3:
print(f" ... and {image_count - 3} more")
break
print(f" Image: {image.getName()}")
print(f" Size: {image.getSizeX()}x{image.getSizeY()}")
print(f" Channels: {image.getSizeC()}, Z: {image.getSizeZ()}")
print()
```
## Best Practices
1. **Use Context Managers**: Always use `with` statements for automatic connection cleanup
2. **Limit Results**: Use `limit` and `offset` for large datasets
3. **Filter Early**: Apply filters to reduce data transfer
4. **Check for None**: Always check if `getObject()` returns None before using
5. **Efficient Traversal**: Use `listChildren()` instead of querying separately
6. **Count Before Loading**: Use `countChildren()` to decide whether to load data
7. **Group Context**: Set appropriate group context before cross-group queries
8. **Pagination**: Implement pagination for large result sets
9. **Object Reuse**: Cache frequently accessed objects to reduce queries
10. **Error Handling**: Wrap queries in try-except blocks for robustness

View File

@@ -0,0 +1,665 @@
# Image Processing & Rendering
This reference covers accessing raw pixel data, image rendering, and creating new images in OMERO.
## Accessing Raw Pixel Data
### Get Single Plane
```python
# Get image
image = conn.getObject("Image", image_id)
# Get dimensions
size_z = image.getSizeZ()
size_c = image.getSizeC()
size_t = image.getSizeT()
# Get pixels object
pixels = image.getPrimaryPixels()
# Get single plane (returns NumPy array)
z, c, t = 0, 0, 0 # First Z-section, channel, and timepoint
plane = pixels.getPlane(z, c, t)
print(f"Shape: {plane.shape}")
print(f"Data type: {plane.dtype.name}")
print(f"Min: {plane.min()}, Max: {plane.max()}")
```
### Get Multiple Planes
```python
import numpy as np
# Get Z-stack for specific channel and timepoint
pixels = image.getPrimaryPixels()
c, t = 0, 0 # First channel and timepoint
# Build list of (z, c, t) coordinates
zct_list = [(z, c, t) for z in range(size_z)]
# Get all planes at once
planes = pixels.getPlanes(zct_list)
# Stack into 3D array
z_stack = np.array([p for p in planes])
print(f"Z-stack shape: {z_stack.shape}")
```
### Get Hypercube (Subset of 5D Data)
```python
# Get subset of 5D data (Z, C, T)
zct_list = []
for z in range(size_z // 2, size_z): # Second half of Z
for c in range(size_c): # All channels
for t in range(size_t): # All timepoints
zct_list.append((z, c, t))
# Get planes
planes = pixels.getPlanes(zct_list)
# Process each plane
for i, plane in enumerate(planes):
z, c, t = zct_list[i]
print(f"Plane Z={z}, C={c}, T={t}: Min={plane.min()}, Max={plane.max()}")
```
### Get Tile (Region of Interest)
```python
# Define tile coordinates
x, y = 50, 50 # Top-left corner
width, height = 100, 100 # Tile size
tile = (x, y, width, height)
# Get tile for specific Z, C, T
z, c, t = 0, 0, 0
zct_list = [(z, c, t, tile)]
tiles = pixels.getTiles(zct_list)
tile_data = tiles[0]
print(f"Tile shape: {tile_data.shape}") # Should be (height, width)
```
### Get Multiple Tiles
```python
# Get tiles from Z-stack
c, t = 0, 0
tile = (50, 50, 100, 100) # x, y, width, height
# Build list with tiles
zct_list = [(z, c, t, tile) for z in range(size_z)]
tiles = pixels.getTiles(zct_list)
for i, tile_data in enumerate(tiles):
print(f"Tile Z={i}: {tile_data.shape}, Min={tile_data.min()}")
```
## Image Histograms
### Get Histogram
```python
# Get histogram for first channel
channel_index = 0
num_bins = 256
z, t = 0, 0
histogram = image.getHistogram([channel_index], num_bins, False, z, t)
print(f"Histogram bins: {len(histogram)}")
print(f"First 10 bins: {histogram[:10]}")
```
### Multi-Channel Histogram
```python
# Get histograms for all channels
channels = list(range(image.getSizeC()))
histograms = image.getHistogram(channels, 256, False, 0, 0)
for c, hist in enumerate(histograms):
print(f"Channel {c}: Total pixels = {sum(hist)}")
```
## Image Rendering
### Render Image with Current Settings
```python
from PIL import Image
from io import BytesIO
# Get image
image = conn.getObject("Image", image_id)
# Render at specific Z and T
z = image.getSizeZ() // 2 # Middle Z-section
t = 0
rendered_image = image.renderImage(z, t)
# rendered_image is a PIL Image object
rendered_image.save("rendered_image.jpg")
```
### Get Thumbnail
```python
from PIL import Image
from io import BytesIO
# Get thumbnail (uses current rendering settings)
thumbnail_data = image.getThumbnail()
# Convert to PIL Image
thumbnail = Image.open(BytesIO(thumbnail_data))
thumbnail.save("thumbnail.jpg")
# Get specific thumbnail size
thumbnail_data = image.getThumbnail(size=(96, 96))
thumbnail = Image.open(BytesIO(thumbnail_data))
```
## Rendering Settings
### View Current Settings
```python
# Display rendering settings
print("Current Rendering Settings:")
print(f"Grayscale mode: {image.isGreyscaleRenderingModel()}")
print(f"Default Z: {image.getDefaultZ()}")
print(f"Default T: {image.getDefaultT()}")
print()
# Channel settings
print("Channel Settings:")
for idx, channel in enumerate(image.getChannels()):
print(f"Channel {idx + 1}:")
print(f" Label: {channel.getLabel()}")
print(f" Color: {channel.getColor().getHtml()}")
print(f" Active: {channel.isActive()}")
print(f" Window: {channel.getWindowStart()} - {channel.getWindowEnd()}")
print(f" Min/Max: {channel.getWindowMin()} - {channel.getWindowMax()}")
```
### Set Rendering Model
```python
# Switch to grayscale rendering
image.setGreyscaleRenderingModel()
# Switch to color rendering
image.setColorRenderingModel()
```
### Set Active Channels
```python
# Activate specific channels (1-indexed)
image.setActiveChannels([1, 3]) # Channels 1 and 3 only
# Activate all channels
all_channels = list(range(1, image.getSizeC() + 1))
image.setActiveChannels(all_channels)
# Activate single channel
image.setActiveChannels([2])
```
### Set Channel Colors
```python
# Set channel colors (hex format)
channels = [1, 2, 3]
colors = ['FF0000', '00FF00', '0000FF'] # Red, Green, Blue
image.setActiveChannels(channels, colors=colors)
# Use None to keep existing color
colors = ['FF0000', None, '0000FF'] # Keep channel 2's color
image.setActiveChannels(channels, colors=colors)
```
### Set Channel Window (Intensity Range)
```python
# Set intensity windows for channels
channels = [1, 2]
windows = [
[100.0, 500.0], # Channel 1: 100-500
[50.0, 300.0] # Channel 2: 50-300
]
image.setActiveChannels(channels, windows=windows)
# Use None to keep existing window
windows = [[100.0, 500.0], [None, None]]
image.setActiveChannels(channels, windows=windows)
```
### Set Default Z and T
```python
# Set default Z-section and timepoint
image.setDefaultZ(5)
image.setDefaultT(0)
# Render using defaults
rendered_image = image.renderImage(z=None, t=None)
rendered_image.save("default_rendering.jpg")
```
## Render Individual Channels
### Render Each Channel Separately
```python
# Set grayscale mode
image.setGreyscaleRenderingModel()
z = image.getSizeZ() // 2
t = 0
# Render each channel
for c in range(1, image.getSizeC() + 1):
image.setActiveChannels([c])
rendered = image.renderImage(z, t)
rendered.save(f"channel_{c}.jpg")
```
### Render Multi-Channel Composites
```python
# Color composite of first 3 channels
image.setColorRenderingModel()
channels = [1, 2, 3]
colors = ['FF0000', '00FF00', '0000FF'] # RGB
image.setActiveChannels(channels, colors=colors)
rendered = image.renderImage(z, t)
rendered.save("rgb_composite.jpg")
```
## Image Projections
### Maximum Intensity Projection
```python
# Set projection type
image.setProjection('intmax')
# Render (projects across all Z)
z, t = 0, 0 # Z is ignored for projections
rendered = image.renderImage(z, t)
rendered.save("max_projection.jpg")
# Reset to normal rendering
image.setProjection('normal')
```
### Mean Intensity Projection
```python
image.setProjection('intmean')
rendered = image.renderImage(z, t)
rendered.save("mean_projection.jpg")
image.setProjection('normal')
```
### Available Projection Types
- `'normal'`: No projection (default)
- `'intmax'`: Maximum intensity projection
- `'intmean'`: Mean intensity projection
- `'intsum'`: Sum intensity projection
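Because the projection type applies to subsequent `renderImage()` calls, a quick way to compare projections is to loop over them. A minimal sketch, reusing `image`, `z`, and `t` from the examples above:
```python
# Render and save each projection type for comparison
for proj in ['intmax', 'intmean']:
    image.setProjection(proj)
    image.renderImage(z, t).save(f"projection_{proj}.jpg")
# Restore normal (non-projected) rendering
image.setProjection('normal')
```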
## Save and Reset Rendering Settings
### Save Current Settings as Default
```python
# Modify rendering settings
image.setActiveChannels([1, 2])
image.setDefaultZ(5)
# Save as new default
image.saveDefaults()
```
### Reset to Import Settings
```python
# Reset to original import settings
image.resetDefaults(save=True)
```
## Create Images from NumPy Arrays
### Create Simple Image
```python
import numpy as np
# Create sample data
size_x, size_y = 512, 512
size_z, size_c, size_t = 10, 2, 1
# Generate planes
def plane_generator():
"""Generator that yields planes"""
for z in range(size_z):
for c in range(size_c):
for t in range(size_t):
# Create synthetic data
plane = np.random.randint(0, 255, (size_y, size_x), dtype=np.uint8)
yield plane
# Create image
image = conn.createImageFromNumpySeq(
plane_generator(),
"Test Image",
size_z, size_c, size_t,
description="Image created from NumPy arrays",
dataset=None
)
print(f"Created image ID: {image.getId()}")
```
### Create Image from Hard-Coded Arrays
```python
from numpy import array, int8
# Define dimensions
size_x, size_y = 5, 4
size_z, size_c, size_t = 1, 2, 1
# Create planes
plane1 = array(
[[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9],
[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9]],
dtype=int8
)
plane2 = array(
[[5, 6, 7, 8, 9],
[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9],
[0, 1, 2, 3, 4]],
dtype=int8
)
planes = [plane1, plane2]
def plane_gen():
for p in planes:
yield p
# Create image
desc = "Image created from hard-coded arrays"
image = conn.createImageFromNumpySeq(
plane_gen(),
"numpy_image",
size_z, size_c, size_t,
description=desc,
dataset=None
)
print(f"Created image: {image.getName()} (ID: {image.getId()})")
```
### Create Image in Dataset
```python
# Get target dataset
dataset = conn.getObject("Dataset", dataset_id)
# Create image
image = conn.createImageFromNumpySeq(
plane_generator(),
"New Analysis Result",
size_z, size_c, size_t,
description="Result from analysis pipeline",
dataset=dataset # Add to dataset
)
```
### Create Derived Image
```python
# Get source image
source = conn.getObject("Image", source_image_id)
size_z = source.getSizeZ()
size_c = source.getSizeC()
size_t = source.getSizeT()
dataset = source.getParent()
pixels = source.getPrimaryPixels()
new_size_c = 1 # Average channels
def plane_gen():
"""Average channels together"""
for z in range(size_z):
for c in range(new_size_c):
for t in range(size_t):
# Get multiple channels
channel0 = pixels.getPlane(z, 0, t)
channel1 = pixels.getPlane(z, 1, t)
# Combine
new_plane = (channel0.astype(float) + channel1.astype(float)) / 2
new_plane = new_plane.astype(channel0.dtype)
yield new_plane
# Create new image
desc = "Averaged channels from source image"
derived = conn.createImageFromNumpySeq(
plane_gen(),
f"{source.getName()}_averaged",
size_z, new_size_c, size_t,
description=desc,
dataset=dataset
)
print(f"Created derived image: {derived.getId()}")
```
## Set Physical Dimensions
### Set Pixel Sizes with Units
```python
from omero.model.enums import UnitsLength
import omero.model
# Get image
image = conn.getObject("Image", image_id)
# Create unit objects
pixel_size_x = omero.model.LengthI(0.325, UnitsLength.MICROMETER)
pixel_size_y = omero.model.LengthI(0.325, UnitsLength.MICROMETER)
pixel_size_z = omero.model.LengthI(1.0, UnitsLength.MICROMETER)
# Get pixels object
pixels = image.getPrimaryPixels()._obj
# Set physical sizes
pixels.setPhysicalSizeX(pixel_size_x)
pixels.setPhysicalSizeY(pixel_size_y)
pixels.setPhysicalSizeZ(pixel_size_z)
# Save changes
conn.getUpdateService().saveObject(pixels)
print("Updated pixel dimensions")
```
### Available Length Units
From `omero.model.enums.UnitsLength`:
- `ANGSTROM`
- `NANOMETER`
- `MICROMETER`
- `MILLIMETER`
- `CENTIMETER`
- `METER`
- `PIXEL`
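To verify what was stored, the image wrapper reports pixel sizes in micrometers (or `None` if unset). A quick check, assuming `image_id` refers to the image updated above:
```python
# Read back pixel sizes via the wrapper (values in micrometers, or None)
image = conn.getObject("Image", image_id)
print(f"Pixel size X: {image.getPixelSizeX()} um")
print(f"Pixel size Y: {image.getPixelSizeY()} um")
print(f"Pixel size Z: {image.getPixelSizeZ()} um")
```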
### Set Pixel Size on New Image
```python
from omero.model.enums import UnitsLength
import omero.model
# Create image
image = conn.createImageFromNumpySeq(
plane_generator(),
"New Image with Dimensions",
size_z, size_c, size_t
)
# Set pixel sizes
pixel_size = omero.model.LengthI(0.5, UnitsLength.MICROMETER)
pixels = image.getPrimaryPixels()._obj
pixels.setPhysicalSizeX(pixel_size)
pixels.setPhysicalSizeY(pixel_size)
z_size = omero.model.LengthI(2.0, UnitsLength.MICROMETER)
pixels.setPhysicalSizeZ(z_size)
conn.getUpdateService().saveObject(pixels)
```
## Complete Example: Image Processing Pipeline
```python
from omero.gateway import BlitzGateway
import numpy as np
HOST = 'omero.example.com'
PORT = 4064
USERNAME = 'user'
PASSWORD = 'pass'
with BlitzGateway(USERNAME, PASSWORD, host=HOST, port=PORT) as conn:
# Get source image
source = conn.getObject("Image", source_image_id)
print(f"Processing: {source.getName()}")
# Get dimensions
size_x = source.getSizeX()
size_y = source.getSizeY()
size_z = source.getSizeZ()
size_c = source.getSizeC()
size_t = source.getSizeT()
pixels = source.getPrimaryPixels()
# Process: Maximum intensity projection over Z
def plane_gen():
for c in range(size_c):
for t in range(size_t):
# Get all Z planes for this C, T
z_stack = []
for z in range(size_z):
plane = pixels.getPlane(z, c, t)
z_stack.append(plane)
# Maximum projection
max_proj = np.max(z_stack, axis=0)
yield max_proj
# Create result image (single Z-section)
result = conn.createImageFromNumpySeq(
plane_gen(),
f"{source.getName()}_MIP",
1, size_c, size_t, # Z=1 for projection
description="Maximum intensity projection",
dataset=source.getParent()
)
print(f"Created MIP image: {result.getId()}")
# Copy pixel sizes (X and Y only, no Z for projection)
from omero.model.enums import UnitsLength
import omero.model
source_pixels = source.getPrimaryPixels()._obj
result_pixels = result.getPrimaryPixels()._obj
result_pixels.setPhysicalSizeX(source_pixels.getPhysicalSizeX())
result_pixels.setPhysicalSizeY(source_pixels.getPhysicalSizeY())
conn.getUpdateService().saveObject(result_pixels)
print("Processing complete")
```
## Working with Different Data Types
### Handle Various Pixel Types
```python
# Get pixel type
pixel_type = image.getPixelsType()
print(f"Pixel type: {pixel_type}")
# Common types: uint8, uint16, uint32, int8, int16, int32, float, double
# Get plane with correct dtype
plane = pixels.getPlane(z, c, t)
print(f"NumPy dtype: {plane.dtype}")
# Convert if needed for processing
if plane.dtype == np.uint16:
# Convert to float for processing
plane_float = plane.astype(np.float32)
# Process...
# Convert back
result = plane_float.astype(np.uint16)
```
### Handle Large Images
```python
# Process large images in tiles to save memory
tile_size = 512
size_x = image.getSizeX()
size_y = image.getSizeY()
pixels = image.getPrimaryPixels()
z, c, t = 0, 0, 0  # plane to tile
for y in range(0, size_y, tile_size):
    for x in range(0, size_x, tile_size):
        # Clip tile dimensions at the image edges
        w = min(tile_size, size_x - x)
        h = min(tile_size, size_y - y)
        tile = (x, y, w, h)
        # getTiles returns a generator; take the single requested tile
        zct_list = [(z, c, t, tile)]
        tile_data = next(pixels.getTiles(zct_list))
        # Process tile
        # ...
```
## Best Practices
1. **Use Generators**: For creating images, use generators to avoid loading all data in memory
2. **Specify Data Types**: Match NumPy dtypes to OMERO pixel types
3. **Set Physical Dimensions**: Always set pixel sizes for new images
4. **Tile Large Images**: Process large images in tiles to manage memory
5. **Close Connections**: Always close connections when done
6. **Rendering Efficiency**: Cache rendering settings when rendering multiple images
7. **Channel Indexing**: Remember channels are 1-indexed for rendering, 0-indexed for pixel access
8. **Save Settings**: Save rendering settings if they should be permanent
9. **Compression**: Use the compression parameter in renderImage() for faster transfers (see the sketch below)
10. **Error Handling**: Check for None returns and handle exceptions
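A minimal sketch of point 9, assuming `image`, `z`, and `t` from the rendering examples above (the 0.5 quality value is only an illustration):
```python
# Lower JPEG quality for faster transfer of preview renderings
preview = image.renderImage(z, t, compression=0.5)
preview.save("preview.jpg")
```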

View File

@@ -0,0 +1,688 @@
# Metadata & Annotations
This reference covers creating and managing annotations in OMERO, including tags, key-value pairs, file attachments, and comments.
## Annotation Types
OMERO supports several annotation types:
- **TagAnnotation**: Text labels for categorization
- **MapAnnotation**: Key-value pairs for structured metadata
- **FileAnnotation**: File attachments (PDFs, CSVs, analysis results, etc.)
- **CommentAnnotation**: Free-text comments
- **LongAnnotation**: Integer values
- **DoubleAnnotation**: Floating-point values
- **BooleanAnnotation**: Boolean values
- **TimestampAnnotation**: Date/time stamps
- **TermAnnotation**: Ontology terms
## Tag Annotations
### Create and Link Tag
```python
import omero.gateway
# Create new tag
tag_ann = omero.gateway.TagAnnotationWrapper(conn)
tag_ann.setValue("Experiment 2024")
tag_ann.setDescription("Optional description of this tag")
tag_ann.save()
# Link tag to an object
project = conn.getObject("Project", project_id)
project.linkAnnotation(tag_ann)
```
### Create Tag with Namespace
```python
# Create tag with custom namespace
tag_ann = omero.gateway.TagAnnotationWrapper(conn)
tag_ann.setValue("Quality Control")
tag_ann.setNs("mylab.qc.tags")
tag_ann.save()
# Link to image
image = conn.getObject("Image", image_id)
image.linkAnnotation(tag_ann)
```
### Reuse Existing Tag
```python
# Find existing tag
tag_id = 123
tag_ann = conn.getObject("TagAnnotation", tag_id)
# Link to multiple images
for image in conn.getObjects("Image", [img1, img2, img3]):
image.linkAnnotation(tag_ann)
```
### Create Tag Set (Tag with Children)
```python
# Create tag set (parent tag)
tag_set = omero.gateway.TagAnnotationWrapper(conn)
tag_set.setValue("Cell Types")
tag_set.save()
# Create child tags
tags = ["HeLa", "U2OS", "HEK293"]
for tag_value in tags:
tag = omero.gateway.TagAnnotationWrapper(conn)
tag.setValue(tag_value)
tag.save()
# Link child to parent
tag_set.linkAnnotation(tag)
```
## Map Annotations (Key-Value Pairs)
### Create Map Annotation
```python
import omero.gateway
import omero.constants.metadata
# Prepare key-value data
key_value_data = [
["Drug Name", "Monastrol"],
["Concentration", "5 mg/ml"],
["Treatment Time", "24 hours"],
["Temperature", "37C"]
]
# Create map annotation
map_ann = omero.gateway.MapAnnotationWrapper(conn)
# Use standard client namespace
namespace = omero.constants.metadata.NSCLIENTMAPANNOTATION
map_ann.setNs(namespace)
# Set data
map_ann.setValue(key_value_data)
map_ann.save()
# Link to dataset
dataset = conn.getObject("Dataset", dataset_id)
dataset.linkAnnotation(map_ann)
```
### Custom Namespace for Map Annotations
```python
# Use custom namespace for organization-specific metadata
key_value_data = [
["Microscope", "Zeiss LSM 880"],
["Objective", "63x Oil"],
["Laser Power", "10%"]
]
map_ann = omero.gateway.MapAnnotationWrapper(conn)
map_ann.setNs("mylab.microscopy.settings")
map_ann.setValue(key_value_data)
map_ann.save()
image = conn.getObject("Image", image_id)
image.linkAnnotation(map_ann)
```
### Read Map Annotation
```python
# Get map annotation
image = conn.getObject("Image", image_id)
for ann in image.listAnnotations():
if isinstance(ann, omero.gateway.MapAnnotationWrapper):
print(f"Map Annotation (ID: {ann.getId()}):")
print(f"Namespace: {ann.getNs()}")
# Get key-value pairs
for key, value in ann.getValue():
print(f" {key}: {value}")
```
## File Annotations
### Upload and Attach File
```python
import os
# Prepare file
file_path = "analysis_results.csv"
# Create file annotation
namespace = "mylab.analysis.results"
file_ann = conn.createFileAnnfromLocalFile(
file_path,
mimetype="text/csv",
ns=namespace,
desc="Cell segmentation results"
)
# Link to dataset
dataset = conn.getObject("Dataset", dataset_id)
dataset.linkAnnotation(file_ann)
```
### Supported MIME Types
Common MIME types:
- Text: `"text/plain"`, `"text/csv"`, `"text/tab-separated-values"`
- Documents: `"application/pdf"`, `"application/vnd.ms-excel"`
- Images: `"image/png"`, `"image/jpeg"`
- Data: `"application/json"`, `"application/xml"`
- Archives: `"application/zip"`, `"application/gzip"`
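If the MIME type is not known ahead of time, Python's standard `mimetypes` module can guess it from the file extension. A small sketch (the file path and namespace are illustrative):
```python
import mimetypes

file_path = "analysis_results.csv"
mime, _ = mimetypes.guess_type(file_path)
file_ann = conn.createFileAnnfromLocalFile(
    file_path,
    mimetype=mime or "application/octet-stream",
    ns="mylab.analysis.results",
    desc="Uploaded with guessed MIME type"
)
```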
### Upload Multiple Files
```python
files = ["figure1.pdf", "figure2.pdf", "table1.csv"]
namespace = "publication.supplementary"
dataset = conn.getObject("Dataset", dataset_id)
for file_path in files:
file_ann = conn.createFileAnnfromLocalFile(
file_path,
mimetype="application/octet-stream",
ns=namespace,
desc=f"Supplementary file: {os.path.basename(file_path)}"
)
dataset.linkAnnotation(file_ann)
```
### Download File Annotation
```python
import os
# Get object with file annotation
image = conn.getObject("Image", image_id)
# Download directory
download_path = "./downloads"
os.makedirs(download_path, exist_ok=True)
# Filter by namespace
namespace = "mylab.analysis.results"
for ann in image.listAnnotations(ns=namespace):
if isinstance(ann, omero.gateway.FileAnnotationWrapper):
file_name = ann.getFile().getName()
file_path = os.path.join(download_path, file_name)
print(f"Downloading: {file_name}")
# Download file in chunks
with open(file_path, 'wb') as f:
for chunk in ann.getFileInChunks():
f.write(chunk)
print(f"Saved to: {file_path}")
```
### Get File Annotation Metadata
```python
for ann in dataset.listAnnotations():
if isinstance(ann, omero.gateway.FileAnnotationWrapper):
orig_file = ann.getFile()
print(f"File Annotation ID: {ann.getId()}")
print(f" File Name: {orig_file.getName()}")
print(f" File Size: {orig_file.getSize()} bytes")
print(f" MIME Type: {orig_file.getMimetype()}")
print(f" Namespace: {ann.getNs()}")
print(f" Description: {ann.getDescription()}")
```
## Comment Annotations
### Add Comment
```python
# Create comment
comment = omero.gateway.CommentAnnotationWrapper(conn)
comment.setValue("This image shows excellent staining quality")
comment.save()
# Link to image
image = conn.getObject("Image", image_id)
image.linkAnnotation(comment)
```
### Add Comment with Namespace
```python
comment = omero.gateway.CommentAnnotationWrapper(conn)
comment.setValue("Approved for publication")
comment.setNs("mylab.publication.status")
comment.save()
dataset = conn.getObject("Dataset", dataset_id)
dataset.linkAnnotation(comment)
```
## Numeric and Boolean Annotations
### Long Annotation (Integer)
```python
# Create long annotation
long_ann = omero.gateway.LongAnnotationWrapper(conn)
long_ann.setValue(42)
long_ann.setNs("mylab.cell.count")
long_ann.save()
image = conn.getObject("Image", image_id)
image.linkAnnotation(long_ann)
```
### Double Annotation (Float)
```python
# Create double annotation
double_ann = omero.gateway.DoubleAnnotationWrapper(conn)
double_ann.setValue(3.14159)
double_ann.setNs("mylab.fluorescence.intensity")
double_ann.save()
image = conn.getObject("Image", image_id)
image.linkAnnotation(double_ann)
```
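### Boolean Annotation
Boolean annotations follow the same wrapper pattern. A minimal sketch (the namespace is illustrative):
```python
# Create boolean annotation, e.g. a QC pass/fail flag
bool_ann = omero.gateway.BooleanAnnotationWrapper(conn)
bool_ann.setValue(True)
bool_ann.setNs("mylab.qc.passed")
bool_ann.save()
image = conn.getObject("Image", image_id)
image.linkAnnotation(bool_ann)
```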
## Listing Annotations
### List All Annotations on Object
```python
import omero.model
# Get object
project = conn.getObject("Project", project_id)
# List all annotations
for ann in project.listAnnotations():
print(f"Annotation ID: {ann.getId()}")
print(f" Type: {ann.OMERO_TYPE}")
print(f" Added by: {ann.link.getDetails().getOwner().getOmeName()}")
# Type-specific handling
if ann.OMERO_TYPE == omero.model.TagAnnotationI:
print(f" Tag value: {ann.getTextValue()}")
elif isinstance(ann, omero.gateway.MapAnnotationWrapper):
print(f" Map data: {ann.getValue()}")
elif isinstance(ann, omero.gateway.FileAnnotationWrapper):
print(f" File: {ann.getFile().getName()}")
elif isinstance(ann, omero.gateway.CommentAnnotationWrapper):
print(f" Comment: {ann.getValue()}")
print()
```
### Filter Annotations by Namespace
```python
# Get annotations with specific namespace
namespace = "mylab.qc.tags"
for ann in image.listAnnotations(ns=namespace):
print(f"Found annotation: {ann.getId()}")
if isinstance(ann, omero.gateway.MapAnnotationWrapper):
for key, value in ann.getValue():
print(f" {key}: {value}")
```
### Get First Annotation with Namespace
```python
# Get single annotation by namespace
namespace = "mylab.analysis.results"
ann = dataset.getAnnotation(namespace)
if ann:
print(f"Found annotation with namespace: {ann.getNs()}")
else:
print("No annotation found with that namespace")
```
### Query Annotations Across Multiple Objects
```python
# Get all tag annotations linked to image IDs
image_ids = [1, 2, 3, 4, 5]
for link in conn.getAnnotationLinks('Image', parent_ids=image_ids):
ann = link.getChild()
if isinstance(ann._obj, omero.model.TagAnnotationI):
print(f"Image {link.getParent().getId()}: Tag '{ann.getTextValue()}'")
```
## Counting Annotations
```python
# Count annotations on project
project_id = 123
count = conn.countAnnotations('Project', [project_id])
print(f"Project has {count[project_id]} annotations")
# Count annotations on multiple images
image_ids = [1, 2, 3]
counts = conn.countAnnotations('Image', image_ids)
for image_id, count in counts.items():
print(f"Image {image_id}: {count} annotations")
```
## Annotation Links
### Create Annotation Link Manually
```python
# Get annotation and image
tag = conn.getObject("TagAnnotation", tag_id)
image = conn.getObject("Image", image_id)
# Create link
link = omero.model.ImageAnnotationLinkI()
link.setParent(omero.model.ImageI(image.getId(), False))
link.setChild(omero.model.TagAnnotationI(tag.getId(), False))
# Save link
conn.getUpdateService().saveAndReturnObject(link)
```
### Update Annotation Links
```python
# Get existing links
annotation_ids = [1, 2, 3]
new_tag_id = 5
for link in conn.getAnnotationLinks('Image', ann_ids=annotation_ids):
print(f"Image ID: {link.getParent().id}")
# Change linked annotation
link._obj.child = omero.model.TagAnnotationI(new_tag_id, False)
link.save()
```
## Removing Annotations
### Delete Annotations
```python
# Get image
image = conn.getObject("Image", image_id)
# Collect annotation IDs to delete
to_delete = []
namespace = "mylab.temp.annotations"
for ann in image.listAnnotations(ns=namespace):
to_delete.append(ann.getId())
# Delete annotations
if to_delete:
conn.deleteObjects('Annotation', to_delete, wait=True)
print(f"Deleted {len(to_delete)} annotations")
```
### Unlink Annotations (Keep Annotation, Remove Link)
```python
# Get image
image = conn.getObject("Image", image_id)
# Collect link IDs to delete
to_delete = []
for ann in image.listAnnotations():
if isinstance(ann, omero.gateway.TagAnnotationWrapper):
to_delete.append(ann.link.getId())
# Delete links (annotations remain in database)
if to_delete:
conn.deleteObjects("ImageAnnotationLink", to_delete, wait=True)
print(f"Unlinked {len(to_delete)} annotations")
```
### Delete Specific Annotation Types
```python
import omero.gateway
# Delete only map annotations
image = conn.getObject("Image", image_id)
to_delete = []
for ann in image.listAnnotations():
if isinstance(ann, omero.gateway.MapAnnotationWrapper):
to_delete.append(ann.getId())
conn.deleteObjects('Annotation', to_delete, wait=True)
```
## Annotation Ownership
### Set Annotation Owner (Admin Only)
```python
import omero.model
# Create tag with specific owner
tag_ann = omero.gateway.TagAnnotationWrapper(conn)
tag_ann.setValue("Admin Tag")
# Set owner (requires admin privileges)
user_id = 5
tag_ann._obj.details.owner = omero.model.ExperimenterI(user_id, False)
tag_ann.save()
```
### Create Annotation as Another User (Admin Only)
```python
# Admin connection
admin_conn = BlitzGateway(admin_user, admin_pass, host=host, port=4064)
admin_conn.connect()
# Get target user
user_id = 10
user = admin_conn.getObject("Experimenter", user_id).getName()
# Create connection as user
user_conn = admin_conn.suConn(user)
# Create annotation as that user
map_ann = omero.gateway.MapAnnotationWrapper(user_conn)
map_ann.setNs("mylab.metadata")
map_ann.setValue([["key", "value"]])
map_ann.save()
# Link to project
project = admin_conn.getObject("Project", project_id)
project.linkAnnotation(map_ann)
# Close connections
user_conn.close()
admin_conn.close()
```
## Bulk Annotation Operations
### Tag Multiple Images
```python
# Create or get tag
tag = omero.gateway.TagAnnotationWrapper(conn)
tag.setValue("Validated")
tag.save()
# Get images to tag
dataset = conn.getObject("Dataset", dataset_id)
# Tag all images in dataset
for image in dataset.listChildren():
image.linkAnnotation(tag)
print(f"Tagged image: {image.getName()}")
```
### Batch Add Map Annotations
```python
# Prepare metadata for multiple images
image_metadata = {
101: [["Quality", "Good"], ["Reviewed", "Yes"]],
102: [["Quality", "Excellent"], ["Reviewed", "Yes"]],
103: [["Quality", "Poor"], ["Reviewed", "No"]]
}
# Add annotations
for image_id, kv_data in image_metadata.items():
image = conn.getObject("Image", image_id)
if image:
map_ann = omero.gateway.MapAnnotationWrapper(conn)
map_ann.setNs("mylab.qc")
map_ann.setValue(kv_data)
map_ann.save()
image.linkAnnotation(map_ann)
print(f"Annotated image {image_id}")
```
## Namespaces
### Standard OMERO Namespaces
```python
import omero.constants.metadata as omero_ns
# Client map annotation namespace
omero_ns.NSCLIENTMAPANNOTATION
# "openmicroscopy.org/omero/client/mapAnnotation"
# Bulk annotations namespace
omero_ns.NSBULKANNOTATIONS
# "openmicroscopy.org/omero/bulk_annotations"
```
### Custom Namespaces
Best practices for custom namespaces:
- Use reverse domain notation: `"org.mylab.category.subcategory"`
- Be specific: `"com.company.project.analysis.v1"`
- Include version if schema may change: `"mylab.metadata.v2"`
```python
# Define namespaces
NS_QC = "org.mylab.quality_control"
NS_ANALYSIS = "org.mylab.image_analysis.v1"
NS_PUBLICATION = "org.mylab.publication.2024"
# Use in annotations
map_ann.setNs(NS_ANALYSIS)
```
## Load All Annotations by Type
### Load All File Annotations
```python
# Define namespaces to include/exclude
ns_to_include = ["mylab.analysis.results"]
ns_to_exclude = []
# Get metadata service
metadataService = conn.getMetadataService()
# Load all file annotations with namespace
annotations = metadataService.loadSpecifiedAnnotations(
'omero.model.FileAnnotation',
ns_to_include,
ns_to_exclude,
None
)
for ann in annotations:
print(f"File Annotation ID: {ann.getId().getValue()}")
print(f" File: {ann.getFile().getName().getValue()}")
print(f" Size: {ann.getFile().getSize().getValue()} bytes")
```
## Complete Example
```python
from omero.gateway import BlitzGateway
import omero.gateway
import omero.constants.metadata
HOST = 'omero.example.com'
PORT = 4064
USERNAME = 'user'
PASSWORD = 'pass'
with BlitzGateway(USERNAME, PASSWORD, host=HOST, port=PORT) as conn:
# Get dataset
dataset = conn.getObject("Dataset", dataset_id)
# Add tag
tag = omero.gateway.TagAnnotationWrapper(conn)
tag.setValue("Analysis Complete")
tag.save()
dataset.linkAnnotation(tag)
# Add map annotation with metadata
metadata = [
["Analysis Date", "2024-10-20"],
["Software", "CellProfiler 4.2"],
["Pipeline", "cell_segmentation_v3"]
]
map_ann = omero.gateway.MapAnnotationWrapper(conn)
map_ann.setNs(omero.constants.metadata.NSCLIENTMAPANNOTATION)
map_ann.setValue(metadata)
map_ann.save()
dataset.linkAnnotation(map_ann)
# Add file annotation
file_ann = conn.createFileAnnfromLocalFile(
"analysis_summary.pdf",
mimetype="application/pdf",
ns="mylab.reports",
desc="Analysis summary report"
)
dataset.linkAnnotation(file_ann)
# Add comment
comment = omero.gateway.CommentAnnotationWrapper(conn)
comment.setValue("Dataset ready for review")
comment.save()
dataset.linkAnnotation(comment)
print(f"Added 4 annotations to dataset {dataset.getName()}")
```
## Best Practices
1. **Use Namespaces**: Always use namespaces to organize annotations
2. **Descriptive Tags**: Use clear, consistent tag names
3. **Structured Metadata**: Prefer map annotations over comments for structured data
4. **File Organization**: Use descriptive filenames and MIME types
5. **Link Reuse**: Reuse existing tags instead of creating duplicates (see the sketch below)
6. **Batch Operations**: Process multiple objects in loops for efficiency
7. **Error Handling**: Check for successful saves before linking
8. **Cleanup**: Remove temporary annotations when no longer needed
9. **Documentation**: Document custom namespace meanings
10. **Permissions**: Consider annotation ownership for collaborative workflows
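A minimal sketch of point 5, looking up an existing tag by value before creating a new one (assumes the `attributes` filter on `textValue` supported by `getObjects`; the tag value is illustrative):
```python
# Reuse an existing tag with the same text value, or create one
tag_value = "Validated"
existing = next(iter(conn.getObjects("TagAnnotation",
                                     attributes={"textValue": tag_value})), None)
if existing is None:
    existing = omero.gateway.TagAnnotationWrapper(conn)
    existing.setValue(tag_value)
    existing.save()
image = conn.getObject("Image", image_id)
image.linkAnnotation(existing)
```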

View File

@@ -0,0 +1,648 @@
# Regions of Interest (ROIs)
This reference covers creating, retrieving, and analyzing ROIs in OMERO.
## ROI Overview
ROIs (Regions of Interest) in OMERO are containers for geometric shapes that mark specific regions on images. Each ROI can contain multiple shapes, and shapes can be specific to Z-sections and timepoints.
### Supported Shape Types
- **Rectangle**: Rectangular regions
- **Ellipse**: Circular and elliptical regions
- **Line**: Line segments
- **Point**: Single points
- **Polygon**: Multi-point polygons
- **Mask**: Pixel-based masks
- **Polyline**: Multi-segment lines (see the Polyline ROI example below)
## Creating ROIs
### Helper Functions
```python
from omero.rtypes import rdouble, rint, rstring
import omero.model
def create_roi(conn, image, shapes):
"""
Create an ROI and link it to shapes.
Args:
conn: BlitzGateway connection
image: Image object
shapes: List of shape objects
Returns:
Saved ROI object
"""
roi = omero.model.RoiI()
roi.setImage(image._obj)
for shape in shapes:
roi.addShape(shape)
updateService = conn.getUpdateService()
return updateService.saveAndReturnObject(roi)
def rgba_to_int(red, green, blue, alpha=255):
"""
Convert RGBA values (0-255) to integer encoding for OMERO.
Args:
red, green, blue, alpha: Color values (0-255)
Returns:
Integer color value
"""
return int.from_bytes([red, green, blue, alpha],
byteorder='big', signed=True)
```
### Rectangle ROI
```python
from omero.rtypes import rdouble, rint, rstring
import omero.model
# Get image
image = conn.getObject("Image", image_id)
# Define position and size
x, y = 50, 100
width, height = 200, 150
z, t = 0, 0 # Z-section and timepoint
# Create rectangle
rect = omero.model.RectangleI()
rect.x = rdouble(x)
rect.y = rdouble(y)
rect.width = rdouble(width)
rect.height = rdouble(height)
rect.theZ = rint(z)
rect.theT = rint(t)
# Set label and colors
rect.textValue = rstring("Cell Region")
rect.fillColor = rint(rgba_to_int(255, 0, 0, 50)) # Red, semi-transparent
rect.strokeColor = rint(rgba_to_int(255, 255, 0, 255)) # Yellow border
# Create ROI
roi = create_roi(conn, image, [rect])
print(f"Created ROI ID: {roi.getId().getValue()}")
```
### Ellipse ROI
```python
# Center position and radii
center_x, center_y = 250, 250
radius_x, radius_y = 100, 75
z, t = 0, 0
# Create ellipse
ellipse = omero.model.EllipseI()
ellipse.x = rdouble(center_x)
ellipse.y = rdouble(center_y)
ellipse.radiusX = rdouble(radius_x)
ellipse.radiusY = rdouble(radius_y)
ellipse.theZ = rint(z)
ellipse.theT = rint(t)
ellipse.textValue = rstring("Nucleus")
ellipse.fillColor = rint(rgba_to_int(0, 255, 0, 50))
# Create ROI
roi = create_roi(conn, image, [ellipse])
```
### Line ROI
```python
# Line endpoints
x1, y1 = 100, 100
x2, y2 = 300, 200
z, t = 0, 0
# Create line
line = omero.model.LineI()
line.x1 = rdouble(x1)
line.y1 = rdouble(y1)
line.x2 = rdouble(x2)
line.y2 = rdouble(y2)
line.theZ = rint(z)
line.theT = rint(t)
line.textValue = rstring("Measurement Line")
line.strokeColor = rint(rgba_to_int(0, 0, 255, 255))
# Create ROI
roi = create_roi(conn, image, [line])
```
### Point ROI
```python
# Point position
x, y = 150, 150
z, t = 0, 0
# Create point
point = omero.model.PointI()
point.x = rdouble(x)
point.y = rdouble(y)
point.theZ = rint(z)
point.theT = rint(t)
point.textValue = rstring("Feature Point")
# Create ROI
roi = create_roi(conn, image, [point])
```
### Polygon ROI
```python
from omero.model.enums import UnitsLength
# Define vertices as string "x1,y1 x2,y2 x3,y3 ..."
vertices = "10,20 50,150 200,200 250,75"
z, t = 0, 0
# Create polygon
polygon = omero.model.PolygonI()
polygon.points = rstring(vertices)
polygon.theZ = rint(z)
polygon.theT = rint(t)
polygon.textValue = rstring("Cell Outline")
# Set colors and stroke width
polygon.fillColor = rint(rgba_to_int(255, 0, 255, 50))
polygon.strokeColor = rint(rgba_to_int(255, 255, 0, 255))
polygon.strokeWidth = omero.model.LengthI(2, UnitsLength.PIXEL)
# Create ROI
roi = create_roi(conn, image, [polygon])
```
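### Polyline ROI
A polyline (an open, multi-segment line) uses the same space-separated points format as a polygon. A minimal sketch, reusing the helper functions above:
```python
# Open multi-segment line defined by "x1,y1 x2,y2 ..." points
polyline = omero.model.PolylineI()
polyline.points = rstring("20,20 80,60 150,40 220,120")
polyline.theZ = rint(0)
polyline.theT = rint(0)
polyline.textValue = rstring("Neurite Trace")
polyline.strokeColor = rint(rgba_to_int(0, 255, 255, 255))
# Create ROI
roi = create_roi(conn, image, [polyline])
```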
### Mask ROI
```python
import numpy as np
import struct
import math
def create_mask_bytes(mask_array, bytes_per_pixel=1):
"""
Convert binary mask array to bit-packed bytes for OMERO.
Args:
mask_array: Binary numpy array (0s and 1s)
bytes_per_pixel: 1 or 2
Returns:
Byte array for OMERO mask
"""
if bytes_per_pixel == 2:
divider = 16.0
format_string = "H"
byte_factor = 0.5
elif bytes_per_pixel == 1:
divider = 8.0
format_string = "B"
byte_factor = 1
else:
raise ValueError("bytes_per_pixel must be 1 or 2")
mask_bytes = mask_array.astype(np.uint8).tobytes()
steps = math.ceil(len(mask_bytes) / divider)
packed_mask = []
for i in range(int(steps)):
binary = mask_bytes[i * int(divider):
i * int(divider) + int(divider)]
format_str = str(int(byte_factor * len(binary))) + format_string
binary = struct.unpack(format_str, binary)
s = "".join(str(bit) for bit in binary)
packed_mask.append(int(s, 2))
return bytearray(packed_mask)
# Create binary mask (1s and 0s)
mask_w, mask_h = 100, 100
mask_array = np.fromfunction(
lambda x, y: ((x - 50)**2 + (y - 50)**2) < 40**2, # Circle
(mask_w, mask_h)
)
# Pack mask
mask_packed = create_mask_bytes(mask_array, bytes_per_pixel=1)
# Mask position
mask_x, mask_y = 50, 50
z, t, c = 0, 0, 0
# Create mask
mask = omero.model.MaskI()
mask.setX(rdouble(mask_x))
mask.setY(rdouble(mask_y))
mask.setWidth(rdouble(mask_w))
mask.setHeight(rdouble(mask_h))
mask.setTheZ(rint(z))
mask.setTheT(rint(t))
mask.setTheC(rint(c))
mask.setBytes(mask_packed)
mask.textValue = rstring("Segmentation Mask")
# Set color
from omero.gateway import ColorHolder
mask_color = ColorHolder()
mask_color.setRed(255)
mask_color.setGreen(0)
mask_color.setBlue(0)
mask_color.setAlpha(100)
mask.setFillColor(rint(mask_color.getInt()))
# Create ROI
roi = create_roi(conn, image, [mask])
```
## Multiple Shapes in One ROI
```python
# Create multiple shapes for the same ROI
shapes = []
# Rectangle
rect = omero.model.RectangleI()
rect.x = rdouble(100)
rect.y = rdouble(100)
rect.width = rdouble(50)
rect.height = rdouble(50)
rect.theZ = rint(0)
rect.theT = rint(0)
shapes.append(rect)
# Ellipse
ellipse = omero.model.EllipseI()
ellipse.x = rdouble(125)
ellipse.y = rdouble(125)
ellipse.radiusX = rdouble(20)
ellipse.radiusY = rdouble(20)
ellipse.theZ = rint(0)
ellipse.theT = rint(0)
shapes.append(ellipse)
# Create single ROI with both shapes
roi = create_roi(conn, image, shapes)
```
## Retrieving ROIs
### Get All ROIs for Image
```python
# Get ROI service
roi_service = conn.getRoiService()
# Find all ROIs for image
result = roi_service.findByImage(image_id, None)
print(f"Found {len(result.rois)} ROIs")
for roi in result.rois:
print(f"ROI ID: {roi.getId().getValue()}")
print(f" Number of shapes: {len(roi.copyShapes())}")
```
### Parse ROI Shapes
```python
import omero.model
result = roi_service.findByImage(image_id, None)
for roi in result.rois:
roi_id = roi.getId().getValue()
print(f"ROI ID: {roi_id}")
for shape in roi.copyShapes():
shape_id = shape.getId().getValue()
z = shape.getTheZ().getValue() if shape.getTheZ() else None
t = shape.getTheT().getValue() if shape.getTheT() else None
# Get label
label = ""
if shape.getTextValue():
label = shape.getTextValue().getValue()
print(f" Shape ID: {shape_id}, Z: {z}, T: {t}, Label: {label}")
# Type-specific parsing
if isinstance(shape, omero.model.RectangleI):
x = shape.getX().getValue()
y = shape.getY().getValue()
width = shape.getWidth().getValue()
height = shape.getHeight().getValue()
print(f" Rectangle: ({x}, {y}) {width}x{height}")
elif isinstance(shape, omero.model.EllipseI):
x = shape.getX().getValue()
y = shape.getY().getValue()
rx = shape.getRadiusX().getValue()
ry = shape.getRadiusY().getValue()
print(f" Ellipse: center ({x}, {y}), radii ({rx}, {ry})")
elif isinstance(shape, omero.model.PointI):
x = shape.getX().getValue()
y = shape.getY().getValue()
print(f" Point: ({x}, {y})")
elif isinstance(shape, omero.model.LineI):
x1 = shape.getX1().getValue()
y1 = shape.getY1().getValue()
x2 = shape.getX2().getValue()
y2 = shape.getY2().getValue()
print(f" Line: ({x1}, {y1}) to ({x2}, {y2})")
elif isinstance(shape, omero.model.PolygonI):
points = shape.getPoints().getValue()
print(f" Polygon: {points}")
elif isinstance(shape, omero.model.MaskI):
x = shape.getX().getValue()
y = shape.getY().getValue()
width = shape.getWidth().getValue()
height = shape.getHeight().getValue()
print(f" Mask: ({x}, {y}) {width}x{height}")
```
## Analyzing ROI Intensities
### Get Statistics for ROI Shapes
```python
# Get all shapes from ROIs
roi_service = conn.getRoiService()
result = roi_service.findByImage(image_id, None)
shape_ids = []
for roi in result.rois:
for shape in roi.copyShapes():
shape_ids.append(shape.id.val)
# Define position
z, t = 0, 0
channel_index = 0
# Get statistics
stats = roi_service.getShapeStatsRestricted(
shape_ids, z, t, [channel_index]
)
# Display statistics
for i, stat in enumerate(stats):
shape_id = shape_ids[i]
print(f"Shape {shape_id} statistics:")
print(f" Points Count: {stat.pointsCount[channel_index]}")
print(f" Min: {stat.min[channel_index]}")
print(f" Mean: {stat.mean[channel_index]}")
print(f" Max: {stat.max[channel_index]}")
print(f" Sum: {stat.sum[channel_index]}")
print(f" Std Dev: {stat.stdDev[channel_index]}")
```
### Extract Pixel Values Within ROI
```python
import numpy as np
# Get image and ROI
image = conn.getObject("Image", image_id)
result = roi_service.findByImage(image_id, None)
# Get first rectangle shape
roi = result.rois[0]
rect = roi.copyShapes()[0]
# Get rectangle bounds
x = int(rect.getX().getValue())
y = int(rect.getY().getValue())
width = int(rect.getWidth().getValue())
height = int(rect.getHeight().getValue())
z = rect.getTheZ().getValue()
t = rect.getTheT().getValue()
# Get pixel data
pixels = image.getPrimaryPixels()
# Extract region for each channel
for c in range(image.getSizeC()):
# Get plane
plane = pixels.getPlane(z, c, t)
# Extract ROI region
roi_region = plane[y:y+height, x:x+width]
print(f"Channel {c}:")
print(f" Mean intensity: {np.mean(roi_region)}")
print(f" Max intensity: {np.max(roi_region)}")
```
## Modifying ROIs
### Update Shape Properties
```python
# Get ROI and shape
result = roi_service.findByImage(image_id, None)
roi = result.rois[0]
shape = roi.copyShapes()[0]
# Modify shape (example: change rectangle size)
if isinstance(shape, omero.model.RectangleI):
shape.setWidth(rdouble(150))
shape.setHeight(rdouble(100))
shape.setTextValue(rstring("Updated Rectangle"))
# Save changes (roi comes from the ROI service, so it is already a model object)
updateService = conn.getUpdateService()
updated_roi = updateService.saveAndReturnObject(roi)
```
### Remove Shape from ROI
```python
result = roi_service.findByImage(image_id, None)
for roi in result.rois:
for shape in roi.copyShapes():
# Check condition (e.g., remove by label)
if (shape.getTextValue() and
shape.getTextValue().getValue() == "test-Ellipse"):
print(f"Removing shape {shape.getId().getValue()}")
roi.removeShape(shape)
# Save modified ROI
updateService = conn.getUpdateService()
roi = updateService.saveAndReturnObject(roi)
```
## Deleting ROIs
### Delete Single ROI
```python
# Delete ROI by ID
roi_id = 123
conn.deleteObjects("Roi", [roi_id], wait=True)
print(f"Deleted ROI {roi_id}")
```
### Delete All ROIs for Image
```python
# Get all ROI IDs for image
result = roi_service.findByImage(image_id, None)
roi_ids = [roi.getId().getValue() for roi in result.rois]
# Delete all
if roi_ids:
conn.deleteObjects("Roi", roi_ids, wait=True)
print(f"Deleted {len(roi_ids)} ROIs")
```
## Batch ROI Creation
### Create ROIs for Multiple Images
```python
# Get images
dataset = conn.getObject("Dataset", dataset_id)
for image in dataset.listChildren():
# Create rectangle at center of each image
x = image.getSizeX() // 2 - 50
y = image.getSizeY() // 2 - 50
rect = omero.model.RectangleI()
rect.x = rdouble(x)
rect.y = rdouble(y)
rect.width = rdouble(100)
rect.height = rdouble(100)
rect.theZ = rint(0)
rect.theT = rint(0)
rect.textValue = rstring("Auto ROI")
roi = create_roi(conn, image, [rect])
print(f"Created ROI for image {image.getName()}")
```
### Create ROIs Across Z-Stack
```python
image = conn.getObject("Image", image_id)
size_z = image.getSizeZ()
# Create rectangle on each Z-section
shapes = []
for z in range(size_z):
rect = omero.model.RectangleI()
rect.x = rdouble(100)
rect.y = rdouble(100)
rect.width = rdouble(50)
rect.height = rdouble(50)
rect.theZ = rint(z)
rect.theT = rint(0)
shapes.append(rect)
# Single ROI with shapes across Z
roi = create_roi(conn, image, shapes)
```
## Complete Example
```python
from omero.gateway import BlitzGateway
from omero.rtypes import rdouble, rint, rstring
import omero.model
HOST = 'omero.example.com'
PORT = 4064
USERNAME = 'user'
PASSWORD = 'pass'
def rgba_to_int(r, g, b, a=255):
return int.from_bytes([r, g, b, a], byteorder='big', signed=True)
with BlitzGateway(USERNAME, PASSWORD, host=HOST, port=PORT) as conn:
# Get image
image = conn.getObject("Image", image_id)
print(f"Processing: {image.getName()}")
# Create multiple ROIs
updateService = conn.getUpdateService()
# ROI 1: Rectangle
roi1 = omero.model.RoiI()
roi1.setImage(image._obj)
rect = omero.model.RectangleI()
rect.x = rdouble(50)
rect.y = rdouble(50)
rect.width = rdouble(100)
rect.height = rdouble(100)
rect.theZ = rint(0)
rect.theT = rint(0)
rect.textValue = rstring("Cell 1")
rect.strokeColor = rint(rgba_to_int(255, 0, 0, 255))
roi1.addShape(rect)
roi1 = updateService.saveAndReturnObject(roi1)
print(f"Created ROI 1: {roi1.getId().getValue()}")
# ROI 2: Ellipse
roi2 = omero.model.RoiI()
roi2.setImage(image._obj)
ellipse = omero.model.EllipseI()
ellipse.x = rdouble(200)
ellipse.y = rdouble(150)
ellipse.radiusX = rdouble(40)
ellipse.radiusY = rdouble(30)
ellipse.theZ = rint(0)
ellipse.theT = rint(0)
ellipse.textValue = rstring("Cell 2")
ellipse.strokeColor = rint(rgba_to_int(0, 255, 0, 255))
roi2.addShape(ellipse)
roi2 = updateService.saveAndReturnObject(roi2)
print(f"Created ROI 2: {roi2.getId().getValue()}")
# Retrieve and analyze
roi_service = conn.getRoiService()
result = roi_service.findByImage(image_id, None)
shape_ids = []
for roi in result.rois:
for shape in roi.copyShapes():
shape_ids.append(shape.id.val)
# Get statistics
stats = roi_service.getShapeStatsRestricted(shape_ids, 0, 0, [0])
for i, stat in enumerate(stats):
print(f"Shape {shape_ids[i]}:")
print(f" Mean intensity: {stat.mean[0]:.2f}")
```
## Best Practices
1. **Organize Shapes**: Group related shapes in single ROIs
2. **Label Shapes**: Use textValue for identification
3. **Set Z and T**: Always specify Z-section and timepoint
4. **Color Coding**: Use consistent colors for shape types
5. **Validate Coordinates**: Ensure shapes are within image bounds
6. **Batch Creation**: Create multiple ROIs in single transaction when possible
7. **Delete Unused**: Remove temporary or test ROIs
8. **Export Data**: Store ROI statistics in tables for later analysis
9. **Version Control**: Document ROI creation methods in annotations
10. **Performance**: Use shape statistics service instead of manual pixel extraction

View File

@@ -0,0 +1,637 @@
# Scripts & Batch Operations
This reference covers creating OMERO.scripts for server-side processing and batch operations.
## OMERO.scripts Overview
OMERO.scripts are Python scripts that run on the OMERO server and can be called from OMERO clients (web, insight, CLI). They function as plugins that extend OMERO functionality.
### Key Features
- **Server-Side Execution**: Scripts run on the server, avoiding data transfer
- **Client Integration**: Callable from any OMERO client with auto-generated UI
- **Parameter Handling**: Define input parameters with validation
- **Result Reporting**: Return images, files, or messages to clients
- **Batch Processing**: Process multiple images or datasets efficiently
## Basic Script Structure
### Minimal Script Template
```python
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import omero
from omero.gateway import BlitzGateway
import omero.scripts as scripts
from omero.rtypes import rlong, rstring, robject
def run_script():
"""
Main script function.
"""
# Script definition
client = scripts.client(
'Script_Name.py',
"""
Description of what this script does.
""",
# Input parameters
scripts.String("Data_Type", optional=False, grouping="1",
description="Choose source of images",
values=[rstring('Dataset'), rstring('Image')],
default=rstring('Dataset')),
scripts.Long("IDs", optional=False, grouping="2",
description="Dataset or Image ID(s)").ofType(rlong(0)),
# Outputs
namespaces=[omero.constants.namespaces.NSDYNAMIC],
version="1.0"
)
try:
# Get connection
conn = BlitzGateway(client_obj=client)
# Get script parameters
script_params = client.getInputs(unwrap=True)
data_type = script_params["Data_Type"]
ids = script_params["IDs"]
# Process data
message = process_data(conn, data_type, ids)
# Return results
client.setOutput("Message", rstring(message))
finally:
client.closeSession()
def process_data(conn, data_type, ids):
"""
Process images based on parameters.
"""
# Implementation here
return "Processing complete"
if __name__ == "__main__":
run_script()
```
## Script Parameters
### Parameter Types
```python
# String parameter
scripts.String("Name", optional=False,
description="Enter a name")
# String with choices
scripts.String("Mode", optional=False,
values=[rstring('Fast'), rstring('Accurate')],
default=rstring('Fast'))
# Integer parameter
scripts.Long("ImageID", optional=False,
description="Image to process").ofType(rlong(0))
# List of integers
scripts.List("ImageIDs", optional=False,
description="Multiple images").ofType(rlong(0))
# Float parameter
scripts.Float("Threshold", optional=True,
description="Threshold value",
min=0.0, max=1.0, default=0.5)
# Boolean parameter
scripts.Bool("SaveResults", optional=True,
description="Save results to OMERO",
default=True)
```
### Parameter Grouping
```python
# Group related parameters
scripts.String("Data_Type", grouping="1",
description="Source type",
values=[rstring('Dataset'), rstring('Image')])
scripts.Long("Dataset_ID", grouping="1.1",
description="Dataset ID").ofType(rlong(0))
scripts.List("Image_IDs", grouping="1.2",
description="Image IDs").ofType(rlong(0))
```
## Accessing Input Data
### Get Script Parameters
```python
# Inside run_script()
client = scripts.client(...)
# Get parameters as Python objects
script_params = client.getInputs(unwrap=True)
# Access individual parameters
data_type = script_params.get("Data_Type", "Image")
image_ids = script_params.get("Image_IDs", [])
threshold = script_params.get("Threshold", 0.5)
save_results = script_params.get("SaveResults", True)
```
### Get Images from Parameters
```python
def get_images_from_params(conn, script_params):
"""
Get image objects based on script parameters.
"""
images = []
data_type = script_params["Data_Type"]
if data_type == "Dataset":
dataset_id = script_params["Dataset_ID"]
dataset = conn.getObject("Dataset", dataset_id)
if dataset:
images = list(dataset.listChildren())
elif data_type == "Image":
image_ids = script_params["Image_IDs"]
for image_id in image_ids:
image = conn.getObject("Image", image_id)
if image:
images.append(image)
return images
```
## Processing Images
### Batch Image Processing
```python
import numpy as np

def process_images(conn, images, threshold):
    """
    Process multiple images (count_features is a placeholder for
    your own feature-counting function).
    """
    results = []
for image in images:
print(f"Processing: {image.getName()}")
# Get pixel data
pixels = image.getPrimaryPixels()
size_z = image.getSizeZ()
size_c = image.getSizeC()
size_t = image.getSizeT()
# Process each plane
for z in range(size_z):
for c in range(size_c):
for t in range(size_t):
plane = pixels.getPlane(z, c, t)
# Apply threshold
binary = (plane > threshold).astype(np.uint8)
# Count features
feature_count = count_features(binary)
results.append({
'image_id': image.getId(),
'image_name': image.getName(),
'z': z, 'c': c, 't': t,
'feature_count': feature_count
})
return results
```
## Generating Outputs
### Return Messages
```python
# Simple message
message = "Processed 10 images successfully"
client.setOutput("Message", rstring(message))
# Detailed message
message = "Results:\n"
for result in results:
message += f"Image {result['image_id']}: {result['count']} cells\n"
client.setOutput("Message", rstring(message))
```
### Return Images
```python
# Return newly created image
new_image = conn.createImageFromNumpySeq(...)
client.setOutput("New_Image", robject(new_image._obj))
```
### Return Files
```python
# Create and return file annotation
file_ann = conn.createFileAnnfromLocalFile(
output_file_path,
mimetype="text/csv",
ns="analysis.results"
)
client.setOutput("Result_File", robject(file_ann._obj))
```
### Return Tables
```python
# Create OMERO table and return
resources = conn.c.sf.sharedResources()
table = create_results_table(resources, results)
orig_file = table.getOriginalFile()
table.close()
# Create file annotation
file_ann = omero.model.FileAnnotationI()
file_ann.setFile(orig_file)
file_ann = conn.getUpdateService().saveAndReturnObject(file_ann)
client.setOutput("Results_Table", robject(file_ann._obj))
```
## Complete Example Scripts
### Example 1: Maximum Intensity Projection
```python
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import omero
from omero.gateway import BlitzGateway
import omero.scripts as scripts
from omero.rtypes import rlong, rstring, robject
import numpy as np
def run_script():
client = scripts.client(
'Maximum_Intensity_Projection.py',
"""
Creates maximum intensity projection from Z-stack images.
""",
scripts.String("Data_Type", optional=False, grouping="1",
description="Process images from",
values=[rstring('Dataset'), rstring('Image')],
default=rstring('Image')),
scripts.List("IDs", optional=False, grouping="2",
description="Dataset or Image ID(s)").ofType(rlong(0)),
scripts.Bool("Link_to_Source", optional=True, grouping="3",
description="Link results to source dataset",
default=True),
version="1.0"
)
try:
conn = BlitzGateway(client_obj=client)
script_params = client.getInputs(unwrap=True)
# Get images
images = get_images(conn, script_params)
created_images = []
for image in images:
print(f"Processing: {image.getName()}")
# Create MIP
mip_image = create_mip(conn, image)
if mip_image:
created_images.append(mip_image)
# Report results
if created_images:
message = f"Created {len(created_images)} MIP images"
# Return first image for display
client.setOutput("Message", rstring(message))
client.setOutput("Result", robject(created_images[0]._obj))
else:
client.setOutput("Message", rstring("No images created"))
finally:
client.closeSession()
def get_images(conn, script_params):
"""Get images from script parameters."""
images = []
data_type = script_params["Data_Type"]
ids = script_params["IDs"]
if data_type == "Dataset":
for dataset_id in ids:
dataset = conn.getObject("Dataset", dataset_id)
if dataset:
images.extend(list(dataset.listChildren()))
else:
for image_id in ids:
image = conn.getObject("Image", image_id)
if image:
images.append(image)
return images
def create_mip(conn, source_image):
"""Create maximum intensity projection."""
pixels = source_image.getPrimaryPixels()
size_z = source_image.getSizeZ()
size_c = source_image.getSizeC()
size_t = source_image.getSizeT()
if size_z == 1:
print(" Skipping (single Z-section)")
return None
def plane_gen():
for c in range(size_c):
for t in range(size_t):
# Get Z-stack
z_stack = []
for z in range(size_z):
plane = pixels.getPlane(z, c, t)
z_stack.append(plane)
# Maximum projection
max_proj = np.max(z_stack, axis=0)
yield max_proj
# Create new image
new_image = conn.createImageFromNumpySeq(
plane_gen(),
f"{source_image.getName()}_MIP",
1, size_c, size_t,
description="Maximum intensity projection",
dataset=source_image.getParent()
)
return new_image
if __name__ == "__main__":
run_script()
```
### Example 2: Batch ROI Analysis
```python
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import omero
from omero.gateway import BlitzGateway
import omero.scripts as scripts
from omero.rtypes import rlong, rstring, robject
import omero.grid
def run_script():
client = scripts.client(
'Batch_ROI_Analysis.py',
"""
Analyzes ROIs across multiple images and creates results table.
""",
scripts.Long("Dataset_ID", optional=False,
description="Dataset with images and ROIs").ofType(rlong(0)),
scripts.Long("Channel_Index", optional=True,
description="Channel to analyze (0-indexed)",
default=0, min=0),
version="1.0"
)
try:
conn = BlitzGateway(client_obj=client)
script_params = client.getInputs(unwrap=True)
dataset_id = script_params["Dataset_ID"]
channel_index = script_params["Channel_Index"]
# Get dataset
dataset = conn.getObject("Dataset", dataset_id)
if not dataset:
client.setOutput("Message", rstring("Dataset not found"))
return
# Analyze ROIs
results = analyze_rois(conn, dataset, channel_index)
# Create table
table_file = create_results_table(conn, dataset, results)
# Report
message = f"Analyzed {len(results)} ROIs from {dataset.getName()}"
client.setOutput("Message", rstring(message))
client.setOutput("Results_Table", robject(table_file._obj))
finally:
client.closeSession()
def analyze_rois(conn, dataset, channel_index):
"""Analyze all ROIs in dataset images."""
roi_service = conn.getRoiService()
results = []
for image in dataset.listChildren():
result = roi_service.findByImage(image.getId(), None)
if not result.rois:
continue
# Get shape IDs
shape_ids = []
for roi in result.rois:
for shape in roi.copyShapes():
shape_ids.append(shape.id.val)
# Get statistics
stats = roi_service.getShapeStatsRestricted(
shape_ids, 0, 0, [channel_index]
)
# Store results
for i, stat in enumerate(stats):
results.append({
'image_id': image.getId(),
'image_name': image.getName(),
'shape_id': shape_ids[i],
'mean': stat.mean[channel_index],
'min': stat.min[channel_index],
'max': stat.max[channel_index],
'sum': stat.sum[channel_index],
'area': stat.pointsCount[channel_index]
})
return results
def create_results_table(conn, dataset, results):
"""Create OMERO table from results."""
# Prepare data
image_ids = [r['image_id'] for r in results]
shape_ids = [r['shape_id'] for r in results]
means = [r['mean'] for r in results]
mins = [r['min'] for r in results]
maxs = [r['max'] for r in results]
sums = [r['sum'] for r in results]
areas = [r['area'] for r in results]
# Create table
resources = conn.c.sf.sharedResources()
repository_id = resources.repositories().descriptions[0].getId().getValue()
table = resources.newTable(repository_id, f"ROI_Analysis_{dataset.getId()}")
# Define columns
columns = [
omero.grid.ImageColumn('Image', 'Source image', []),
omero.grid.LongColumn('ShapeID', 'ROI shape ID', []),
omero.grid.DoubleColumn('Mean', 'Mean intensity', []),
omero.grid.DoubleColumn('Min', 'Min intensity', []),
omero.grid.DoubleColumn('Max', 'Max intensity', []),
omero.grid.DoubleColumn('Sum', 'Integrated density', []),
omero.grid.LongColumn('Area', 'Area in pixels', [])
]
table.initialize(columns)
# Add data
data = [
omero.grid.ImageColumn('Image', 'Source image', image_ids),
omero.grid.LongColumn('ShapeID', 'ROI shape ID', shape_ids),
omero.grid.DoubleColumn('Mean', 'Mean intensity', means),
omero.grid.DoubleColumn('Min', 'Min intensity', mins),
omero.grid.DoubleColumn('Max', 'Max intensity', maxs),
omero.grid.DoubleColumn('Sum', 'Integrated density', sums),
omero.grid.LongColumn('Area', 'Area in pixels', areas)
]
table.addData(data)
orig_file = table.getOriginalFile()
table.close()
# Link to dataset
file_ann = omero.model.FileAnnotationI()
file_ann.setFile(orig_file)
file_ann = conn.getUpdateService().saveAndReturnObject(file_ann)
link = omero.model.DatasetAnnotationLinkI()
link.setParent(dataset._obj)
link.setChild(file_ann)
conn.getUpdateService().saveAndReturnObject(link)
return file_ann
if __name__ == "__main__":
run_script()
```
## Script Deployment
### Installation Location
Scripts should be placed in the OMERO server scripts directory:
```
OMERO_DIR/lib/scripts/
```
### Recommended Structure
```
lib/scripts/
├── analysis/
│ ├── Cell_Counter.py
│ └── ROI_Analyzer.py
├── export/
│ ├── Export_Images.py
│ └── Export_ROIs.py
└── util/
└── Helper_Functions.py
```
### Testing Scripts
```bash
# Test script syntax
python Script_Name.py
# Upload to OMERO
omero script upload Script_Name.py
# List scripts
omero script list
# Run script from CLI
omero script launch Script_ID Dataset_ID=123
```
## Best Practices
1. **Error Handling**: Always use try-finally to close session
2. **Progress Updates**: Print status messages for long operations
3. **Parameter Validation**: Check parameters before processing
4. **Memory Management**: Process large datasets in batches
5. **Documentation**: Include clear description and parameter docs
6. **Versioning**: Include version number in script
7. **Namespaces**: Use appropriate namespaces for outputs
8. **Return Objects**: Return created objects for client display
9. **Logging**: Use print() for server logs
10. **Testing**: Test with various input combinations
## Common Patterns
### Progress Reporting
```python
total = len(images)
for idx, image in enumerate(images):
print(f"Processing {idx + 1}/{total}: {image.getName()}")
# Process image
```
### Error Collection
```python
errors = []
for image in images:
try:
process_image(image)
except Exception as e:
errors.append(f"{image.getName()}: {str(e)}")
if errors:
message = "Completed with errors:\n" + "\n".join(errors)
else:
message = "All images processed successfully"
```
### Resource Cleanup
```python
try:
# Script processing
pass
finally:
# Clean up temporary files
if os.path.exists(temp_file):
os.remove(temp_file)
client.closeSession()
```

View File

@@ -0,0 +1,532 @@
# OMERO Tables
This reference covers creating and managing structured tabular data in OMERO using OMERO.tables.
## OMERO.tables Overview
OMERO.tables provides a way to store structured tabular data associated with OMERO objects. Tables are stored as HDF5 files and can be queried efficiently. Common use cases include:
- Storing quantitative measurements from images
- Recording analysis results
- Tracking experimental metadata
- Linking measurements to specific images or ROIs
## Column Types
OMERO.tables supports various column types:
- **LongColumn**: Integer values (64-bit)
- **DoubleColumn**: Floating-point values
- **StringColumn**: Text data (fixed max length)
- **BoolColumn**: Boolean values
- **LongArrayColumn**: Arrays of integers
- **DoubleArrayColumn**: Arrays of floats
- **FileColumn**: References to OMERO files
- **ImageColumn**: References to OMERO images
- **RoiColumn**: References to OMERO ROIs
- **WellColumn**: References to OMERO wells
## Creating Tables
### Basic Table Creation
```python
from random import random
import omero.grid
# Create unique table name
table_name = f"MyAnalysisTable_{random()}"
# Define columns (empty data for initialization)
col1 = omero.grid.LongColumn('ImageID', 'Image identifier', [])
col2 = omero.grid.DoubleColumn('MeanIntensity', 'Mean pixel intensity', [])
col3 = omero.grid.StringColumn('Category', 'Classification', 64, [])
columns = [col1, col2, col3]
# Get resources and create table
resources = conn.c.sf.sharedResources()
repository_id = resources.repositories().descriptions[0].getId().getValue()
table = resources.newTable(repository_id, table_name)
# Initialize table with column definitions
table.initialize(columns)
```
### Add Data to Table
```python
# Prepare data
image_ids = [1, 2, 3, 4, 5]
intensities = [123.4, 145.2, 98.7, 156.3, 132.8]
categories = ["Good", "Good", "Poor", "Excellent", "Good"]
# Create data columns
data_col1 = omero.grid.LongColumn('ImageID', 'Image identifier', image_ids)
data_col2 = omero.grid.DoubleColumn('MeanIntensity', 'Mean pixel intensity', intensities)
data_col3 = omero.grid.StringColumn('Category', 'Classification', 64, categories)
data = [data_col1, data_col2, data_col3]
# Add data to table
table.addData(data)
# Get file reference
orig_file = table.getOriginalFile()
table.close() # Always close table when done
```
### Link Table to Dataset
```python
# Create file annotation from table
orig_file_id = orig_file.id.val
file_ann = omero.model.FileAnnotationI()
file_ann.setFile(omero.model.OriginalFileI(orig_file_id, False))
file_ann = conn.getUpdateService().saveAndReturnObject(file_ann)
# Link to dataset
link = omero.model.DatasetAnnotationLinkI()
link.setParent(omero.model.DatasetI(dataset_id, False))
link.setChild(omero.model.FileAnnotationI(file_ann.getId().getValue(), False))
conn.getUpdateService().saveAndReturnObject(link)
print(f"Linked table to dataset {dataset_id}")
```
## Column Types in Detail
### Long Column (Integers)
```python
# Column for integer values
image_ids = [101, 102, 103, 104, 105]
col = omero.grid.LongColumn('ImageID', 'Image identifier', image_ids)
```
### Double Column (Floats)
```python
# Column for floating-point values
measurements = [12.34, 56.78, 90.12, 34.56, 78.90]
col = omero.grid.DoubleColumn('Measurement', 'Value in microns', measurements)
```
### String Column (Text)
```python
# Column for text (max length required)
labels = ["Control", "Treatment A", "Treatment B", "Control", "Treatment A"]
col = omero.grid.StringColumn('Condition', 'Experimental condition', 64, labels)
```
### Boolean Column
```python
# Column for boolean values
flags = [True, False, True, True, False]
col = omero.grid.BoolColumn('QualityPass', 'Passes quality control', flags)
```
### Image Column (References to Images)
```python
# Column linking to OMERO images
image_ids = [101, 102, 103, 104, 105]
col = omero.grid.ImageColumn('Image', 'Source image', image_ids)
```
### ROI Column (References to ROIs)
```python
# Column linking to OMERO ROIs
roi_ids = [201, 202, 203, 204, 205]
col = omero.grid.RoiColumn('ROI', 'Associated ROI', roi_ids)
```
### Array Columns
```python
# Column for arrays of doubles
histogram_data = [
[10, 20, 30, 40],
[15, 25, 35, 45],
[12, 22, 32, 42]
]
col = omero.grid.DoubleArrayColumn('Histogram', 'Intensity histogram', histogram_data)
# Column for arrays of longs
bin_counts = [[5, 10, 15], [8, 12, 16], [6, 11, 14]]
col = omero.grid.LongArrayColumn('Bins', 'Histogram bins', bin_counts)
```
## Reading Table Data
### Open Existing Table
```python
# Get table file by name
orig_table_file = conn.getObject("OriginalFile",
attributes={'name': table_name})
# Open table
resources = conn.c.sf.sharedResources()
table = resources.openTable(orig_table_file._obj)
print(f"Opened table: {table.getOriginalFile().getName().getValue()}")
print(f"Number of rows: {table.getNumberOfRows()}")
```
### Read All Data
```python
# Get column headers
print("Columns:")
for col in table.getHeaders():
print(f" {col.name}: {col.description}")
# Read all data
row_count = table.getNumberOfRows()
data = table.readCoordinates(range(row_count))
# Display data
for col in data.columns:
print(f"\nColumn: {col.name}")
for value in col.values:
print(f" {value}")
table.close()
```
### Read Specific Rows
```python
# Read rows 10-20
start = 10
stop = 20
column_count = len(table.getHeaders())
data = table.read(list(range(column_count)), start, stop)
for col in data.columns:
print(f"Column: {col.name}")
for value in col.values:
print(f" {value}")
```
### Read Specific Columns
```python
# Read only columns 0 and 2
column_indices = [0, 2]
start = 0
stop = table.getNumberOfRows()
data = table.read(column_indices, start, stop)
for col in data.columns:
print(f"Column: {col.name}")
print(f"Values: {col.values}")
```
## Querying Tables
### Query with Conditions
```python
# Query rows where MeanIntensity > 100
row_count = table.getNumberOfRows()
query_rows = table.getWhereList(
"(MeanIntensity > 100)",
variables={},
start=0,
stop=row_count,
step=0
)
print(f"Found {len(query_rows)} matching rows")
# Read matching rows
data = table.readCoordinates(query_rows)
for col in data.columns:
print(f"\n{col.name}:")
for value in col.values:
print(f" {value}")
```
### Complex Queries
```python
# Multiple conditions with AND
query_rows = table.getWhereList(
"(MeanIntensity > 100) & (MeanIntensity < 150)",
variables={},
start=0,
stop=row_count,
step=0
)
# Multiple conditions with OR
query_rows = table.getWhereList(
"(Category == 'Good') | (Category == 'Excellent')",
variables={},
start=0,
stop=row_count,
step=0
)
# String matching
query_rows = table.getWhereList(
"(Category == 'Good')",
variables={},
start=0,
stop=row_count,
step=0
)
```
## Complete Example: Image Analysis Results
```python
from omero.gateway import BlitzGateway
import omero.grid
import omero.model
import numpy as np
HOST = 'omero.example.com'
PORT = 4064
USERNAME = 'user'
PASSWORD = 'pass'
dataset_id = 123  # ID of the dataset to analyze
with BlitzGateway(USERNAME, PASSWORD, host=HOST, port=PORT) as conn:
# Get dataset
dataset = conn.getObject("Dataset", dataset_id)
print(f"Analyzing dataset: {dataset.getName()}")
# Collect measurements from images
image_ids = []
mean_intensities = []
max_intensities = []
cell_counts = []
for image in dataset.listChildren():
image_ids.append(image.getId())
# Get pixel data
pixels = image.getPrimaryPixels()
plane = pixels.getPlane(0, 0, 0) # Z=0, C=0, T=0
# Calculate statistics
mean_intensities.append(float(np.mean(plane)))
max_intensities.append(float(np.max(plane)))
# Simulate cell count (would be from actual analysis)
cell_counts.append(np.random.randint(50, 200))
# Create table
table_name = f"Analysis_Results_{dataset.getId()}"
# Define columns
col1 = omero.grid.ImageColumn('Image', 'Source image', [])
col2 = omero.grid.DoubleColumn('MeanIntensity', 'Mean pixel value', [])
col3 = omero.grid.DoubleColumn('MaxIntensity', 'Maximum pixel value', [])
col4 = omero.grid.LongColumn('CellCount', 'Number of cells detected', [])
# Initialize table
resources = conn.c.sf.sharedResources()
repository_id = resources.repositories().descriptions[0].getId().getValue()
table = resources.newTable(repository_id, table_name)
table.initialize([col1, col2, col3, col4])
# Add data
data_col1 = omero.grid.ImageColumn('Image', 'Source image', image_ids)
data_col2 = omero.grid.DoubleColumn('MeanIntensity', 'Mean pixel value',
mean_intensities)
data_col3 = omero.grid.DoubleColumn('MaxIntensity', 'Maximum pixel value',
max_intensities)
data_col4 = omero.grid.LongColumn('CellCount', 'Number of cells detected',
cell_counts)
table.addData([data_col1, data_col2, data_col3, data_col4])
# Get file and close table
orig_file = table.getOriginalFile()
table.close()
# Link to dataset
orig_file_id = orig_file.id.val
file_ann = omero.model.FileAnnotationI()
file_ann.setFile(omero.model.OriginalFileI(orig_file_id, False))
file_ann = conn.getUpdateService().saveAndReturnObject(file_ann)
link = omero.model.DatasetAnnotationLinkI()
link.setParent(omero.model.DatasetI(dataset_id, False))
link.setChild(omero.model.FileAnnotationI(file_ann.getId().getValue(), False))
conn.getUpdateService().saveAndReturnObject(link)
print(f"Created and linked table with {len(image_ids)} rows")
# Query results
table = resources.openTable(orig_file)
high_cell_count_rows = table.getWhereList(
"(CellCount > 100)",
variables={},
start=0,
stop=table.getNumberOfRows(),
step=0
)
print(f"Images with >100 cells: {len(high_cell_count_rows)}")
# Read those rows
data = table.readCoordinates(high_cell_count_rows)
for i in range(len(high_cell_count_rows)):
img_id = data.columns[0].values[i]
count = data.columns[3].values[i]
print(f" Image {img_id}: {count} cells")
table.close()
```
## Retrieve Tables from Objects
### Find Tables Attached to Dataset
```python
# Get dataset
dataset = conn.getObject("Dataset", dataset_id)
# List file annotations
for ann in dataset.listAnnotations():
if isinstance(ann, omero.gateway.FileAnnotationWrapper):
file_obj = ann.getFile()
file_name = file_obj.getName()
# Check if it's a table (might have specific naming pattern)
if "Table" in file_name or file_name.endswith(".h5"):
print(f"Found table: {file_name} (ID: {file_obj.getId()})")
# Open and inspect
resources = conn.c.sf.sharedResources()
table = resources.openTable(file_obj._obj)
print(f" Rows: {table.getNumberOfRows()}")
print(f" Columns:")
for col in table.getHeaders():
print(f" {col.name}")
table.close()
```
## Updating Tables
### Append Rows
```python
# Open existing table
resources = conn.c.sf.sharedResources()
table = resources.openTable(orig_file._obj)
# Prepare new data
new_image_ids = [106, 107]
new_intensities = [88.9, 92.3]
new_categories = ["Good", "Excellent"]
# Create data columns
data_col1 = omero.grid.LongColumn('ImageID', '', new_image_ids)
data_col2 = omero.grid.DoubleColumn('MeanIntensity', '', new_intensities)
data_col3 = omero.grid.StringColumn('Category', '', 64, new_categories)
# Append data
table.addData([data_col1, data_col2, data_col3])
print(f"New row count: {table.getNumberOfRows()}")
table.close()
```
## Deleting Tables
### Delete Table File
```python
# Get file object
orig_file = conn.getObject("OriginalFile", file_id)
# Delete file (also deletes table)
conn.deleteObjects("OriginalFile", [file_id], wait=True)
print(f"Deleted table file {file_id}")
```
### Unlink Table from Object
```python
# Find annotation links
dataset = conn.getObject("Dataset", dataset_id)
for ann in dataset.listAnnotations():
if isinstance(ann, omero.gateway.FileAnnotationWrapper):
if "Table" in ann.getFile().getName():
# Delete link (keeps table, removes association)
conn.deleteObjects("DatasetAnnotationLink",
[ann.link.getId()],
wait=True)
print(f"Unlinked table from dataset")
```
## Best Practices
1. **Descriptive Names**: Use meaningful table and column names
2. **Close Tables**: Always close tables after use
3. **String Length**: Set appropriate max length for string columns
4. **Link to Objects**: Attach tables to relevant datasets or projects
5. **Use References**: Use ImageColumn, RoiColumn for object references
6. **Query Efficiently**: Use getWhereList() instead of reading all data
7. **Document**: Add descriptions to columns
8. **Version Control**: Include version info in table name or metadata
9. **Batch Operations**: Add data in batches for better performance
10. **Error Handling**: Check for None returns and handle exceptions
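As a small sketch combining items 2 (close tables) and 9 (batch operations) above, appending rows in batches inside a try/finally block might look like this; the helper name and the ImageID/MeanIntensity layout are illustrative and assume the table was created with matching columns.
```python
import omero.grid

def append_rows_in_batches(resources, orig_file, rows, batch_size=1000):
    """Append (image_id, mean_intensity) rows to an existing table in batches.

    `resources` is conn.c.sf.sharedResources() and `orig_file` is the table's
    OriginalFile; the column layout must match the table's definition.
    """
    table = resources.openTable(orig_file)
    try:
        for start in range(0, len(rows), batch_size):
            batch = rows[start:start + batch_size]
            table.addData([
                omero.grid.LongColumn('ImageID', '', [r[0] for r in batch]),
                omero.grid.DoubleColumn('MeanIntensity', '', [r[1] for r in batch]),
            ])
            print(f"Appended rows {start}-{start + len(batch) - 1}")
    finally:
        table.close()  # always release the table handle, even on error
```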
## Common Patterns
### ROI Measurements Table
```python
# Table structure for ROI measurements
columns = [
omero.grid.ImageColumn('Image', 'Source image', []),
omero.grid.RoiColumn('ROI', 'Measured ROI', []),
omero.grid.LongColumn('ChannelIndex', 'Channel number', []),
omero.grid.DoubleColumn('Area', 'ROI area in pixels', []),
omero.grid.DoubleColumn('MeanIntensity', 'Mean intensity', []),
omero.grid.DoubleColumn('IntegratedDensity', 'Sum of intensities', []),
omero.grid.StringColumn('CellType', 'Cell classification', 32, [])
]
```
### Time Series Data Table
```python
# Table structure for time series measurements
columns = [
omero.grid.ImageColumn('Image', 'Time series image', []),
omero.grid.LongColumn('Timepoint', 'Time index', []),
omero.grid.DoubleColumn('Timestamp', 'Time in seconds', []),
omero.grid.DoubleColumn('Value', 'Measured value', []),
omero.grid.StringColumn('Measurement', 'Type of measurement', 64, [])
]
```
### Screening Results Table
```python
# Table structure for screening plate analysis
columns = [
omero.grid.WellColumn('Well', 'Plate well', []),
omero.grid.LongColumn('FieldIndex', 'Field number', []),
omero.grid.DoubleColumn('CellCount', 'Number of cells', []),
omero.grid.DoubleColumn('Viability', 'Percent viable', []),
omero.grid.StringColumn('Phenotype', 'Observed phenotype', 128, []),
omero.grid.BoolColumn('Hit', 'Hit in screen', [])
]
```
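Any of these layouts can be turned into a table with the same initialize/addData flow shown earlier; a brief sketch using whichever list is bound to `columns` (the table name below is illustrative):
```python
# Create and initialize a table from one of the column layouts above
resources = conn.c.sf.sharedResources()
repository_id = resources.repositories().descriptions[0].getId().getValue()
table = resources.newTable(repository_id, "Screening_Results_v1")
try:
    table.initialize(columns)
    # addData() calls follow, with one populated data column per column defined above
    orig_file = table.getOriginalFile()
finally:
    table.close()
```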

View File

@@ -0,0 +1,555 @@
---
name: opentrons-integration
description: Toolkit for creating, editing, and debugging Opentrons Python Protocol API v2 protocols for laboratory automation. This skill should be used when working with Opentrons Flex or OT-2 robots, writing liquid handling protocols, automating pipetting tasks, controlling hardware modules (heater-shaker, temperature, magnetic, thermocycler, absorbance plate reader), managing labware and deck layouts, or performing any laboratory automation tasks using the Opentrons platform. Use this skill for protocol development, troubleshooting, simulation, and optimizing automated workflows for biological and chemical experiments.
---
# Opentrons Integration
## Overview
Opentrons provides a Python-based automation platform for laboratory protocols using Flex and OT-2 robots. This skill enables creation and management of Python Protocol API v2 protocols for automated liquid handling, hardware module control, and complex laboratory workflows.
## Core Capabilities
### 1. Protocol Structure and Metadata
Every Opentrons protocol follows a standard structure:
```python
from opentrons import protocol_api
# Metadata
metadata = {
'protocolName': 'My Protocol',
'author': 'Name <email@example.com>',
'description': 'Protocol description',
'apiLevel': '2.19' # Use latest available API version
}
# Requirements (optional)
requirements = {
'robotType': 'Flex', # or 'OT-2'
'apiLevel': '2.19'
}
# Run function
def run(protocol: protocol_api.ProtocolContext):
# Protocol commands go here
pass
```
**Key elements:**
- Import `protocol_api` from `opentrons`
- Define `metadata` dict with protocolName, author, description, apiLevel
- Optional `requirements` dict for robot type and API version
- Implement `run()` function receiving `ProtocolContext` as parameter
- All protocol logic goes inside the `run()` function
### 2. Loading Hardware
**Loading Instruments (Pipettes):**
```python
def run(protocol: protocol_api.ProtocolContext):
# Load pipette on specific mount
left_pipette = protocol.load_instrument(
'p1000_single_flex', # Instrument name
'left', # Mount: 'left' or 'right'
tip_racks=[tip_rack] # List of tip rack labware objects
)
```
Common pipette names:
- Flex: `p50_single_flex`, `p1000_single_flex`, `p50_multi_flex`, `p1000_multi_flex`
- OT-2: `p20_single_gen2`, `p300_single_gen2`, `p1000_single_gen2`, `p20_multi_gen2`, `p300_multi_gen2`
**Loading Labware:**
```python
# Load labware directly on deck
plate = protocol.load_labware(
'corning_96_wellplate_360ul_flat', # Labware API name
'D1', # Deck slot (Flex: A1-D3, OT-2: 1-11)
label='Sample Plate' # Optional display label
)
# Load tip rack
tip_rack = protocol.load_labware('opentrons_flex_96_tiprack_1000ul', 'C1')
# Load labware on adapter
adapter = protocol.load_adapter('opentrons_flex_96_tiprack_adapter', 'B1')
tips = adapter.load_labware('opentrons_flex_96_tiprack_200ul')
```
**Loading Modules:**
```python
# Temperature module
temp_module = protocol.load_module('temperature module gen2', 'D3')
temp_plate = temp_module.load_labware('corning_96_wellplate_360ul_flat')
# Magnetic module
mag_module = protocol.load_module('magnetic module gen2', 'C2')
mag_plate = mag_module.load_labware('nest_96_wellplate_100ul_pcr_full_skirt')
# Heater-Shaker module
hs_module = protocol.load_module('heaterShakerModuleV1', 'D1')
hs_plate = hs_module.load_labware('corning_96_wellplate_360ul_flat')
# Thermocycler module (takes up specific slots automatically)
tc_module = protocol.load_module('thermocyclerModuleV2')
tc_plate = tc_module.load_labware('nest_96_wellplate_100ul_pcr_full_skirt')
```
### 3. Liquid Handling Operations
**Basic Operations:**
```python
# Pick up tip
pipette.pick_up_tip()
# Aspirate (draw liquid in)
pipette.aspirate(
volume=100, # Volume in µL
location=source['A1'] # Well or location object
)
# Dispense (expel liquid)
pipette.dispense(
volume=100,
location=dest['B1']
)
# Drop tip
pipette.drop_tip()
# Return tip to rack
pipette.return_tip()
```
**Complex Operations:**
```python
# Transfer (combines pick_up, aspirate, dispense, drop_tip)
pipette.transfer(
volume=100,
source=source_plate['A1'],
dest=dest_plate['B1'],
new_tip='always' # 'always', 'once', or 'never'
)
# Distribute (one source to multiple destinations)
pipette.distribute(
volume=50,
source=reservoir['A1'],
dest=[plate['A1'], plate['A2'], plate['A3']],
new_tip='once'
)
# Consolidate (multiple sources to one destination)
pipette.consolidate(
volume=50,
source=[plate['A1'], plate['A2'], plate['A3']],
dest=reservoir['A1'],
new_tip='once'
)
```
**Advanced Techniques:**
```python
# Mix (aspirate and dispense in same location)
pipette.mix(
repetitions=3,
volume=50,
location=plate['A1']
)
# Air gap (prevent dripping)
pipette.aspirate(100, source['A1'])
pipette.air_gap(20) # 20µL air gap
pipette.dispense(120, dest['A1'])
# Blow out (expel remaining liquid)
pipette.blow_out(location=dest['A1'].top())
# Touch tip (remove droplets on tip exterior)
pipette.touch_tip(location=plate['A1'])
```
**Flow Rate Control:**
```python
# Set flow rates (µL/s)
pipette.flow_rate.aspirate = 150
pipette.flow_rate.dispense = 300
pipette.flow_rate.blow_out = 400
```
### 4. Accessing Wells and Locations
**Well Access Methods:**
```python
# By name
well_a1 = plate['A1']
# By index
first_well = plate.wells()[0]
# All wells
all_wells = plate.wells() # Returns list
# By rows
rows = plate.rows() # Returns list of lists
row_a = plate.rows()[0] # All wells in row A
# By columns
columns = plate.columns() # Returns list of lists
column_1 = plate.columns()[0] # All wells in column 1
# Wells by name (dictionary)
wells_dict = plate.wells_by_name() # {'A1': Well, 'A2': Well, ...}
```
**Location Methods:**
```python
# Top of well (well.top() is the top center; z offsets are in mm)
pipette.aspirate(100, well.top())
pipette.aspirate(100, well.top(z=5)) # 5mm above top
# Bottom of well (well.bottom() is the bottom center; z offsets are in mm)
pipette.aspirate(100, well.bottom())
pipette.aspirate(100, well.bottom(z=2)) # 2mm above bottom
# Center of well
pipette.aspirate(100, well.center())
```
### 5. Hardware Module Control
**Temperature Module:**
```python
# Set temperature
temp_module.set_temperature(celsius=4)
# Wait for temperature
temp_module.await_temperature(celsius=4)
# Deactivate
temp_module.deactivate()
# Check status
current_temp = temp_module.temperature # Current temperature
target_temp = temp_module.target # Target temperature
```
**Magnetic Module:**
```python
# Engage (raise magnets)
mag_module.engage(height_from_base=10) # mm from labware base
# Disengage (lower magnets)
mag_module.disengage()
# Check status
is_engaged = mag_module.status # 'engaged' or 'disengaged'
```
**Heater-Shaker Module:**
```python
# Set temperature
hs_module.set_target_temperature(celsius=37)
# Wait for temperature
hs_module.wait_for_temperature()
# Set shake speed
hs_module.set_and_wait_for_shake_speed(rpm=500)
# Close labware latch
hs_module.close_labware_latch()
# Open labware latch
hs_module.open_labware_latch()
# Deactivate heater
hs_module.deactivate_heater()
# Deactivate shaker
hs_module.deactivate_shaker()
```
**Thermocycler Module:**
```python
# Open lid
tc_module.open_lid()
# Close lid
tc_module.close_lid()
# Set lid temperature
tc_module.set_lid_temperature(105)
# Set block temperature
tc_module.set_block_temperature(
temperature=95,
hold_time_seconds=30,
    hold_time_minutes=0.5,  # hold times are summed (30 s + 0.5 min = 60 s total)
block_max_volume=50 # µL per well
)
# Execute profile (PCR cycling)
profile = [
{'temperature': 95, 'hold_time_seconds': 30},
{'temperature': 57, 'hold_time_seconds': 30},
{'temperature': 72, 'hold_time_seconds': 60}
]
tc_module.execute_profile(
steps=profile,
repetitions=30,
block_max_volume=50
)
# Deactivate
tc_module.deactivate_lid()
tc_module.deactivate_block()
```
**Absorbance Plate Reader:**
```python
# Initialize with the desired wavelength(s), then read
plate_reader.initialize(mode='multi', wavelengths=[450, 650])
result = plate_reader.read()
# result is a dict keyed by wavelength, mapping wells to absorbance values
```
### 6. Liquid Tracking and Labeling
**Define Liquids:**
```python
# Define liquid types
water = protocol.define_liquid(
name='Water',
description='Ultrapure water',
display_color='#0000FF' # Hex color code
)
sample = protocol.define_liquid(
name='Sample',
description='Cell lysate sample',
display_color='#FF0000'
)
```
**Load Liquids into Wells:**
```python
# Load liquid into specific wells
reservoir['A1'].load_liquid(liquid=water, volume=50000) # µL
plate['A1'].load_liquid(liquid=sample, volume=100)
# Mark wells as empty
plate['B1'].load_empty()
```
### 7. Protocol Control and Utilities
**Execution Control:**
```python
# Pause protocol
protocol.pause(msg='Replace tip box and resume')
# Delay
protocol.delay(seconds=60)
protocol.delay(minutes=5)
# Comment (appears in logs)
protocol.comment('Starting serial dilution')
# Home robot
protocol.home()
```
**Conditional Logic:**
```python
# Check if simulating
if protocol.is_simulating():
protocol.comment('Running in simulation mode')
else:
protocol.comment('Running on actual robot')
```
**Rail Lights (Flex only):**
```python
# Turn lights on
protocol.set_rail_lights(on=True)
# Turn lights off
protocol.set_rail_lights(on=False)
```
### 8. Multi-Channel and 8-Channel Pipetting
When using multi-channel pipettes:
```python
# Load 8-channel pipette
multi_pipette = protocol.load_instrument(
'p300_multi_gen2',
'left',
tip_racks=[tips]
)
# Access entire column with single well reference
multi_pipette.transfer(
volume=100,
source=source_plate['A1'], # Accesses entire column 1
dest=dest_plate['A1'] # Dispenses to entire column 1
)
# Use rows() for row-wise operations
for row in plate.rows():
multi_pipette.transfer(100, reservoir['A1'], row[0])
```
### 9. Common Protocol Patterns
**Serial Dilution:**
```python
def run(protocol: protocol_api.ProtocolContext):
# Load labware
tips = protocol.load_labware('opentrons_flex_96_tiprack_200ul', 'D1')
reservoir = protocol.load_labware('nest_12_reservoir_15ml', 'D2')
plate = protocol.load_labware('corning_96_wellplate_360ul_flat', 'D3')
# Load pipette
    p1000 = protocol.load_instrument('p1000_single_flex', 'left', tip_racks=[tips])
    # Add diluent to all wells except first
    p1000.transfer(100, reservoir['A1'], plate.rows()[0][1:])
    # Serial dilution across row
    p1000.transfer(
100,
plate.rows()[0][:11], # Source: wells 0-10
plate.rows()[0][1:], # Dest: wells 1-11
mix_after=(3, 50), # Mix 3x with 50µL after dispense
new_tip='always'
)
```
**Plate Replication:**
```python
def run(protocol: protocol_api.ProtocolContext):
# Load labware
tips = protocol.load_labware('opentrons_flex_96_tiprack_1000ul', 'C1')
source = protocol.load_labware('corning_96_wellplate_360ul_flat', 'D1')
dest = protocol.load_labware('corning_96_wellplate_360ul_flat', 'D2')
# Load pipette
p1000 = protocol.load_instrument('p1000_single_flex', 'left', tip_racks=[tips])
# Transfer from all wells in source to dest
p1000.transfer(
100,
source.wells(),
dest.wells(),
new_tip='always'
)
```
**PCR Setup:**
```python
def run(protocol: protocol_api.ProtocolContext):
# Load thermocycler
tc_mod = protocol.load_module('thermocyclerModuleV2')
tc_plate = tc_mod.load_labware('nest_96_wellplate_100ul_pcr_full_skirt')
# Load tips and reagents
tips = protocol.load_labware('opentrons_flex_96_tiprack_200ul', 'C1')
reagents = protocol.load_labware('opentrons_24_tuberack_nest_1.5ml_snapcap', 'D1')
# Load pipette
    p1000 = protocol.load_instrument('p1000_single_flex', 'left', tip_racks=[tips])
# Open thermocycler lid
tc_mod.open_lid()
# Distribute master mix
    p1000.distribute(
20,
reagents['A1'],
tc_plate.wells(),
new_tip='once'
)
# Add samples (example for first 8 wells)
for i, well in enumerate(tc_plate.wells()[:8]):
        p1000.transfer(5, reagents.wells()[i+1], well, new_tip='always')
# Run PCR
tc_mod.close_lid()
tc_mod.set_lid_temperature(105)
# PCR profile
tc_mod.set_block_temperature(95, hold_time_seconds=180)
profile = [
{'temperature': 95, 'hold_time_seconds': 15},
{'temperature': 60, 'hold_time_seconds': 30},
{'temperature': 72, 'hold_time_seconds': 30}
]
tc_mod.execute_profile(steps=profile, repetitions=35, block_max_volume=25)
tc_mod.set_block_temperature(72, hold_time_minutes=5)
tc_mod.set_block_temperature(4)
tc_mod.deactivate_lid()
tc_mod.open_lid()
```
## Best Practices
1. **Always specify API level**: Use the latest stable API version in metadata
2. **Use meaningful labels**: Label labware for easier identification in logs
3. **Check tip availability**: Ensure sufficient tips for protocol completion
4. **Add comments**: Use `protocol.comment()` for debugging and logging
5. **Simulate first**: Always test protocols in simulation before running on robot
6. **Handle errors gracefully**: Add pauses for manual intervention when needed
7. **Consider timing**: Use delays when protocols require incubation periods
8. **Track liquids**: Use liquid tracking for better setup validation
9. **Optimize tip usage**: Use `new_tip='once'` when appropriate to save tips
10. **Control flow rates**: Adjust flow rates for viscous or volatile liquids
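A brief sketch illustrating practices 2-4 and 9 above; deck slots, labware, and the starting tip position are placeholders.
```python
from opentrons import protocol_api

metadata = {'protocolName': 'Tip Usage Example', 'apiLevel': '2.19'}
requirements = {'robotType': 'Flex'}

def run(protocol: protocol_api.ProtocolContext):
    tips = protocol.load_labware('opentrons_flex_96_tiprack_200ul', 'C1')
    reservoir = protocol.load_labware('nest_12_reservoir_15ml', 'D2')
    plate = protocol.load_labware('corning_96_wellplate_360ul_flat', 'D3', label='Assay Plate')
    pipette = protocol.load_instrument('p1000_single_flex', 'left', tip_racks=[tips])

    # Resume tip tracking from a partially used rack
    pipette.starting_tip = tips['A3']

    # Reuse a single tip for a non-contaminating buffer addition
    protocol.comment('Adding buffer to column 1')
    pipette.transfer(100, reservoir['A1'], plate.columns()[0], new_tip='once')
```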
## Troubleshooting
**Common Issues:**
- **Out of tips**: Verify tip rack capacity matches protocol requirements
- **Labware collisions**: Check deck layout for spatial conflicts
- **Volume errors**: Ensure volumes don't exceed well or pipette capacities
- **Module not responding**: Verify module is properly connected and firmware is updated
- **Inaccurate volumes**: Calibrate pipettes and check for air bubbles
- **Protocol fails in simulation**: Check API version compatibility and labware definitions
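Most of these issues surface in simulation before they reach the robot. A minimal sketch, assuming a recent `opentrons` package is installed locally; the file name `my_protocol.py` is a placeholder.
```python
# From a terminal, simulate a protocol file and print the run log:
#   opentrons_simulate my_protocol.py

# Or build a simulated context in Python to step through commands interactively
# (get_protocol_api here returns an OT-2 context by default)
from opentrons import simulate

protocol = simulate.get_protocol_api('2.19')
tips = protocol.load_labware('opentrons_96_tiprack_300ul', '1')
reservoir = protocol.load_labware('nest_12_reservoir_15ml', '2')
plate = protocol.load_labware('corning_96_wellplate_360ul_flat', '3')
p300 = protocol.load_instrument('p300_single_gen2', 'left', tip_racks=[tips])

p300.pick_up_tip()
p300.aspirate(50, reservoir['A1'])
p300.dispense(50, plate['A1'])
p300.drop_tip()
print('Simulated without errors')
```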
## Resources
For detailed API documentation, see `references/api_reference.md` in this skill directory.
For example protocol templates, see `scripts/` directory.

View File

@@ -0,0 +1,366 @@
# Opentrons Python Protocol API v2 Reference
## Protocol Context Methods
### Labware Management
| Method | Description | Returns |
|--------|-------------|---------|
| `load_labware(name, location, label=None, namespace=None, version=None)` | Load labware onto deck | Labware object |
| `load_adapter(name, location, namespace=None, version=None)` | Load adapter onto deck | Labware object |
| `load_labware_from_definition(definition, location, label=None)` | Load custom labware from JSON | Labware object |
| `load_labware_on_adapter(name, adapter, label=None)` | Load labware on adapter | Labware object |
| `load_labware_by_name(name, location, label=None, namespace=None, version=None)` | Alternative load method | Labware object |
| `load_lid_stack(load_name, location, quantity=None)` | Load lid stack (Flex only) | Labware object |
### Instrument Management
| Method | Description | Returns |
|--------|-------------|---------|
| `load_instrument(instrument_name, mount, tip_racks=None, replace=False)` | Load pipette | InstrumentContext |
### Module Management
| Method | Description | Returns |
|--------|-------------|---------|
| `load_module(module_name, location=None, configuration=None)` | Load hardware module | ModuleContext |
### Liquid Management
| Method | Description | Returns |
|--------|-------------|---------|
| `define_liquid(name, description=None, display_color=None)` | Define liquid type | Liquid object |
### Execution Control
| Method | Description | Returns |
|--------|-------------|---------|
| `pause(msg=None)` | Pause protocol execution | None |
| `resume()` | Resume after pause | None |
| `delay(seconds=0, minutes=0, msg=None)` | Delay execution | None |
| `comment(msg)` | Add comment to protocol log | None |
| `home()` | Home all axes | None |
| `set_rail_lights(on)` | Control rail lights (Flex only) | None |
### Protocol Properties
| Property | Description | Type |
|----------|-------------|------|
| `deck` | Deck layout | Deck object |
| `fixed_trash` | Fixed trash location (OT-2) | TrashBin object |
| `loaded_labwares` | Dictionary of loaded labware | Dict |
| `loaded_instruments` | Dictionary of loaded instruments | Dict |
| `loaded_modules` | Dictionary of loaded modules | Dict |
| `is_simulating()` | Check if protocol is simulating | Bool |
| `bundled_data` | Access to bundled data files | Dict |
| `params` | Runtime parameters | ParametersContext |
## Instrument Context (Pipette) Methods
### Tip Management
| Method | Description | Returns |
|--------|-------------|---------|
| `pick_up_tip(location=None, presses=None, increment=None)` | Pick up tip | InstrumentContext |
| `drop_tip(location=None, home_after=True)` | Drop tip in trash | InstrumentContext |
| `return_tip(home_after=True)` | Return tip to rack | InstrumentContext |
| `reset_tipracks()` | Reset tip tracking | None |
### Liquid Handling - Basic
| Method | Description | Returns |
|--------|-------------|---------|
| `aspirate(volume=None, location=None, rate=1.0)` | Aspirate liquid | InstrumentContext |
| `dispense(volume=None, location=None, rate=1.0, push_out=None)` | Dispense liquid | InstrumentContext |
| `blow_out(location=None)` | Expel remaining liquid | InstrumentContext |
| `touch_tip(location=None, radius=1.0, v_offset=-1.0, speed=60.0)` | Remove droplets from tip | InstrumentContext |
| `mix(repetitions=1, volume=None, location=None, rate=1.0)` | Mix liquid | InstrumentContext |
| `air_gap(volume=None, height=None)` | Create air gap | InstrumentContext |
### Liquid Handling - Complex
| Method | Description | Returns |
|--------|-------------|---------|
| `transfer(volume, source, dest, **kwargs)` | Transfer liquid | InstrumentContext |
| `distribute(volume, source, dest, **kwargs)` | Distribute from one to many | InstrumentContext |
| `consolidate(volume, source, dest, **kwargs)` | Consolidate from many to one | InstrumentContext |
**transfer(), distribute(), consolidate() kwargs:**
- `new_tip`: 'always', 'once', or 'never'
- `trash`: True/False - trash tips after use
- `touch_tip`: True/False - touch tip after aspirate/dispense
- `blow_out`: True/False - blow out after dispense
- `mix_before`: (repetitions, volume) tuple
- `mix_after`: (repetitions, volume) tuple
- `disposal_volume`: Extra volume aspirated and discarded to improve accuracy
- `carryover`: True/False - enable multi-transfer for large volumes
- `gradient`: (start_concentration, end_concentration) for gradients
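As an illustration, a single `transfer()` call might combine several of these options; `pipette`, `source_plate`, and `dest_plate` are assumed to be loaded as shown in the skill instructions.
```python
pipette.transfer(
    50,
    source_plate['A1'],
    dest_plate.columns()[0],
    new_tip='always',
    mix_before=(2, 25),    # mix the source 2x with 25 µL before each aspirate
    mix_after=(3, 25),     # mix the destination 3x with 25 µL after each dispense
    touch_tip=True,        # knock droplets off the tip at source and destination
    blow_out=True,         # expel residual liquid after dispensing
    disposal_volume=5      # extra volume aspirated and discarded
)
```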
### Movement and Positioning
| Method | Description | Returns |
|--------|-------------|---------|
| `move_to(location, force_direct=False, minimum_z_height=None, speed=None)` | Move to location | InstrumentContext |
| `home()` | Home pipette axes | None |
### Pipette Properties
| Property | Description | Type |
|----------|-------------|------|
| `default_speed` | Default movement speed | Float |
| `min_volume` | Minimum pipette volume | Float |
| `max_volume` | Maximum pipette volume | Float |
| `current_volume` | Current volume in tip | Float |
| `has_tip` | Check if tip is attached | Bool |
| `name` | Pipette name | String |
| `model` | Pipette model | String |
| `mount` | Mount location | String |
| `channels` | Number of channels | Int |
| `tip_racks` | Associated tip racks | List |
| `trash_container` | Trash location | TrashBin object |
| `starting_tip` | Starting tip for protocol | Well object |
| `flow_rate` | Flow rate settings | FlowRates object |
### Flow Rate Properties
Access via `pipette.flow_rate`:
| Property | Description | Units |
|----------|-------------|-------|
| `aspirate` | Aspirate flow rate | µL/s |
| `dispense` | Dispense flow rate | µL/s |
| `blow_out` | Blow out flow rate | µL/s |
## Labware Methods
### Well Access
| Method | Description | Returns |
|--------|-------------|---------|
| `wells()` | Get all wells | List[Well] |
| `wells_by_name()` | Get wells dictionary | Dict[str, Well] |
| `rows()` | Get wells by row | List[List[Well]] |
| `columns()` | Get wells by column | List[List[Well]] |
| `rows_by_name()` | Get rows dictionary | Dict[str, List[Well]] |
| `columns_by_name()` | Get columns dictionary | Dict[str, List[Well]] |
### Labware Properties
| Property | Description | Type |
|----------|-------------|------|
| `name` | Labware name | String |
| `parent` | Parent location | Location object |
| `quirks` | Labware quirks list | List |
| `magdeck_engage_height` | Magnetic module height | Float |
| `uri` | Labware URI | String |
| `calibrated_offset` | Calibration offset | Point |
## Well Methods and Properties
### Liquid Operations
| Method | Description | Returns |
|--------|-------------|---------|
| `load_liquid(liquid, volume)` | Load liquid into well | None |
| `load_empty()` | Mark well as empty | None |
| `from_center_cartesian(x, y, z)` | Get location from center | Location |
### Location Methods
| Method | Description | Returns |
|--------|-------------|---------|
| `top(z=0)` | Get location at top of well | Location |
| `bottom(z=0)` | Get location at bottom of well | Location |
| `center()` | Get location at center of well | Location |
### Well Properties
| Property | Description | Type |
|----------|-------------|------|
| `diameter` | Well diameter (circular) | Float |
| `length` | Well length (rectangular) | Float |
| `width` | Well width (rectangular) | Float |
| `depth` | Well depth | Float |
| `max_volume` | Maximum volume | Float |
| `display_name` | Display name | String |
| `has_tip` | Check if tip present | Bool |
## Module Contexts
### Temperature Module
| Method | Description | Returns |
|--------|-------------|---------|
| `set_temperature(celsius)` | Set target temperature | None |
| `await_temperature(celsius)` | Wait for temperature | None |
| `deactivate()` | Turn off temperature control | None |
| `load_labware(name, label=None, namespace=None, version=None)` | Load labware on module | Labware |
**Properties:**
- `temperature`: Current temperature (°C)
- `target`: Target temperature (°C)
- `status`: 'idle', 'holding', 'cooling', or 'heating'
- `labware`: Loaded labware
### Magnetic Module
| Method | Description | Returns |
|--------|-------------|---------|
| `engage(height_from_base=None, offset=None, height=None)` | Engage magnets | None |
| `disengage()` | Disengage magnets | None |
| `load_labware(name, label=None, namespace=None, version=None)` | Load labware on module | Labware |
**Properties:**
- `status`: 'engaged' or 'disengaged'
- `labware`: Loaded labware
### Heater-Shaker Module
| Method | Description | Returns |
|--------|-------------|---------|
| `set_target_temperature(celsius)` | Set heater target | None |
| `wait_for_temperature()` | Wait for temperature | None |
| `set_and_wait_for_temperature(celsius)` | Set and wait | None |
| `deactivate_heater()` | Turn off heater | None |
| `set_and_wait_for_shake_speed(rpm)` | Set shake speed | None |
| `deactivate_shaker()` | Turn off shaker | None |
| `open_labware_latch()` | Open latch | None |
| `close_labware_latch()` | Close latch | None |
| `load_labware(name, label=None, namespace=None, version=None)` | Load labware on module | Labware |
**Properties:**
- `temperature`: Current temperature (°C)
- `target_temperature`: Target temperature (°C)
- `current_speed`: Current shake speed (rpm)
- `target_speed`: Target shake speed (rpm)
- `labware_latch_status`: 'idle_open', 'idle_closed', 'opening', 'closing'
- `status`: Module status
- `labware`: Loaded labware
### Thermocycler Module
| Method | Description | Returns |
|--------|-------------|---------|
| `open_lid()` | Open lid | None |
| `close_lid()` | Close lid | None |
| `set_lid_temperature(celsius)` | Set lid temperature | None |
| `deactivate_lid()` | Turn off lid heater | None |
| `set_block_temperature(temperature, hold_time_seconds=0, hold_time_minutes=0, ramp_rate=None, block_max_volume=None)` | Set block temperature | None |
| `deactivate_block()` | Turn off block | None |
| `execute_profile(steps, repetitions, block_max_volume=None)` | Run temperature profile | None |
| `load_labware(name, label=None, namespace=None, version=None)` | Load labware on module | Labware |
**Profile step format:**
```python
{'temperature': 95, 'hold_time_seconds': 30, 'hold_time_minutes': 0}
```
**Properties:**
- `block_temperature`: Current block temperature (°C)
- `block_target_temperature`: Target block temperature (°C)
- `lid_temperature`: Current lid temperature (°C)
- `lid_target_temperature`: Target lid temperature (°C)
- `lid_position`: 'open', 'closed', 'in_between'
- `ramp_rate`: Block temperature ramp rate (°C/s)
- `status`: Module status
- `labware`: Loaded labware
### Absorbance Plate Reader Module
| Method | Description | Returns |
|--------|-------------|---------|
| `initialize(mode, wavelengths)` | Initialize reader | None |
| `read(export_filename=None)` | Read plate | Dict |
| `close_lid()` | Close lid | None |
| `open_lid()` | Open lid | None |
| `load_labware(name, label=None, namespace=None, version=None)` | Load labware on module | Labware |
**Read modes:**
- `'single'`: Single wavelength
- `'multi'`: Multiple wavelengths
**Properties:**
- `is_lid_on`: Lid status
- `labware`: Loaded labware
## Common Labware API Names
### Plates
- `corning_96_wellplate_360ul_flat`
- `nest_96_wellplate_100ul_pcr_full_skirt`
- `nest_96_wellplate_200ul_flat`
- `biorad_96_wellplate_200ul_pcr`
- `appliedbiosystems_384_wellplate_40ul`
### Reservoirs
- `nest_12_reservoir_15ml`
- `nest_1_reservoir_195ml`
- `usascientific_12_reservoir_22ml`
### Tip Racks
**Flex:**
- `opentrons_flex_96_tiprack_50ul`
- `opentrons_flex_96_tiprack_200ul`
- `opentrons_flex_96_tiprack_1000ul`
**OT-2:**
- `opentrons_96_tiprack_20ul`
- `opentrons_96_tiprack_300ul`
- `opentrons_96_tiprack_1000ul`
### Tube Racks
- `opentrons_10_tuberack_falcon_4x50ml_6x15ml_conical`
- `opentrons_24_tuberack_nest_1.5ml_snapcap`
- `opentrons_24_tuberack_nest_1.5ml_screwcap`
- `opentrons_15_tuberack_falcon_15ml_conical`
### Adapters
- `opentrons_flex_96_tiprack_adapter`
- `opentrons_96_deep_well_adapter`
- `opentrons_aluminum_flat_bottom_plate`
## Error Handling
Common exceptions:
- `OutOfTipsError`: No tips available
- `LabwareNotLoadedError`: Labware not loaded on deck
- `InvalidContainerError`: Invalid labware specification
- `InstrumentNotLoadedError`: Pipette not loaded
- `InvalidVolumeError`: Volume out of range
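A hedged sketch of handling `OutOfTipsError` gracefully; the import path shown is an assumption and may differ between API versions.
```python
from opentrons import protocol_api
# Assumed import path for OutOfTipsError; adjust to your installed API version
from opentrons.protocol_api.labware import OutOfTipsError

metadata = {'protocolName': 'Graceful Tip Handling', 'apiLevel': '2.19'}

def run(protocol: protocol_api.ProtocolContext):
    tips = protocol.load_labware('opentrons_96_tiprack_300ul', '1')
    reservoir = protocol.load_labware('nest_12_reservoir_15ml', '2')
    plate = protocol.load_labware('corning_96_wellplate_360ul_flat', '3')
    p300 = protocol.load_instrument('p300_single_gen2', 'left', tip_racks=[tips])

    def pick_up_with_refill():
        # Pause for a manual refill instead of failing the run outright
        try:
            p300.pick_up_tip()
        except OutOfTipsError:
            protocol.pause('Tip rack empty: replace the rack in slot 1 and resume')
            p300.reset_tipracks()
            p300.pick_up_tip()

    for well in plate.wells():
        pick_up_with_refill()
        p300.transfer(100, reservoir['A1'], well, new_tip='never')
        p300.drop_tip()
```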
## Simulation and Debugging
Check simulation status:
```python
if protocol.is_simulating():
protocol.comment('Running in simulation')
```
Access bundled data files:
```python
data_file = protocol.bundled_data['data.csv']
with open(data_file) as f:
data = f.read()
```
## Version Compatibility
API Level compatibility:
| API Level | Features |
|-----------|----------|
| 2.19 | Latest features, Flex support |
| 2.18 | Absorbance plate reader |
| 2.17 | Liquid tracking improvements |
| 2.16 | Flex 8-channel partial tip pickup |
| 2.15 | Heater-Shaker Gen1 |
| 2.13 | Temperature Module Gen2 |
| 2.0-2.12 | Core OT-2 functionality |
Always use the latest stable API version for new protocols.

View File

@@ -0,0 +1,67 @@
#!/usr/bin/env python3
"""
Basic Opentrons Protocol Template
This template provides a minimal starting point for creating Opentrons protocols.
Replace the placeholder values and add your specific protocol logic.
"""
from opentrons import protocol_api
# Metadata
metadata = {
'protocolName': 'Basic Protocol Template',
'author': 'Your Name <email@example.com>',
'description': 'A basic protocol template for Opentrons',
'apiLevel': '2.19'
}
# Requirements
requirements = {
'robotType': 'Flex', # or 'OT-2'
'apiLevel': '2.19'
}
def run(protocol: protocol_api.ProtocolContext):
"""
Main protocol function.
Args:
protocol: The protocol context provided by Opentrons
"""
# Load tip racks
tips_200 = protocol.load_labware('opentrons_flex_96_tiprack_200ul', 'D1')
# Load labware
source_plate = protocol.load_labware(
'nest_96_wellplate_200ul_flat',
'D2',
label='Source Plate'
)
dest_plate = protocol.load_labware(
'nest_96_wellplate_200ul_flat',
'D3',
label='Destination Plate'
)
# Load pipette
pipette = protocol.load_instrument(
        'p1000_single_flex',
'left',
tip_racks=[tips_200]
)
# Protocol commands
protocol.comment('Starting protocol...')
# Example: Transfer from A1 to B1
pipette.transfer(
volume=50,
source=source_plate['A1'],
dest=dest_plate['B1'],
new_tip='always'
)
protocol.comment('Protocol complete!')

View File

@@ -0,0 +1,154 @@
#!/usr/bin/env python3
"""
PCR Setup Protocol Template
This template demonstrates how to set up PCR reactions using the Thermocycler module.
Includes master mix distribution, sample addition, and PCR cycling.
"""
from opentrons import protocol_api
metadata = {
'protocolName': 'PCR Setup with Thermocycler',
'author': 'Opentrons',
'description': 'Automated PCR setup and cycling protocol',
'apiLevel': '2.19'
}
requirements = {
'robotType': 'Flex',
'apiLevel': '2.19'
}
def run(protocol: protocol_api.ProtocolContext):
"""
Sets up PCR reactions and runs thermocycler.
Protocol performs:
1. Distributes master mix to PCR plate
2. Adds DNA samples
3. Runs PCR cycling program
"""
# Load thermocycler module
tc_mod = protocol.load_module('thermocyclerModuleV2')
tc_plate = tc_mod.load_labware('nest_96_wellplate_100ul_pcr_full_skirt')
# Load tips and reagents
    tips_50 = protocol.load_labware('opentrons_flex_96_tiprack_50ul', 'C1')
tips_200 = protocol.load_labware('opentrons_flex_96_tiprack_200ul', 'C2')
reagent_rack = protocol.load_labware(
'opentrons_24_tuberack_nest_1.5ml_snapcap',
'D1',
label='Reagents'
)
# Load pipettes
    p50 = protocol.load_instrument('p50_single_flex', 'left', tip_racks=[tips_50])
    p1000 = protocol.load_instrument('p1000_single_flex', 'right', tip_racks=[tips_200])
# Define liquids
master_mix = protocol.define_liquid(
name='PCR Master Mix',
description='2x PCR master mix',
display_color='#FFB6C1'
)
template_dna = protocol.define_liquid(
name='Template DNA',
description='DNA samples',
display_color='#90EE90'
)
# Load liquids
reagent_rack['A1'].load_liquid(liquid=master_mix, volume=1000)
for i in range(8): # 8 samples
reagent_rack.wells()[i + 1].load_liquid(liquid=template_dna, volume=50)
# PCR setup parameters
num_samples = 8
master_mix_volume = 20 # µL per reaction
template_volume = 5 # µL per reaction
total_reaction_volume = 25 # µL
protocol.comment('Starting PCR setup...')
# Open thermocycler lid
tc_mod.open_lid()
protocol.comment('Thermocycler lid opened')
# Step 1: Distribute master mix
protocol.comment(f'Distributing {master_mix_volume}µL master mix to {num_samples} wells...')
    p1000.distribute(
        master_mix_volume,
        reagent_rack['A1'],
        tc_plate.wells()[:num_samples],
        new_tip='once',
        disposal_volume=10  # Extra volume aspirated and discarded for accuracy
    )
# Step 2: Add template DNA
protocol.comment('Adding template DNA to each well...')
for i in range(num_samples):
        p50.transfer(
template_volume,
reagent_rack.wells()[i + 1], # Sample tubes
tc_plate.wells()[i], # PCR plate wells
mix_after=(3, 10), # Mix 3x with 10µL
new_tip='always'
)
protocol.comment('PCR reactions prepared')
# Close lid and start PCR
tc_mod.close_lid()
protocol.comment('Thermocycler lid closed')
# Set lid temperature
    tc_mod.set_lid_temperature(105)
protocol.comment('Lid heating to 105°C')
# Initial denaturation
protocol.comment('Initial denaturation...')
tc_mod.set_block_temperature(
temperature=95,
hold_time_seconds=180,
block_max_volume=total_reaction_volume
)
# PCR cycling profile
protocol.comment('Starting PCR cycling...')
profile = [
{'temperature': 95, 'hold_time_seconds': 15}, # Denaturation
{'temperature': 60, 'hold_time_seconds': 30}, # Annealing
{'temperature': 72, 'hold_time_seconds': 30} # Extension
]
num_cycles = 35
tc_mod.execute_profile(
steps=profile,
repetitions=num_cycles,
block_max_volume=total_reaction_volume
)
# Final extension
protocol.comment('Final extension...')
tc_mod.set_block_temperature(
temperature=72,
hold_time_minutes=5,
block_max_volume=total_reaction_volume
)
# Hold at 4°C
protocol.comment('Cooling to 4°C for storage...')
tc_mod.set_block_temperature(
temperature=4,
block_max_volume=total_reaction_volume
)
# Deactivate and open
tc_mod.deactivate_lid()
tc_mod.open_lid()
protocol.comment('PCR complete! Plate ready for removal.')
protocol.comment(f'Completed {num_cycles} cycles for {num_samples} samples')

View File

@@ -0,0 +1,96 @@
#!/usr/bin/env python3
"""
Serial Dilution Protocol Template
This template demonstrates how to perform a serial dilution across a plate row.
Useful for creating concentration gradients for assays.
"""
from opentrons import protocol_api
metadata = {
'protocolName': 'Serial Dilution Template',
'author': 'Opentrons',
'description': 'Serial dilution protocol for creating concentration gradients',
'apiLevel': '2.19'
}
requirements = {
'robotType': 'Flex',
'apiLevel': '2.19'
}
def run(protocol: protocol_api.ProtocolContext):
"""
Performs a serial dilution across plate rows.
Protocol performs:
1. Adds diluent to all wells except the first column
2. Transfers stock solution to first column
3. Performs serial dilutions across rows
"""
# Load labware
tips = protocol.load_labware('opentrons_flex_96_tiprack_200ul', 'D1')
reservoir = protocol.load_labware('nest_12_reservoir_15ml', 'D2', label='Reservoir')
plate = protocol.load_labware('corning_96_wellplate_360ul_flat', 'D3', label='Dilution Plate')
# Load pipette
    p1000 = protocol.load_instrument('p1000_single_flex', 'left', tip_racks=[tips])
# Define liquids (optional, for visualization)
diluent = protocol.define_liquid(
name='Diluent',
description='Buffer or growth media',
display_color='#B0E0E6'
)
stock = protocol.define_liquid(
name='Stock Solution',
description='Concentrated stock',
display_color='#FF6347'
)
# Load liquids into wells
reservoir['A1'].load_liquid(liquid=diluent, volume=15000)
reservoir['A2'].load_liquid(liquid=stock, volume=5000)
# Protocol parameters
dilution_factor = 2 # 1:2 dilution
transfer_volume = 100 # µL
num_dilutions = 11 # Number of dilution steps
protocol.comment('Starting serial dilution protocol')
# Step 1: Add diluent to all wells except first column
protocol.comment('Adding diluent to wells...')
for row in plate.rows()[:8]: # For each row (A-H)
        p1000.transfer(
transfer_volume,
reservoir['A1'], # Diluent source
row[1:], # All wells except first (columns 2-12)
new_tip='once'
)
# Step 2: Add stock solution to first column
protocol.comment('Adding stock solution to first column...')
    p1000.transfer(
transfer_volume * 2, # Double volume for first well
reservoir['A2'], # Stock source
[row[0] for row in plate.rows()[:8]], # First column (wells A1-H1)
new_tip='always'
)
# Step 3: Perform serial dilution
protocol.comment('Performing serial dilutions...')
for row in plate.rows()[:8]: # For each row
        p1000.transfer(
transfer_volume,
row[:num_dilutions], # Source wells (1-11)
row[1:num_dilutions + 1], # Destination wells (2-12)
mix_after=(3, 50), # Mix 3x with 50µL after each transfer
new_tip='always'
)
protocol.comment('Serial dilution complete!')
protocol.comment(f'Created {num_dilutions} dilutions with {dilution_factor}x dilution factor')