mirror of
https://github.com/K-Dense-AI/claude-scientific-skills.git
synced 2026-03-28 07:33:45 +08:00
# LLM Provider Configuration Guide

This document provides comprehensive configuration instructions for all LLM providers supported by Biomni.

## Overview

Biomni supports multiple LLM providers through a unified interface. Configure providers using:

- Environment variables
- `.env` files
- Runtime configuration via `default_config`
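
To make the `.env` option concrete, here is a minimal standard-library sketch of what loading such a file involves. It is an illustration only: `load_env_file` is a hypothetical helper, not part of Biomni, and whether Biomni reads `.env` files on its own is not assumed here. In practice a dedicated package such as `python-dotenv` covers more edge cases.

```python
import os
from pathlib import Path


def load_env_file(path=".env", override=False):
    """Copy KEY=VALUE pairs from a .env-style file into os.environ.

    Hypothetical helper for illustration; returns the pairs it parsed.
    """
    loaded = {}
    env_path = Path(path)
    if not env_path.is_file():
        return loaded

    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        if key.startswith("export "):
            key = key[len("export "):].strip()
        value = value.strip().strip("'\"")
        loaded[key] = value
        # Variables already set in the shell win unless override=True.
        if override or key not in os.environ:
            os.environ[key] = value
    return loaded
```

Calling `load_env_file()` before importing `default_config` would make keys such as `ANTHROPIC_API_KEY` visible through `os.environ`, while leaving any value already exported in the shell untouched.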

## Quick Reference Table

| Provider | Recommended For | API Key Required | Cost | Setup Complexity |
|----------|----------------|------------------|------|------------------|
| Anthropic Claude | Most biomedical tasks | Yes | Medium | Easy |
| OpenAI | General tasks | Yes | Medium-High | Easy |
| Azure OpenAI | Enterprise deployment | Yes | Varies | Medium |
| Google Gemini | Multimodal tasks | Yes | Medium | Easy |
| Groq | Fast inference | Yes | Low | Easy |
| Ollama | Local/offline use | No | Free | Medium |
| AWS Bedrock | AWS ecosystem | Yes | Varies | Hard |
| Biomni-R0 | Complex biological reasoning | No | Free | Hard |

## Anthropic Claude (Recommended)

### Overview

Claude models from Anthropic provide excellent biological reasoning capabilities and are the recommended choice for most Biomni tasks.

### Setup

1. **Obtain API Key:**
   - Sign up at https://console.anthropic.com/
   - Navigate to API Keys section
   - Generate a new key

2. **Configure Environment:**

   **Option A: Environment Variable**
   ```bash
   export ANTHROPIC_API_KEY="sk-ant-api03-..."
   ```

   **Option B: .env File**
   ```bash
   # .env file in project root
   ANTHROPIC_API_KEY=sk-ant-api03-...
   ```

3. **Set Model in Code:**
   ```python
   from biomni.config import default_config

   # Claude Sonnet 4 (Recommended)
   default_config.llm = "claude-sonnet-4-20250514"

   # Claude Opus 4 (Most capable)
   default_config.llm = "claude-opus-4-20250514"

   # Claude 3.5 Sonnet (Previous version)
   default_config.llm = "claude-3-5-sonnet-20241022"
   ```

### Available Models

| Model | Context Window | Strengths | Best For |
|-------|---------------|-----------|----------|
| `claude-sonnet-4-20250514` | 200K tokens | Balanced performance, cost-effective | Most biomedical tasks |
| `claude-opus-4-20250514` | 200K tokens | Highest capability, complex reasoning | Difficult multi-step analyses |
| `claude-3-5-sonnet-20241022` | 200K tokens | Fast, reliable | Standard workflows |
| `claude-3-opus-20240229` | 200K tokens | Strong reasoning | Legacy support |

### Advanced Configuration

```python
from biomni.config import default_config

# Use Claude with custom parameters
default_config.llm = "claude-sonnet-4-20250514"
default_config.timeout_seconds = 1800

# Optional: Custom API endpoint (for proxy/enterprise)
default_config.api_base = "https://your-proxy.com/v1"
```

### Cost Estimation

Approximate costs per 1M tokens (as of January 2025):
- Input: $3-15 depending on model
- Output: $15-75 depending on model

For a typical biomedical analysis (~50K tokens total): $0.50-$2.00
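
The arithmetic behind such an estimate can be sketched as follows. `estimate_cost_usd` is a hypothetical helper, and the 40K input / 10K output split is an assumption chosen for illustration, not a figure from this guide. Agent runs that resend context over several calls are billed for more input tokens than a single-call estimate suggests, so treat the result as a floor.

```python
def estimate_cost_usd(input_tokens, output_tokens,
                      input_price_per_m, output_price_per_m):
    """Return the approximate cost in USD for one request.

    Prices are quoted per 1M tokens, matching the list above.
    """
    input_cost = input_tokens * input_price_per_m
    output_cost = output_tokens * output_price_per_m
    return (input_cost + output_cost) / 1_000_000


# Assumed split of a ~50K-token analysis: 40K input, 10K output.
low = estimate_cost_usd(40_000, 10_000, input_price_per_m=3, output_price_per_m=15)
high = estimate_cost_usd(40_000, 10_000, input_price_per_m=15, output_price_per_m=75)
print(f"Single-call estimate: ${low:.2f} to ${high:.2f}")
```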

## OpenAI

### Overview

OpenAI's GPT models provide strong general capabilities suitable for diverse biomedical tasks.

### Setup

1. **Obtain API Key:**
   - Sign up at https://platform.openai.com/
   - Navigate to API Keys
   - Create new secret key

2. **Configure Environment:**

   ```bash
   export OPENAI_API_KEY="sk-proj-..."
   ```

   Or in `.env`:
   ```
   OPENAI_API_KEY=sk-proj-...
   ```

3. **Set Model:**
   ```python
   from biomni.config import default_config

   default_config.llm = "gpt-4o"  # Recommended
   # default_config.llm = "gpt-4"  # Previous flagship
   # default_config.llm = "gpt-4-turbo"  # Fast variant
   # default_config.llm = "gpt-3.5-turbo"  # Budget option
   ```

### Available Models

| Model | Context Window | Strengths | Cost |
|-------|---------------|-----------|------|
| `gpt-4o` | 128K tokens | Fast, multimodal | Medium |
| `gpt-4-turbo` | 128K tokens | Fast inference | Medium |
| `gpt-4` | 8K tokens | Reliable | High |
| `gpt-3.5-turbo` | 16K tokens | Fast, cheap | Low |

### Cost Optimization

```python
from biomni.config import default_config

# For exploratory analysis (budget-conscious)
default_config.llm = "gpt-3.5-turbo"

# For production analysis (quality-focused)
default_config.llm = "gpt-4o"
```

## Azure OpenAI

### Overview

Azure-hosted OpenAI models for enterprise users requiring data residency and compliance.

### Setup

1. **Azure Prerequisites:**
   - Active Azure subscription
   - Azure OpenAI resource created
   - Model deployment configured

2. **Environment Variables:**
   ```bash
   export AZURE_OPENAI_API_KEY="your-key"
   export AZURE_OPENAI_ENDPOINT="https://your-resource.openai.azure.com/"
   export AZURE_OPENAI_API_VERSION="2024-02-15-preview"
   ```

3. **Configuration:**
   ```python
   from biomni.config import default_config

   # Option 1: Use deployment name
   default_config.llm = "azure/your-deployment-name"

   # Option 2: Specify endpoint explicitly
   default_config.llm = "azure/gpt-4"
   default_config.api_base = "https://your-resource.openai.azure.com/"
   ```

### Deployment Setup

Azure OpenAI requires explicit model deployments:

1. Navigate to Azure OpenAI Studio
2. Create deployment for desired model (e.g., GPT-4)
3. Note the deployment name
4. Use deployment name in Biomni configuration

### Example Configuration

```python
from biomni.config import default_config
import os

# Set Azure credentials
os.environ['AZURE_OPENAI_API_KEY'] = 'your-key'
os.environ['AZURE_OPENAI_ENDPOINT'] = 'https://your-resource.openai.azure.com/'

# Configure Biomni to use Azure deployment
default_config.llm = "azure/gpt-4-biomni"  # Your deployment name
default_config.api_base = os.environ['AZURE_OPENAI_ENDPOINT']
```

## Google Gemini

### Overview

Google's Gemini models offer multimodal capabilities and competitive performance.

### Setup

1. **Obtain API Key:**
   - Visit https://makersuite.google.com/app/apikey
   - Create new API key

2. **Environment Configuration:**
   ```bash
   export GEMINI_API_KEY="your-key"
   ```

3. **Set Model:**
   ```python
   from biomni.config import default_config

   default_config.llm = "gemini/gemini-1.5-pro"
   # Or: default_config.llm = "gemini/gemini-pro"
   ```

### Available Models

| Model | Context Window | Strengths |
|-------|---------------|-----------|
| `gemini/gemini-1.5-pro` | 1M tokens | Very large context, multimodal |
| `gemini/gemini-pro` | 32K tokens | Balanced performance |

### Use Cases

Gemini excels at:
- Tasks requiring very large context windows
- Multimodal analysis (when incorporating images)
- Serving as a cost-effective alternative to GPT-4

```python
from biomni.config import default_config

# For tasks with large context requirements
default_config.llm = "gemini/gemini-1.5-pro"
default_config.timeout_seconds = 2400  # May need longer timeout
```

## Groq

### Overview

Groq provides ultra-fast inference with open-source models, ideal for rapid iteration.

### Setup

1. **Get API Key:**
   - Sign up at https://console.groq.com/
   - Generate API key

2. **Configure:**
   ```bash
   export GROQ_API_KEY="gsk_..."
   ```

3. **Set Model:**
   ```python
   from biomni.config import default_config

   default_config.llm = "groq/llama-3.1-70b-versatile"
   # Or: default_config.llm = "groq/mixtral-8x7b-32768"
   ```

### Available Models

| Model | Context Window | Speed | Quality |
|-------|---------------|-------|---------|
| `groq/llama-3.1-70b-versatile` | 32K tokens | Very Fast | Good |
| `groq/mixtral-8x7b-32768` | 32K tokens | Very Fast | Good |
| `groq/llama-3-70b-8192` | 8K tokens | Ultra Fast | Moderate |

### Best Practices

```python
from biomni.config import default_config

# For rapid prototyping and testing
default_config.llm = "groq/llama-3.1-70b-versatile"
default_config.timeout_seconds = 600  # Groq is fast

# Note: Quality may be lower than GPT-4/Claude for complex tasks
# Recommended for: QC, simple analyses, testing workflows
```

## Ollama (Local Deployment)

### Overview

Run LLMs entirely locally for offline use, data privacy, or cost savings.

### Setup

1. **Install Ollama:**
   ```bash
   # macOS/Linux
   curl -fsSL https://ollama.com/install.sh | sh

   # Or download from https://ollama.com/download
   ```

2. **Pull Models:**
   ```bash
   ollama pull llama3     # Meta Llama 3 (8B)
   ollama pull mixtral    # Mixtral (47B)
   ollama pull codellama  # Code-specialized
   ollama pull medllama   # Medical domain (if available)
   ```

3. **Start Ollama Server:**
   ```bash
   ollama serve  # Runs on http://localhost:11434
   ```

4. **Configure Biomni:**
   ```python
   from biomni.config import default_config

   default_config.llm = "ollama/llama3"
   default_config.api_base = "http://localhost:11434"
   ```

### Hardware Requirements

Minimum recommendations:
- **8B models:** 16GB RAM, CPU inference acceptable
- **70B models:** 64GB RAM, GPU highly recommended
- **Storage:** 5-50GB per model

### Model Selection

```python
from biomni.config import default_config

# Fast, local, good for testing
default_config.llm = "ollama/llama3"

# Better quality (requires more resources)
default_config.llm = "ollama/mixtral"

# Code generation tasks
default_config.llm = "ollama/codellama"
```

### Advantages & Limitations

**Advantages:**
- Complete data privacy
- No API costs
- Offline operation
- Unlimited usage

**Limitations:**
- Lower quality than GPT-4/Claude for complex tasks
- Requires significant hardware
- Slower inference (especially on CPU)
- May struggle with specialized biomedical knowledge

## AWS Bedrock

### Overview

AWS-managed LLM service offering multiple model providers.

### Setup

1. **AWS Prerequisites:**
   - AWS account with Bedrock access
   - Model access enabled in Bedrock console
   - AWS credentials configured

2. **Configure AWS Credentials:**
   ```bash
   # Option 1: AWS CLI
   aws configure

   # Option 2: Environment variables
   export AWS_ACCESS_KEY_ID="your-key"
   export AWS_SECRET_ACCESS_KEY="your-secret"
   export AWS_REGION="us-east-1"
   ```

3. **Enable Model Access:**
   - Navigate to AWS Bedrock console
   - Request access to desired models
   - Wait for approval (may take hours/days)

4. **Configure Biomni:**
   ```python
   from biomni.config import default_config

   default_config.llm = "bedrock/anthropic.claude-3-sonnet"
   # Or: default_config.llm = "bedrock/anthropic.claude-v2"
   ```

### Available Models

Bedrock provides access to:
- Anthropic Claude models
- Amazon Titan models
- AI21 Jurassic models
- Cohere Command models
- Meta Llama models

### IAM Permissions

Required IAM policy:
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream"
      ],
      "Resource": "arn:aws:bedrock:*::foundation-model/*"
    }
  ]
}
```

### Example Configuration

```python
from biomni.config import default_config
import boto3

# Verify AWS credentials (get_credentials() returns None if none are configured)
session = boto3.Session()
credentials = session.get_credentials()
if credentials is None:
    raise RuntimeError("No AWS credentials found; run `aws configure` or set the AWS_* variables")
print(f"AWS Access Key: {credentials.access_key[:8]}...")

# Configure Biomni
default_config.llm = "bedrock/anthropic.claude-3-sonnet"
default_config.timeout_seconds = 1800
```

## Biomni-R0 (Local Specialized Model)

### Overview

Biomni-R0 is a 32B parameter reasoning model specifically trained for biological problem-solving. It provides the highest quality for complex biomedical reasoning but requires local deployment.

### Setup

1. **Hardware Requirements:**
   - GPU with 48GB+ VRAM (e.g., A100, H100)
   - Or multi-GPU setup (2x 24GB)
   - 100GB+ storage for model weights

2. **Install Dependencies:**
   ```bash
   pip install "sglang[all]"
   pip install flashinfer  # Optional but recommended
   ```

3. **Deploy Model:**
   ```bash
   python -m sglang.launch_server \
     --model-path snap-stanford/biomni-r0 \
     --host 0.0.0.0 \
     --port 30000 \
     --trust-remote-code \
     --mem-fraction-static 0.8
   ```

   For multi-GPU:
   ```bash
   python -m sglang.launch_server \
     --model-path snap-stanford/biomni-r0 \
     --host 0.0.0.0 \
     --port 30000 \
     --trust-remote-code \
     --tp 2  # Tensor parallelism across 2 GPUs
   ```

4. **Configure Biomni:**
   ```python
   from biomni.config import default_config

   default_config.llm = "openai/biomni-r0"
   default_config.api_base = "http://localhost:30000/v1"
   default_config.timeout_seconds = 2400  # Longer for complex reasoning
   ```

### When to Use Biomni-R0

Biomni-R0 excels at:
- Multi-step biological reasoning
- Complex experimental design
- Hypothesis generation and evaluation
- Literature-informed analysis
- Tasks requiring deep biological knowledge

```python
from biomni.agent import A1
from biomni.config import default_config

# For complex biological reasoning tasks
default_config.llm = "openai/biomni-r0"
agent = A1(path='./data')

agent.go("""
Design a comprehensive CRISPR screening experiment to identify synthetic
lethal interactions with TP53 mutations in cancer cells, including:
1. Rationale and hypothesis
2. Guide RNA library design strategy
3. Experimental controls
4. Statistical analysis plan
5. Expected outcomes and validation approach
""")
```

### Performance Comparison

| Model | Speed | Biological Reasoning | Code Quality | Cost |
|-------|-------|---------------------|--------------|------|
| GPT-4 | Fast | Good | Excellent | Medium |
| Claude Sonnet 4 | Fast | Excellent | Excellent | Medium |
| Biomni-R0 | Moderate | Outstanding | Good | Free (local) |

## Multi-Provider Strategy

### Intelligent Model Selection

Use different models for different task types:

```python
from biomni.agent import A1
from biomni.config import default_config

# Strategy 1: Task-based selection
def get_agent_for_task(task_complexity):
    if task_complexity == "simple":
        default_config.llm = "gpt-3.5-turbo"
        default_config.timeout_seconds = 300
    elif task_complexity == "medium":
        default_config.llm = "claude-sonnet-4-20250514"
        default_config.timeout_seconds = 1200
    else:  # complex
        default_config.llm = "openai/biomni-r0"
        default_config.timeout_seconds = 2400

    return A1(path='./data')

# Strategy 2: Fallback on failure
def execute_with_fallback(task):
    models = [
        "claude-sonnet-4-20250514",
        "gpt-4o",
        "claude-opus-4-20250514"
    ]

    for model in models:
        try:
            default_config.llm = model
            agent = A1(path='./data')
            agent.go(task)
            return
        except Exception as e:
            print(f"Failed with {model}: {e}, trying next...")

    raise RuntimeError("All models failed")
```

### Cost Optimization Strategy

```python
from biomni.agent import A1
from biomni.config import default_config

# Each phase creates its agent after updating default_config,
# mirroring the strategies above.

# Phase 1: Rapid prototyping with cheap models
default_config.llm = "gpt-3.5-turbo"
agent = A1(path='./data')
agent.go("Quick exploratory analysis of dataset structure")

# Phase 2: Detailed analysis with high-quality models
default_config.llm = "claude-sonnet-4-20250514"
agent = A1(path='./data')
agent.go("Comprehensive differential expression analysis with pathway enrichment")

# Phase 3: Complex reasoning with specialized models
default_config.llm = "openai/biomni-r0"
agent = A1(path='./data')
agent.go("Generate biological hypotheses based on multi-omics integration")
```

## Troubleshooting

### Common Issues

**Issue: "API key not found"**
- Verify environment variable is set: `echo $ANTHROPIC_API_KEY`
- Check `.env` file exists and is in correct location
- Try setting key programmatically: `os.environ['ANTHROPIC_API_KEY'] = 'key'`
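
The same check can be scripted for any provider. The sketch below maps the model identifiers used in this guide to the environment variables shown in each Setup section. The helper names and the prefix table are illustrative assumptions, not a Biomni API, and the mapping covers only the identifiers that appear in this document.

```python
import os

# Prefix -> environment variables named in this guide's Setup sections.
# Local servers (Ollama, Biomni-R0 behind "openai/") need no key.
# Bedrock is left empty because credentials may come from `aws configure`.
PREFIX_ENV_VARS = {
    "azure/": ["AZURE_OPENAI_API_KEY", "AZURE_OPENAI_ENDPOINT"],
    "gemini/": ["GEMINI_API_KEY"],
    "groq/": ["GROQ_API_KEY"],
    "bedrock/": [],
    "ollama/": [],
    "openai/": [],
    "claude-": ["ANTHROPIC_API_KEY"],
    "gpt-": ["OPENAI_API_KEY"],
}


def required_env_vars(model):
    """Return the variables this guide associates with a model identifier."""
    for prefix, names in PREFIX_ENV_VARS.items():
        if model.startswith(prefix):
            return list(names)
    return []  # Unknown identifier: nothing to check.


def missing_env_vars(model, environ=None):
    """Return required variables that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in required_env_vars(model) if not environ.get(name)]


missing = missing_env_vars("claude-sonnet-4-20250514")
if missing:
    print("Missing environment variables:", ", ".join(missing))
```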

**Issue: "Rate limit exceeded"**
- Implement exponential backoff and retry
- Upgrade API tier if available
- Switch to alternative provider temporarily
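
Exponential backoff can be sketched with the standard library alone. `retry_with_backoff` is a hypothetical helper, not something Biomni is known to ship, and the broad `Exception` default is an assumption because the exact rate-limit exception type depends on the provider client in use.

```python
import random
import time


def retry_with_backoff(func, max_attempts=5, base_delay=1.0, max_delay=60.0,
                       retry_on=(Exception,), sleep=time.sleep):
    """Call func(), retrying with exponentially growing delays on failure.

    The delay doubles after each failed attempt (1s, 2s, 4s, ...), is capped
    at max_delay, and gets up to 10% random jitter so that parallel jobs do
    not retry in lockstep. The last failure is re-raised.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay * 0.1))
```

A call such as `retry_with_backoff(lambda: agent.go(task))` would wrap an existing agent invocation. Narrowing `retry_on` to the provider's rate-limit exception avoids retrying errors that will never succeed, such as an invalid API key.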

**Issue: "Model not found"**
- Verify model identifier is correct
- Check API key has access to requested model
- For Azure: ensure deployment exists with exact name

**Issue: "Timeout errors"**
- Increase `default_config.timeout_seconds`
- Break complex tasks into smaller steps
- Consider using faster model for initial phases

**Issue: "Connection refused (Ollama/Biomni-R0)"**
- Verify local server is running
- Check port is not blocked by firewall
- Confirm `api_base` URL is correct
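
For the local-server case, a quick TCP probe distinguishes "server not running" from a misconfigured model name. This is a minimal sketch using only the standard library. The helper names are hypothetical, and the probe only confirms that something is listening on the port, not that it is the expected model server.

```python
import socket
from urllib.parse import urlsplit


def host_and_port(api_base):
    """Extract (host, port) from a URL, defaulting to 80/443 by scheme."""
    parts = urlsplit(api_base)
    default_port = 443 if parts.scheme == "https" else 80
    return parts.hostname, parts.port or default_port


def is_server_reachable(api_base, timeout=1.0):
    """Return True if a TCP connection to the api_base host:port succeeds."""
    host, port = host_and_port(api_base)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Default endpoints used in the Ollama and Biomni-R0 sections above.
    for url in ("http://localhost:11434", "http://localhost:30000/v1"):
        status = "reachable" if is_server_reachable(url) else "not reachable"
        print(f"{url}: {status}")
```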

### Testing Configuration

```python
from biomni.utils import list_available_models, validate_environment

# Check environment setup
status = validate_environment()
print("Environment Status:", status)

# List available models based on configured keys
models = list_available_models()
print("Available Models:", models)

# Test specific model
try:
    from biomni.agent import A1
    agent = A1(path='./data', llm='claude-sonnet-4-20250514')
    agent.go("Print 'Configuration successful!'")
except Exception as e:
    print(f"Configuration test failed: {e}")
```

## Best Practices Summary

1. **For most users:** Start with Claude Sonnet 4 or GPT-4o
2. **For cost sensitivity:** Use GPT-3.5-turbo for exploration, Claude Sonnet 4 for production
3. **For privacy/offline:** Deploy Ollama locally
4. **For complex reasoning:** Use Biomni-R0 if hardware available
5. **For enterprise:** Consider Azure OpenAI or AWS Bedrock
6. **For speed:** Use Groq for rapid iteration

7. **Always:**
   - Set appropriate timeouts
   - Implement error handling and retries
   - Log model and configuration for reproducibility
   - Test configuration before production use
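
One way to log the model and configuration for reproducibility is to append a small JSON record per run. The sketch below is a suggestion under stated assumptions: the function names and the `biomni_runs.jsonl` file name are hypothetical, and the values are passed in explicitly rather than read from `default_config`, whose full attribute set is not documented here.

```python
import json
from datetime import datetime, timezone


def build_run_record(llm, timeout_seconds, api_base=None, extra=None):
    """Collect the settings that determine which model handled a run."""
    record = {
        "timestamp_utc": datetime.now(timezone.utc).isoformat(),
        "llm": llm,
        "timeout_seconds": timeout_seconds,
        "api_base": api_base,
    }
    if extra:
        record.update(extra)
    return record


def append_run_record(record, path="biomni_runs.jsonl"):
    """Append one JSON object per line so the log is easy to grep and parse."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, sort_keys=True) + "\n")
```

For example, `append_run_record(build_run_record("claude-sonnet-4-20250514", 1800))` before calling `agent.go(...)` records which model and timeout produced a given result. API keys should never be written to such a log.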