
# Search API Best Practices

Comprehensive guide to getting the best results from Parallel's Search API.

---

## Core Concepts

The Search API returns ranked, LLM-optimized excerpts from web sources based on natural language objectives. Results are designed to serve directly as model input, enabling faster reasoning and higher-quality completions.

### Key Advantages Over Traditional Search

- **Context engineering for token efficiency**: Results are ranked by reasoning utility, not engagement
- **Single-hop resolution**: Complex multi-topic queries resolved in one request
- **Multi-hop efficiency**: Deep research workflows complete in fewer tool calls

---

## Crafting Effective Search Queries

### Provide Both `objective` AND `search_queries`

The `objective` describes your broader goal and gives the ranker context about intent; `search_queries` ensures specific keywords are prioritized. Supplying both generally gives better results than supplying either one alone.

**Good:**
```python
searcher.search(
    objective="I'm writing a literature review on Alzheimer's treatments. Find peer-reviewed research papers and clinical trial results from the past 2 years on amyloid-beta targeted therapies.",
    search_queries=[
        "amyloid beta clinical trials 2024-2025",
        "Alzheimer's monoclonal antibody treatment results",
        "lecanemab donanemab trial outcomes"
    ],
)
```

**Poor:**
```python
# Too vague - no context about intent
searcher.search(objective="Alzheimer's treatment")

# Missing objective - no context for ranking
searcher.search(search_queries=["Alzheimer's drugs"])
```
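
As a rough illustration of the good and poor calls above, the following sketch checks a planned request before it is sent. The `check_request` helper and its `min_objective_words` threshold are hypothetical and not part of the Search API; the word-count cutoff is an arbitrary heuristic chosen for this example.

```python
def check_request(objective=None, search_queries=None, min_objective_words=8):
    """Return a list of warnings for a planned search call (empty if none).

    Hypothetical helper: the word-count threshold is an illustrative
    heuristic, not a rule enforced by the API.
    """
    warnings = []
    if not objective:
        warnings.append("missing objective: no context for ranking")
    elif len(objective.split()) < min_objective_words:
        warnings.append("objective is short: add task, source, and freshness context")
    if not search_queries:
        warnings.append("missing search_queries: no keywords to prioritize")
    return warnings


# The two "poor" calls above each trigger at least one warning.
print(check_request(objective="Alzheimer's treatment"))
print(check_request(search_queries=["Alzheimer's drugs"]))
```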

### Objective Writing Tips

1. **State your broader task**: "I'm writing a research paper on...", "I'm analyzing the market for...", "I'm preparing a presentation about..."
2. **Be specific about source preferences**: "Prefer official government websites", "Focus on peer-reviewed journals", "From major news outlets"
3. **Include freshness requirements**: "From the past 6 months", "Published in 2024-2025", "Most recent data available"
4. **Specify content type**: "Technical documentation", "Clinical trial results", "Market analysis reports", "Product announcements"
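
The four tips above can be combined mechanically. The sketch below assembles an objective string from those components; `build_objective` and its parameter names are hypothetical conveniences for this guide, and the API itself only receives the final string.

```python
def _sentence(text):
    """Normalize a fragment so it ends with exactly one period."""
    return text.strip().rstrip(".") + "."


def build_objective(task, content_type=None, sources=None, freshness=None):
    """Compose an objective from task, content type, source preference, and freshness."""
    parts = [_sentence(task)]
    if content_type:
        parts.append(_sentence(f"Find {content_type}"))
    if sources:
        parts.append(_sentence(sources))
    if freshness:
        parts.append(_sentence(freshness))
    return " ".join(parts)


objective = build_objective(
    task="I'm writing a research paper on battery recycling",
    content_type="peer-reviewed studies and technical reports",
    sources="Prefer official government websites and journals",
    freshness="Published in 2024-2025",
)
print(objective)
```

Keeping the components separate makes it easy to drop or tighten one constraint (for example, freshness) without rewriting the whole objective.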

### Example Objectives by Use Case

**Academic Research:**
```
"I'm writing a literature review on CRISPR gene editing applications in cancer therapy.
Find peer-reviewed papers from Nature, Science, Cell, and other high-impact journals
published in 2023-2025. Prefer clinical trial results and systematic reviews."
```

**Market Intelligence:**
```
"I'm preparing Q1 2025 investor materials for a fintech startup.
Find recent announcements from the Federal Reserve and SEC about digital asset
regulations and banking partnerships with crypto firms. Past 3 months only."
```

**Technical Documentation:**
```
"I'm designing a machine learning course. Find technical documentation and API guides
that explain how transformer attention mechanisms work, preferably from official
framework documentation like PyTorch or Hugging Face."
```

**Current Events:**
```
"I'm tracking AI regulation developments. Find official policy announcements,
legislative actions, and regulatory guidance from the EU, US, and UK governments
from the past month."
```

---

## Search Modes

Use the `mode` parameter to optimize for your workflow:

| Mode | Best For | Excerpt Style | Latency |
|------|----------|---------------|---------|
| `one-shot` (default) | Direct queries, single-request workflows | Comprehensive, longer | Lower |
| `agentic` | Multi-step reasoning loops, agent workflows | Concise, token-efficient | Slightly higher |
| `fast` | Real-time applications, UI auto-complete | Minimal, speed-optimized | ~1 second |

### When to Use Each Mode

**`one-shot`** (default):
- A single research question that needs a comprehensive answer
- Writing a section of a paper that needs full context
- Background research before starting a document
- Any case where you'll make only one search call

**`agentic`**:
- Multi-step research workflows (search → analyze → search again)
- Agent loops where token efficiency matters
- Iterative refinement of research queries
- When integrating with other tools (search → extract → synthesize)

**`fast`**:
- Live autocomplete or suggestion systems
- Quick fact-checking during writing
- Real-time metadata lookups
- Any latency-sensitive application
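
The guidance above reduces to a small decision rule. The following is a minimal sketch, assuming latency sensitivity takes priority over workflow shape; `choose_mode` and its flags are hypothetical names, and only the three mode strings come from this guide.

```python
def choose_mode(latency_sensitive=False, multi_step=False):
    """Pick a search mode from two workflow traits.

    Assumption for this sketch: a latency-sensitive caller prefers "fast"
    even inside a multi-step loop.
    """
    if latency_sensitive:
        return "fast"
    if multi_step:
        return "agentic"
    return "one-shot"


print(choose_mode())                        # single comprehensive search
print(choose_mode(multi_step=True))         # search, analyze, search again
print(choose_mode(latency_sensitive=True))  # autocomplete or live lookups
```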

---

## Source Policy

Control which domains are included or excluded from results:

```python
searcher.search(
    objective="Find clinical trial results for new cancer immunotherapy drugs",
    search_queries=["checkpoint inhibitor clinical trials 2025"],
    source_policy={
        "allow_domains": ["clinicaltrials.gov", "nejm.org", "thelancet.com", "nature.com"],
        "deny_domains": ["reddit.com", "quora.com"],
        "after_date": "2024-01-01"
    },
)
```

### Source Policy Parameters

| Parameter | Type | Description |
|-----------|------|-------------|
| `allow_domains` | list[str] | Only include results from these domains |
| `deny_domains` | list[str] | Exclude results from these domains |
| `after_date` | str (YYYY-MM-DD) | Only include content published after this date |
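
A small builder can catch malformed policies before a request is made. This sketch is not part of the API: `build_source_policy` is a hypothetical helper, and the overlap check between allow and deny lists is a local sanity check rather than documented API behavior. Only the three keys come from the table above.

```python
from datetime import datetime


def _clean(domains):
    """Lowercase, strip, and de-duplicate domains while preserving order."""
    return list(dict.fromkeys(d.strip().lower() for d in domains))


def build_source_policy(allow_domains=None, deny_domains=None, after_date=None):
    """Build a source_policy dict, omitting unset fields."""
    policy = {}
    allow = _clean(allow_domains or [])
    deny = _clean(deny_domains or [])
    overlap = set(allow) & set(deny)
    if overlap:
        raise ValueError(f"domains both allowed and denied: {sorted(overlap)}")
    if allow:
        policy["allow_domains"] = allow
    if deny:
        policy["deny_domains"] = deny
    if after_date:
        # Raises ValueError unless the string parses as YYYY-MM-DD.
        parsed = datetime.strptime(after_date, "%Y-%m-%d").date()
        policy["after_date"] = parsed.isoformat()
    return policy


policy = build_source_policy(
    allow_domains=["NEJM.org", "nejm.org", "nature.com"],
    after_date="2024-01-01",
)
print(policy)
```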

### Domain Lists by Use Case

**Academic Research:**
```python
allow_domains = [
    "nature.com", "science.org", "cell.com", "thelancet.com",
    "nejm.org", "bmj.com", "pnas.org", "arxiv.org",
    "pubmed.ncbi.nlm.nih.gov", "scholar.google.com"
]
```

**Technology/AI:**
```python
allow_domains = [
    "arxiv.org", "openai.com", "anthropic.com", "deepmind.google",
    "huggingface.co", "pytorch.org", "tensorflow.org",
    "proceedings.neurips.cc", "proceedings.mlr.press"
]
```

**Market Intelligence:**
```python
deny_domains = [
    "reddit.com", "quora.com", "medium.com",
    "wikipedia.org"  # Good for facts, not for market data
]
```

**Government/Policy:**
```python
allow_domains = [
    "gov", "europa.eu", "who.int", "worldbank.org",
    "imf.org", "oecd.org", "un.org"
]
```

---

## Controlling Result Volume

### `max_results` Parameter

- Range: 1-20 (default: 10)
- More results = broader coverage but more tokens to process
- Fewer results = more focused but may miss relevant sources

**Recommendations:**
- Quick fact check: `max_results=3`
- Standard research: `max_results=10` (default)
- Comprehensive survey: `max_results=20`

### Excerpt Length Control

```python
searcher.search(
    objective="...",
    max_chars_per_result=10000,  # Default: 10000
)
```

- **Short excerpts (1000-3000 characters)**: Quick summaries, metadata extraction
- **Medium excerpts (5000-10000 characters)**: Standard research, balanced depth
- **Long excerpts (10000-50000 characters)**: Full article content, deep analysis
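
To see how the two volume controls interact, it can help to compute a worst-case size before searching. The sketch below multiplies `max_results` by `max_chars_per_result`; the `estimate_result_budget` helper is hypothetical, and the 4-characters-per-token ratio is a rough rule of thumb assumed here, not a figure from the API.

```python
def estimate_result_budget(max_results=10, max_chars_per_result=10000, chars_per_token=4):
    """Upper bound on response size for a given pair of volume settings.

    chars_per_token is an assumed heuristic; real tokenizers vary.
    """
    if not 1 <= max_results <= 20:
        raise ValueError("max_results must be between 1 and 20")
    max_chars = max_results * max_chars_per_result
    return {"max_chars": max_chars, "approx_tokens": max_chars // chars_per_token}


# Defaults: 10 results x 10000 chars each.
print(estimate_result_budget())
# Quick fact check with short excerpts.
print(estimate_result_budget(max_results=3, max_chars_per_result=2000))
```

Actual responses may be smaller than this bound, since excerpts are not necessarily filled to the character limit.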

---

## Common Patterns

### Pattern 1: Research Before Writing

```python
# Before writing each section, search for relevant information
result = searcher.search(
    objective="Find recent advances in transformer attention mechanisms for a NeurIPS paper introduction",
    search_queries=["attention mechanism innovations 2024", "efficient transformers"],
    max_results=10,
)

# Extract key findings for the section
for r in result["results"]:
    print(f"Source: {r['title']} ({r['url']})")
    # Use excerpts to inform writing
```
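
Building on the loop above, results can be turned into a numbered source list for drafting. This is a sketch under assumptions: the `results`, `title`, and `url` keys appear in the example above, but the `excerpts` key (a list of strings) is assumed here, and `format_sources` is a hypothetical helper. The sample data is made up for illustration.

```python
def format_sources(result, max_excerpt_chars=200):
    """Render search results as a numbered list with truncated excerpts."""
    lines = []
    for i, r in enumerate(result.get("results", []), start=1):
        lines.append(f"[{i}] {r.get('title', 'Untitled')} ({r.get('url', '')})")
        # "excerpts" is an assumed field name; missing keys are tolerated.
        for excerpt in r.get("excerpts", []):
            lines.append(f"    {excerpt[:max_excerpt_chars]}")
    return "\n".join(lines)


# Illustrative stand-in for a real response.
sample = {
    "results": [
        {
            "title": "Example Paper",
            "url": "https://example.com/paper",
            "excerpts": ["First excerpt text.", "Second excerpt text."],
        }
    ]
}
print(format_sources(sample))
```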

### Pattern 2: Fact Verification

```python
# Quick verification of a specific claim
result = searcher.search(
    objective="Verify: Did GPT-4 achieve 86.4% on the MMLU benchmark?",
    search_queries=["GPT-4 MMLU benchmark score"],
    max_results=5,
)
```

### Pattern 3: Competitive Intelligence

```python
result = searcher.search(
    objective="Find recent product launches and funding announcements for AI coding assistants in 2025",
    search_queries=[
        "AI coding assistant funding 2025",
        "code generation tool launch",
        "AI developer tools new product"
    ],
    source_policy={"after_date": "2025-01-01"},
    max_results=15,
)
```

### Pattern 4: Multi-Language Research

```python
# Search includes multilingual results automatically
result = searcher.search(
    objective="Find global perspectives on AI regulation, including EU, China, and US approaches",
    search_queries=[
        "EU AI Act implementation 2025",
        "China AI regulation policy",
        "US AI executive order updates"
    ],
)
```

---

## Troubleshooting

### Few or No Results

- **Broaden your objective**: Remove overly specific constraints
- **Add more search queries**: Different phrasings of the same concept
- **Remove source policy**: Domain restrictions may be too narrow
- **Check date filters**: `after_date` may be too recent
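
The last two steps above can be automated as a progressive relaxation: retry without the date filter, then without any source policy. The following is a minimal sketch, assuming a search callable that accepts `objective`, `search_queries`, and an optional `source_policy` and returns a dict with a `results` list. `search_with_fallback` and `fake_search` are hypothetical names; `fake_search` is a stand-in so the example runs without network access.

```python
def search_with_fallback(search_fn, objective, search_queries, source_policy=None, min_results=1):
    """Retry a search with progressively looser source policies.

    Returns (label, result), where label names the attempt that was kept.
    """
    attempts = [("as requested", source_policy or None)]
    if source_policy:
        without_date = {k: v for k, v in source_policy.items() if k != "after_date"}
        if without_date and without_date != source_policy:
            attempts.append(("without after_date", without_date))
        attempts.append(("without source_policy", None))

    label, result = attempts[0][0], {"results": []}
    for label, policy in attempts:
        kwargs = {"objective": objective, "search_queries": search_queries}
        if policy:
            kwargs["source_policy"] = policy
        result = search_fn(**kwargs)
        if len(result.get("results", [])) >= min_results:
            break
    return label, result


def fake_search(objective, search_queries, source_policy=None):
    """Stand-in search: returns nothing whenever a date filter is present."""
    if source_policy and "after_date" in source_policy:
        return {"results": []}
    return {"results": [{"title": "Example", "url": "https://example.com"}]}


label, result = search_with_fallback(
    fake_search,
    objective="Find clinical trial results",
    search_queries=["checkpoint inhibitor trials"],
    source_policy={"allow_domains": ["example.com"], "after_date": "2030-01-01"},
)
print(label, len(result["results"]))
```

Returning the label alongside the result lets the caller note which constraints were dropped, which matters when the relaxed results are less authoritative or older than intended.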

### Irrelevant Results

- **Make objective more specific**: Add context about your task
- **Use source policy**: Allow only authoritative domains
- **Add negative context**: "Not about [unrelated topic]"
- **Refine search queries**: Use more precise keywords

### Too Many Tokens in Results

- **Reduce `max_results`**: From 10 to 5 or 3
- **Reduce excerpt length**: Lower `max_chars_per_result`
- **Use `agentic` mode**: More concise excerpts
- **Use `fast` mode**: Minimal excerpts

---

## See Also

- [API Reference](api_reference.md) - Complete API parameter reference
- [Deep Research Guide](deep_research_guide.md) - For comprehensive research tasks
- [Extraction Patterns](extraction_patterns.md) - For reading specific URLs
- [Workflow Recipes](workflow_recipes.md) - Common multi-step patterns