Search API Best Practices
Comprehensive guide to getting the best results from Parallel's Search API.
Core Concepts
The Search API returns ranked, LLM-optimized excerpts from web sources based on natural language objectives. Results are designed to serve directly as model input, enabling faster reasoning and higher-quality completions.
Key Advantages Over Traditional Search
- Context engineering for token efficiency: Results are ranked by reasoning utility, not engagement
- Single-hop resolution: Complex multi-topic queries resolved in one request
- Multi-hop efficiency: Deep research workflows complete in fewer tool calls
Crafting Effective Search Queries
Provide Both objective AND search_queries
The objective describes your broader goal; search_queries ensures specific keywords are prioritized. Using both together gives significantly better results.
Good:
```python
searcher.search(
    objective="I'm writing a literature review on Alzheimer's treatments. Find peer-reviewed research papers and clinical trial results from the past 2 years on amyloid-beta targeted therapies.",
    search_queries=[
        "amyloid beta clinical trials 2024-2025",
        "Alzheimer's monoclonal antibody treatment results",
        "lecanemab donanemab trial outcomes"
    ],
)
```
Poor:
```python
# Too vague - no context about intent
searcher.search(objective="Alzheimer's treatment")

# Missing objective - no context for ranking
searcher.search(search_queries=["Alzheimer's drugs"])
```
Objective Writing Tips
- State your broader task: "I'm writing a research paper on...", "I'm analyzing the market for...", "I'm preparing a presentation about..."
- Be specific about source preferences: "Prefer official government websites", "Focus on peer-reviewed journals", "From major news outlets"
- Include freshness requirements: "From the past 6 months", "Published in 2024-2025", "Most recent data available"
- Specify content type: "Technical documentation", "Clinical trial results", "Market analysis reports", "Product announcements"
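The four tips above can be combined mechanically when objectives are generated in code. A minimal sketch (the `build_objective` helper and its parameter names are illustrative, not part of the API):

```python
def build_objective(task: str, content_type: str = "", sources: str = "", freshness: str = "") -> str:
    """Assemble a search objective from the tips above: task, content type, sources, freshness."""
    parts = [task]
    if content_type:
        parts.append(f"Find {content_type}.")
    if sources:
        parts.append(f"{sources}.")
    if freshness:
        parts.append(f"{freshness}.")
    return " ".join(parts)

objective = build_objective(
    task="I'm writing a literature review on amyloid-beta targeted therapies.",
    content_type="clinical trial results",
    sources="Focus on peer-reviewed journals",
    freshness="From the past 2 years",
)
```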
Example Objectives by Use Case
Academic Research:
"I'm writing a literature review on CRISPR gene editing applications in cancer therapy.
Find peer-reviewed papers from Nature, Science, Cell, and other high-impact journals
published in 2023-2025. Prefer clinical trial results and systematic reviews."
Market Intelligence:
"I'm preparing Q1 2025 investor materials for a fintech startup.
Find recent announcements from the Federal Reserve and SEC about digital asset
regulations and banking partnerships with crypto firms. Past 3 months only."
Technical Documentation:
"I'm designing a machine learning course. Find technical documentation and API guides
that explain how transformer attention mechanisms work, preferably from official
framework documentation like PyTorch or Hugging Face."
Current Events:
"I'm tracking AI regulation developments. Find official policy announcements,
legislative actions, and regulatory guidance from the EU, US, and UK governments
from the past month."
Search Modes
Use the mode parameter to optimize for your workflow:
| Mode | Best For | Excerpt Style | Latency |
|---|---|---|---|
| `one-shot` (default) | Direct queries, single-request workflows | Comprehensive, longer | Lower |
| `agentic` | Multi-step reasoning loops, agent workflows | Concise, token-efficient | Slightly higher |
| `fast` | Real-time applications, UI auto-complete | Minimal, speed-optimized | ~1 second |
When to Use Each Mode
one-shot (default):
- Single research question that needs comprehensive answer
- Writing a section of a paper and need full context
- Background research before starting a document
- Any case where you'll make only one search call
agentic:
- Multi-step research workflows (search → analyze → search again)
- Agent loops where token efficiency matters
- Iterative refinement of research queries
- When integrating with other tools (search → extract → synthesize)
fast:
- Live autocomplete or suggestion systems
- Quick fact-checking during writing
- Real-time metadata lookups
- Any latency-sensitive application
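The decision rules above reduce to two questions: is latency critical, and is this a multi-step loop? A small dispatcher sketch (the helper name is illustrative; only the mode strings come from the API):

```python
def pick_mode(latency_sensitive: bool, multi_step: bool) -> str:
    """Map workflow traits to a Search API mode string, per the guidance above."""
    if latency_sensitive:
        return "fast"      # real-time UIs, autocomplete, quick lookups
    if multi_step:
        return "agentic"   # token-efficient excerpts for agent loops
    return "one-shot"      # comprehensive excerpts for a single call
```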
Source Policy
Control which domains are included or excluded from results:
```python
searcher.search(
    objective="Find clinical trial results for new cancer immunotherapy drugs",
    search_queries=["checkpoint inhibitor clinical trials 2025"],
    source_policy={
        "allow_domains": ["clinicaltrials.gov", "nejm.org", "thelancet.com", "nature.com"],
        "deny_domains": ["reddit.com", "quora.com"],
        "after_date": "2024-01-01"
    },
)
```
Source Policy Parameters
| Parameter | Type | Description |
|---|---|---|
| `allow_domains` | `list[str]` | Only include results from these domains |
| `deny_domains` | `list[str]` | Exclude results from these domains |
| `after_date` | `str` (YYYY-MM-DD) | Only include content published after this date |
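A lightweight client-side check can catch a malformed policy before the request is sent. This validator is a sketch, not part of the API (the server performs its own validation); it only enforces the three fields described above:

```python
from datetime import datetime

def validate_source_policy(policy: dict) -> dict:
    """Sanity-check the source_policy fields documented above."""
    allowed_keys = {"allow_domains", "deny_domains", "after_date"}
    unknown = set(policy) - allowed_keys
    if unknown:
        raise ValueError(f"Unknown source_policy keys: {unknown}")
    for key in ("allow_domains", "deny_domains"):
        if key in policy and not all(isinstance(d, str) for d in policy[key]):
            raise ValueError(f"{key} must be a list of domain strings")
    if "after_date" in policy:
        datetime.strptime(policy["after_date"], "%Y-%m-%d")  # raises ValueError on bad format
    return policy

policy = validate_source_policy({
    "allow_domains": ["clinicaltrials.gov"],
    "after_date": "2024-01-01",
})
```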
Domain Lists by Use Case
Academic Research:
```python
allow_domains = [
    "nature.com", "science.org", "cell.com", "thelancet.com",
    "nejm.org", "bmj.com", "pnas.org", "arxiv.org",
    "pubmed.ncbi.nlm.nih.gov", "scholar.google.com"
]
```
Technology/AI:
```python
allow_domains = [
    "arxiv.org", "openai.com", "anthropic.com", "deepmind.google",
    "huggingface.co", "pytorch.org", "tensorflow.org",
    "proceedings.neurips.cc", "proceedings.mlr.press"
]
```
Market Intelligence:
```python
deny_domains = [
    "reddit.com", "quora.com", "medium.com",
    "wikipedia.org"  # Good for facts, not for market data
]
```
Government/Policy:
```python
allow_domains = [
    "gov", "europa.eu", "who.int", "worldbank.org",
    "imf.org", "oecd.org", "un.org"
]
```
Controlling Result Volume
max_results Parameter
- Range: 1-20 (default: 10)
- More results = broader coverage but more tokens to process
- Fewer results = more focused but may miss relevant sources
Recommendations:
- Quick fact check: `max_results=3`
- Standard research: `max_results=10` (default)
- Comprehensive survey: `max_results=20`
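These recommendations can be kept as named presets so callers don't hard-code magic numbers (the preset names are illustrative):

```python
# Presets for the max_results recommendations above (API range: 1-20).
MAX_RESULTS_PRESETS = {
    "fact_check": 3,   # quick verification of a single claim
    "standard": 10,    # default research breadth
    "survey": 20,      # comprehensive coverage (API maximum)
}
```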
Excerpt Length Control
```python
searcher.search(
    objective="...",
    max_chars_per_result=10000,  # Default: 10000
)
```
- Short excerpts (1000-3000): Quick summaries, metadata extraction
- Medium excerpts (5000-10000): Standard research, balanced depth
- Long excerpts (10000-50000): Full article content, deep analysis
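Since results feed directly into a model, it helps to bound the worst-case token cost of a call before making it. A rough estimator, assuming the common ~4 characters-per-token heuristic (the helper and the heuristic are illustrative, not part of the API):

```python
def estimate_max_tokens(max_results: int = 10, max_chars_per_result: int = 10000,
                        chars_per_token: float = 4.0) -> int:
    """Upper bound on the tokens a search response can add to a prompt."""
    return int(max_results * max_chars_per_result / chars_per_token)
```

With the defaults, a single call can contribute up to ~25,000 tokens of excerpts, which is worth knowing before dropping results into a context window.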
Common Patterns
Pattern 1: Research Before Writing
```python
# Before writing each section, search for relevant information
result = searcher.search(
    objective="Find recent advances in transformer attention mechanisms for a NeurIPS paper introduction",
    search_queries=["attention mechanism innovations 2024", "efficient transformers"],
    max_results=10,
)

# Extract key findings for the section
for r in result["results"]:
    print(f"Source: {r['title']} ({r['url']})")
    # Use excerpts to inform writing
```
Pattern 2: Fact Verification
```python
# Quick verification of a specific claim
result = searcher.search(
    objective="Verify: Did GPT-4 achieve 86.4% on MMLU benchmark?",
    search_queries=["GPT-4 MMLU benchmark score"],
    max_results=5,
)
```
Pattern 3: Competitive Intelligence
```python
result = searcher.search(
    objective="Find recent product launches and funding announcements for AI coding assistants in 2025",
    search_queries=[
        "AI coding assistant funding 2025",
        "code generation tool launch",
        "AI developer tools new product"
    ],
    source_policy={"after_date": "2025-01-01"},
    max_results=15,
)
```
Pattern 4: Multi-Language Research
```python
# Search includes multilingual results automatically
result = searcher.search(
    objective="Find global perspectives on AI regulation, including EU, China, and US approaches",
    search_queries=[
        "EU AI Act implementation 2025",
        "China AI regulation policy",
        "US AI executive order updates"
    ],
)
```
Troubleshooting
Few or No Results
- Broaden your objective: Remove overly specific constraints
- Add more search queries: Different phrasings of the same concept
- Remove source policy: Domain restrictions may be too narrow
- Check date filters: `after_date` may be too recent
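These fixes can be applied automatically by retrying an empty search with progressively relaxed parameters. A sketch, assuming a `searcher` object as in the examples above (the fallback helper itself is not part of the API):

```python
def search_with_fallback(searcher, objective, search_queries, source_policy=None):
    """Retry an empty search with constraints relaxed one step at a time."""
    attempts = [
        # 1. As requested.
        dict(objective=objective, search_queries=search_queries, source_policy=source_policy),
        # 2. Drop domain/date restrictions.
        dict(objective=objective, search_queries=search_queries),
        # 3. Objective only.
        dict(objective=objective),
    ]
    result = {"results": []}
    for kwargs in attempts:
        result = searcher.search(**{k: v for k, v in kwargs.items() if v is not None})
        if result.get("results"):
            break
    return result
```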
Irrelevant Results
- Make objective more specific: Add context about your task
- Use source policy: Allow only authoritative domains
- Add negative context: "Not about [unrelated topic]"
- Refine search queries: Use more precise keywords
Too Many Tokens in Results
- Reduce `max_results`: from 10 to 5 or 3
- Reduce excerpt length: lower `max_chars_per_result`
- Use `agentic` mode: more concise excerpts
- Use `fast` mode: minimal excerpts
See Also
- API Reference - Complete API parameter reference
- Deep Research Guide - For comprehensive research tasks
- Extraction Patterns - For reading specific URLs
- Workflow Recipes - Common multi-step patterns