feat: vector agent references

Pedro Rodrigues
2026-01-27 23:28:08 +00:00
parent 397502ea42
commit f4d228d448
10 changed files with 764 additions and 16 deletions


@@ -1,10 +1,10 @@
---
name: supabase
description: Guides and best practices for working with Supabase. Covers getting started, Auth, Database, Vectors/AI, Storage, Edge Functions, Realtime, supabase-js SDK, CLI, and MCP integration. Use for any Supabase-related questions including vector search, embeddings, RAG, and semantic search.
license: MIT
metadata:
author: supabase
version: "1.0.0"
organization: Supabase
date: January 2026
abstract: Comprehensive Supabase development guide for building applications with Supabase services. Contains guides covering Auth, Database, Storage, Edge Functions, Realtime, client libraries, CLI, and tooling. Each reference includes setup instructions, code examples, common mistakes, and integration patterns.
@@ -31,25 +31,65 @@ curl -H "Accept: text/markdown" https://supabase.com/docs/<path>
Reference the appropriate resource file based on the user's needs:
### Core Guides
| Area | Resource | When to Use |
| ---------------- | -------------------------------- | -------------------------------------------------------- |
| Getting Started | `references/getting-started.md` | Setting up a project, connection strings, dependencies |
| Referencing Docs | `references/referencing-docs.md` | Looking up official documentation, verifying information |
### Authentication & Security
| Area | Resource | When to Use |
| ------------------ | -------------------- | ------------------------------------------ |
| Auth Overview | `references/auth.md` | Authentication, social login, sessions |
| Row Level Security | `references/rls.md` | Database security policies, access control |
### Database
| Area | Resource | When to Use |
| ------------------ | ------------------------------- | ---------------------------------------------- |
| Database | `references/database.md` | Postgres queries, migrations, modeling |
| RLS Security | `references/db/rls-*.md` | Row Level Security policies, common mistakes |
| Connection Pooling | `references/db/conn-pooling.md` | Transaction vs Session mode, port 6543 vs 5432 |
| Schema Design | `references/db/schema-*.md` | auth.users FKs, timestamps, JSONB, extensions |
| Migrations | `references/db/migrations-*.md` | CLI workflows, idempotent patterns, db diff |
| Performance | `references/db/perf-*.md` | Indexes (BRIN, GIN), query optimization |
| Security | `references/db/security-*.md` | Service role key, security_definer functions |
### Vectors & AI
| Area | Resource | When to Use |
| ------------------ | --------------------------------- | ----------------------------------------------- |
| Vector Setup | `references/vectors/setup-*.md` | pgvector extension, vector columns, dimensions |
| Vector Indexing | `references/vectors/index-*.md` | HNSW, IVFFlat, index parameters, concurrent |
| Vector Search | `references/vectors/search-*.md` | Semantic search, hybrid search, match_documents |
| Embeddings | `references/vectors/embed-*.md` | gte-small, OpenAI, triggers, Edge Functions |
| RAG | `references/vectors/rag-*.md` | Document ingestion, chunking, query pipelines |
| Vector Performance | `references/vectors/perf-*.md` | Pre-warming, compute sizing, batch operations |
### Storage & Media
| Area | Resource | When to Use |
| ------- | ----------------------- | ---------------------------- |
| Storage | `references/storage.md` | File uploads, buckets, media |
### Edge Functions
| Area | Resource | When to Use |
| -------------- | ------------------------------ | -------------------------------------------- |
| Edge Functions | `references/edge-functions.md` | Serverless functions, Deno runtime, webhooks |
### Realtime
| Area | Resource | When to Use |
| -------- | ------------------------ | -------------------------------------------- |
| Realtime | `references/realtime.md` | Real-time subscriptions, presence, broadcast |
**CLI Usage:** Always use `npx supabase` instead of `supabase` for version consistency across team members.
### Client Libraries & CLI
| Area | Resource | When to Use |
| ------------ | --------------------------- | ---------------------------------------- |
| supabase-js | `references/supabase-js.md` | JavaScript/TypeScript SDK, client config |
| Supabase CLI | `references/cli.md` | Local development, migrations, CI/CD |
| MCP Server | `references/mcp.md` | AI agent integration, MCP tooling |


@@ -0,0 +1,36 @@
# Section Definitions
Reference files are grouped by prefix. Claude loads specific files based on user queries.

---
## 1. Setup (setup)
**Impact:** HIGH
**Description:** pgvector extension setup, vector column types (vector, halfvec, bit), dimension configuration, and Supabase-specific schema patterns.
## 2. Indexing (index)
**Impact:** CRITICAL
**Description:** HNSW and IVFFlat index creation, parameter tuning (m, ef_construction, ef_search, lists, probes), operator classes, and concurrent index builds.
## 3. Search (search)
**Impact:** CRITICAL
**Description:** Semantic search with match_documents functions, hybrid search combining vectors with full-text, RRF scoring, metadata filtering, and RLS integration.
## 4. Embeddings (embed)
**Impact:** HIGH
**Description:** Embedding generation using gte-small built-in model, OpenAI integration, automatic embeddings with triggers, and Edge Function patterns.
## 5. RAG (rag)
**Impact:** HIGH
**Description:** Retrieval-Augmented Generation patterns including document ingestion, chunking strategies, query pipelines, and Edge Function architectures.
## 6. Performance (perf)
**Impact:** CRITICAL
**Description:** Vector search optimization, index pre-warming, compute sizing for vector workloads, batch operations, and query monitoring.


@@ -0,0 +1,85 @@
---
title: Generate Embeddings for Vector Search
impact: HIGH
impactDescription: Enables automatic embedding generation without external API calls
tags: embeddings, gte-small, openai, triggers, edge-functions
---
## Generate Embeddings for Vector Search
Generate embeddings using Supabase's built-in model, external APIs, or automatic triggers.
## 1. Blocking Inserts with Synchronous Embedding
Synchronous embedding generation blocks inserts and causes timeouts.
**Incorrect:**
```sql
-- Insert waits for embedding API - slow and timeout-prone
create trigger embed_sync before insert on documents
for each row execute function sync_embedding();
```
**Correct:**
```sql
-- Async with pg_net - insert completes immediately
create trigger embed_async after insert on documents
for each row execute function queue_embedding_job();
```
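The `queue_embedding_job()` function referenced above is not shown in this file. A minimal sketch using `pg_net` (assumes the `pg_net` extension is enabled and a hypothetical `embed` Edge Function performs the actual embedding; names are illustrative):

```sql
-- Hypothetical: enqueue an async HTTP call instead of embedding inline
create or replace function queue_embedding_job()
returns trigger
language plpgsql
as $$
begin
  perform net.http_post(
    url  := 'https://<project-ref>.supabase.co/functions/v1/embed',
    body := jsonb_build_object('table', 'documents', 'id', new.id)
  );
  return new;
end;
$$;
```

The insert commits immediately; the Edge Function writes the embedding back in a separate transaction.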
## 2. Not Handling Embedding Failures
Embedding APIs can fail; unhandled errors cause data loss.
**Incorrect:**
```typescript
// No error handling
const embedding = await session.run(content)
await supabase.from('docs').update({ embedding }).eq('id', id)
```
**Correct:**
```typescript
// Handle failures gracefully
try {
const embedding = await session.run(content, { mean_pool: true, normalize: true })
if (!embedding) throw new Error('Empty embedding')
await supabase.from('docs').update({ embedding }).eq('id', id)
} catch (error) {
console.error('Embedding failed:', error)
// Queue for retry or mark as failed
}
```
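The `// Queue for retry` comment above still needs an implementation; a minimal sketch of a hypothetical retry helper with exponential backoff (names and delays are illustrative, not part of supabase-js):

```typescript
// Hypothetical helper: retry an async operation with exponential backoff.
async function withRetry<T>(
  fn: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 500,
): Promise<T> {
  let lastError: unknown
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn()
    } catch (error) {
      lastError = error
      // Back off: 500ms, 1s, 2s, ... before the next attempt
      await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** i))
    }
  }
  throw lastError
}
```

Usage: `await withRetry(() => session.run(content, { mean_pool: true, normalize: true }))` wraps the embedding call so transient API failures do not drop the row.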
## Built-in gte-small Model
```typescript
// Edge Function - no external API needed
const session = new Supabase.ai.Session('gte-small')
const embedding = await session.run(input, {
mean_pool: true,
normalize: true,
})
// Returns 384-dim vector, English only, max 512 tokens
```
## OpenAI Embeddings
```typescript
const response = await openai.embeddings.create({
model: 'text-embedding-3-small',
input: text,
})
const embedding = response.data[0].embedding // 1536-dim
```
## Related
- [setup-pgvector.md](setup-pgvector.md) - Vector column setup
- [rag-patterns.md](rag-patterns.md) - Complete RAG architecture
- [Docs](https://supabase.com/docs/guides/ai/quickstarts/generate-text-embeddings) - Embeddings guide


@@ -0,0 +1,75 @@
---
title: Configure HNSW Indexes for Fast Vector Search
impact: CRITICAL
impactDescription: 10-100x faster queries with proper HNSW configuration
tags: hnsw, index, m, ef_construction, ef_search, cosine, inner-product
---
## Configure HNSW Indexes for Fast Vector Search
HNSW (Hierarchical Navigable Small World) is the recommended index type for vector search in Supabase.
## 1. Mismatched Operator Class
Index operator class must match the query distance operator.
**Incorrect:**
```sql
-- Index uses cosine, query uses inner product
create index on docs using hnsw (embedding vector_cosine_ops);
select * from docs order by embedding <#> query_embedding; -- Won't use index
```
**Correct:**
```sql
-- Match operator class to distance function
create index on docs using hnsw (embedding vector_ip_ops);
select * from docs order by embedding <#> query_embedding; -- Uses index
```
## 2. Non-Concurrent Index Build
Building indexes without CONCURRENTLY locks the table.
**Incorrect:**
```sql
-- Locks table during build (bad for production)
create index on documents using hnsw (embedding vector_cosine_ops);
```
**Correct:**
```sql
-- No table lock
create index concurrently on documents using hnsw (embedding vector_cosine_ops);
```
## Distance Operators
| Operator | Name | Index Class | Best For |
|----------|------|-------------|----------|
| `<=>` | Cosine | `vector_cosine_ops` | Safe default |
| `<#>` | Negative Inner Product | `vector_ip_ops` | Normalized vectors (fastest) |
| `<->` | Euclidean | `vector_l2_ops` | Absolute distances |
**Note:** `<#>` returns negative inner product, so smaller values = more similar. Order ASC for most similar first.
## HNSW Parameters
```sql
-- Custom parameters for higher recall
create index on documents using hnsw (embedding vector_cosine_ops)
with (m = 24, ef_construction = 100);
-- Query-time tuning
set hnsw.ef_search = 100; -- Default is 40
```
## Related
- [index-ivfflat.md](index-ivfflat.md) - Alternative index for specific cases
- [perf-tuning.md](perf-tuning.md) - Pre-warming and compute sizing
- [Docs](https://supabase.com/docs/guides/ai/vector-indexes/hnsw-indexes) - HNSW guide


@@ -0,0 +1,70 @@
---
title: Use IVFFlat for Large-Scale Vector Collections
impact: MEDIUM-HIGH
impactDescription: Faster index creation for 1M+ vectors with acceptable recall
tags: ivfflat, index, lists, probes, large-scale
---
## Use IVFFlat for Large-Scale Vector Collections
IVFFlat divides vectors into clusters for faster search. Use HNSW unless you have specific requirements.
## 1. Creating IVFFlat on Empty Table
IVFFlat needs data to create meaningful clusters.
**Incorrect:**
```sql
-- No data to cluster - poor index quality
create table docs (embedding vector(1536));
create index on docs using ivfflat (embedding vector_cosine_ops) with (lists = 100);
```
**Correct:**
```sql
-- Add substantial data first, then create index
create table docs (embedding vector(1536));
insert into docs (embedding) select ...; -- Add data
create index on docs using ivfflat (embedding vector_cosine_ops) with (lists = 100);
```
## 2. Setting Probes Equal to Lists
When probes equals lists, the index is not used (sequential scan).
**Incorrect:**
```sql
-- If lists = 100, this defeats the purpose of the index
set ivfflat.probes = 100;
```
**Correct:**
```sql
-- probes should be much smaller than lists
set ivfflat.probes = 10; -- ~10% of lists
```
## IVFFlat vs HNSW
| Aspect | HNSW | IVFFlat |
|--------|------|---------|
| Build speed | Slower | Faster |
| Query speed | Faster | Slower |
| Empty table | Works | Needs data |
| Updates | Handles well | May need rebuild |
## Lists Recommendations
| Rows | Lists |
|------|-------|
| < 1M | sqrt(rows) up to 1000 |
| ≥ 1M | rows/1000 up to 4000 |
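The table above is a rule of thumb, not an official formula; expressed as a small helper:

```typescript
// Rule-of-thumb lists setting for an IVFFlat index (sketch):
// sqrt(rows) capped at 1000 below 1M rows,
// rows/1000 capped at 4000 at or above 1M rows.
function recommendedLists(rows: number): number {
  return rows < 1_000_000
    ? Math.min(Math.round(Math.sqrt(rows)), 1000)
    : Math.min(Math.round(rows / 1000), 4000)
}
```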
## Related
- [index-hnsw.md](index-hnsw.md) - Recommended index type
- [Docs](https://supabase.com/docs/guides/ai/vector-indexes/ivf-indexes) - IVFFlat guide


@@ -0,0 +1,91 @@
---
title: Optimize Vector Search Performance
impact: CRITICAL
impactDescription: 2-10x latency reduction with proper tuning
tags: performance, pre-warming, compute, batch, monitoring
---
## Optimize Vector Search Performance
Vector search is RAM-bound. Proper tuning and compute sizing are critical.
## 1. Undersized Compute for Vector Workload
Vector indexes must fit in RAM for optimal performance.
**Incorrect:**
```sql
-- Free tier (1GB RAM) with 100K 1536-dim vectors
-- Symptoms: high disk reads, slow queries
select count(*) from documents; -- Returns 100000
```
**Correct:**
```sql
-- Check buffer cache hit ratio
select round(100.0 * heap_blks_hit / nullif(heap_blks_hit + heap_blks_read, 0), 2) as hit_ratio
from pg_statio_user_tables where relname = 'documents';
-- If < 95%, upgrade compute or reduce data
```
## 2. Building Index During Peak Traffic
Non-concurrent index builds lock the table.
**Incorrect:**
```sql
-- Locks table, impacts all queries
create index on documents using hnsw (embedding vector_cosine_ops);
```
**Correct:**
```sql
-- No lock, runs in background
create index concurrently on documents using hnsw (embedding vector_cosine_ops);
```
## Compute Sizing
| Plan | RAM | Vectors (1536d) |
|------|-----|-----------------|
| Free | 1GB | ~20K |
| Small | 2GB | ~50K |
| Medium | 4GB | ~100K |
| Large | 8GB | ~250K |
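The sizing above can be sanity-checked with a back-of-envelope estimate (heap storage only; the HNSW index, Postgres overhead, and working memory add substantially more, so treat the result as a floor):

```typescript
// Rough heap size of a vector column: 4 bytes per float4 dimension
// plus ~8 bytes of pgvector header per row. Index size not included.
function approxVectorMB(rows: number, dims: number): number {
  const bytesPerRow = dims * 4 + 8
  return (rows * bytesPerRow) / (1024 * 1024)
}
```

For example, `approxVectorMB(100_000, 1536)` is roughly 587 MB of raw vector data before any index, which is why ~100K such vectors want a 4GB instance.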
## Index Pre-Warming
```sql
-- Load index into memory before production traffic
select pg_prewarm('documents_embedding_idx');
-- Run 10K-50K warm-up queries before benchmarking
```
## Index Build Settings
```sql
set maintenance_work_mem = '4GB';
set max_parallel_maintenance_workers = 4;
set statement_timeout = '0';
```
## Query Monitoring
```sql
-- Find slow vector queries
select substring(query, 1, 80), calls, round(mean_exec_time::numeric, 2) as avg_ms
from pg_stat_statements
where query like '%<=>%' or query like '%<#>%'
order by total_exec_time desc limit 10;
```
## Related
- [index-hnsw.md](index-hnsw.md) - HNSW parameters
- [Docs](https://supabase.com/docs/guides/ai/going-to-prod) - Production guide
- [Docs](https://supabase.com/docs/guides/ai/choosing-compute-addon) - Compute sizing


@@ -0,0 +1,100 @@
---
title: Build RAG Applications with Supabase
impact: HIGH
impactDescription: Complete architecture for retrieval-augmented generation
tags: rag, chunking, ingestion, context-window, retrieval
---
## Build RAG Applications with Supabase
RAG (Retrieval-Augmented Generation) retrieves relevant context from vectors before sending to an LLM.
## 1. Chunks Too Large for Context Window
Large chunks consume context budget and may exceed LLM limits.
**Incorrect:**
```typescript
// 2000 token chunks × 5 results = 10K tokens - too large
const chunks = chunkText(text, 2000)
```
**Correct:**
```typescript
// Size chunks based on context budget
// GPT-4: ~128K context, reserve 4K for response
// 5 chunks × 500 tokens = 2.5K tokens for context
const chunks = chunkText(text, 500, 50) // ~500 tokens per chunk, 50-token overlap
```
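The `chunkText` helper used above is hypothetical; a minimal sliding-window sketch is below. It splits on characters as a rough proxy for tokens (~4 characters per token for English text); a production version would use a real tokenizer.

```typescript
// Sliding-window chunker: fixed-size chunks with overlap between
// neighbors so sentences cut at a boundary appear in both chunks.
function chunkText(text: string, size: number, overlap = 0): string[] {
  if (overlap >= size) throw new Error('overlap must be smaller than size')
  const chunks: string[] = []
  for (let start = 0; start < text.length; start += size - overlap) {
    chunks.push(text.slice(start, start + size))
    if (start + size >= text.length) break
  }
  return chunks
}
```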
## 2. Not Preserving Chunk Metadata
Without metadata, you lose source information for citations.
**Incorrect:**
```typescript
// Lose source information
await supabase.from('chunks').insert({ content: chunk, embedding })
```
**Correct:**
```typescript
// Track source for citations
await supabase.from('document_chunks').insert({
document_id: doc.id,
chunk_index: idx,
content: chunk,
embedding,
metadata: { page: pageNum, section: sectionName },
})
```
## Document Schema
```sql
create table documents (
id bigint primary key generated always as identity,
name text not null
);
create table document_chunks (
id bigint primary key generated always as identity,
document_id bigint references documents(id) on delete cascade,
chunk_index int not null,
content text not null,
embedding extensions.vector(1536),
unique(document_id, chunk_index)
);
create index on document_chunks using hnsw (embedding vector_cosine_ops);
```
## RAG Query Pipeline
```typescript
// 1. Embed query
const queryEmbedding = await generateEmbedding(query)
// 2. Retrieve chunks
const { data: chunks } = await supabase.rpc('match_document_chunks', {
query_embedding: queryEmbedding,
match_count: 5,
})
// 3. Build context and generate response
const context = chunks.map(c => c.content).join('\n\n')
const response = await llm.chat([
{ role: 'system', content: `Answer based on:\n\n${context}` },
{ role: 'user', content: query }
])
```
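`match_document_chunks` is called via `.rpc()` above but not defined in this file; a sketch following the same shape as `match_documents` (see search-semantic.md — parameter names here are assumptions):

```sql
create or replace function match_document_chunks(
  query_embedding extensions.vector(1536),
  match_count int default 5
)
returns table (document_id bigint, chunk_index int, content text, similarity float)
language sql stable security invoker
as $$
  select document_id, chunk_index, content,
         1 - (embedding <=> query_embedding) as similarity
  from document_chunks
  order by embedding <=> query_embedding
  limit least(match_count, 200);
$$;
```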
## Related
- [embed-generation.md](embed-generation.md) - Generate embeddings
- [search-hybrid.md](search-hybrid.md) - Improve retrieval with hybrid search
- [Docs](https://supabase.com/docs/guides/ai/rag-with-permissions) - RAG with permissions


@@ -0,0 +1,91 @@
---
title: Combine Vector and Full-Text Search with Hybrid Search
impact: HIGH
impactDescription: 20-40% relevance improvement over pure vector search
tags: hybrid-search, full-text, rrf, tsvector, metadata-filter
---
## Combine Vector and Full-Text Search with Hybrid Search
Hybrid search combines semantic similarity (vectors) with keyword matching (full-text) using Reciprocal Rank Fusion (RRF).
## 1. Missing GIN Index on tsvector
Full-text search without an index is extremely slow.
**Incorrect:**
```sql
-- No index on tsvector column
create table docs (
fts tsvector generated always as (to_tsvector('english', content)) stored
);
select * from docs where fts @@ to_tsquery('search'); -- Slow seq scan
```
**Correct:**
```sql
-- Add GIN index for full-text search
create table docs (
fts tsvector generated always as (to_tsvector('english', content)) stored
);
create index on docs using gin(fts);
select * from docs where fts @@ to_tsquery('search'); -- Fast index scan
```
## 2. Not Over-Fetching Before Fusion
Fetching exact match_count from each source may miss relevant results after fusion.
**Incorrect:**
```sql
-- May miss good results after RRF fusion
with semantic as (select id from docs order by embedding <=> query limit 5),
full_text as (select id from docs where fts @@ query limit 5)
select * from semantic union select * from full_text limit 5;
```
**Correct:**
```sql
-- Fetch 2x from each, then fuse and limit
with semantic as (select id from docs order by embedding <=> query limit 10),
full_text as (select id from docs where fts @@ query limit 10)
-- Apply RRF scoring...
limit 5;
```
## Complete Hybrid Search Function
```sql
create function hybrid_search(
query_text text,
query_embedding vector(1536),
match_count int default 10
)
returns setof documents language sql stable security invoker as $$
with full_text as (
select id, row_number() over (order by ts_rank_cd(fts, websearch_to_tsquery(query_text)) desc) as rank_ix
from documents where fts @@ websearch_to_tsquery(query_text)
limit match_count * 2
),
semantic as (
select id, row_number() over (order by embedding <=> query_embedding) as rank_ix
from documents
order by embedding <=> query_embedding
limit match_count * 2
)
select documents.* from full_text
full outer join semantic on full_text.id = semantic.id
join documents on coalesce(full_text.id, semantic.id) = documents.id
order by coalesce(1.0/(50+full_text.rank_ix),0) + coalesce(1.0/(50+semantic.rank_ix),0) desc
limit match_count;
$$;
```
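The `coalesce(1.0/(50+rank_ix), 0)` terms above implement Reciprocal Rank Fusion: each row scores `1/(k + rank)` per source list that returned it, and missing ranks contribute nothing. The same scoring in TypeScript, as a sketch (k = 50 matches the SQL):

```typescript
// RRF score for one row given its rank in each source list
// (null = not returned by that source). k dampens the weight
// of top-ranked results.
function rrfScore(ranks: Array<number | null>, k = 50): number {
  return ranks.reduce(
    (sum, rank) => sum + (rank == null ? 0 : 1 / (k + rank)),
    0,
  )
}
```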
## Related
- [search-semantic.md](search-semantic.md) - Vector-only search
- [Docs](https://supabase.com/docs/guides/ai/hybrid-search) - Hybrid search guide


@@ -0,0 +1,91 @@
---
title: Implement Semantic Search with match_documents
impact: CRITICAL
impactDescription: Core pattern for similarity search in Supabase applications
tags: semantic-search, match_documents, similarity, rpc, supabase-js
---
## Implement Semantic Search with match_documents
Create a PostgreSQL function to search vectors, then call it via supabase-js `.rpc()`.
## 1. Bypassing RLS with security definer
A `security definer` function runs with its owner's privileges, bypassing the caller's RLS policies and exposing all rows. Declare `security invoker` explicitly (Postgres defaults to invoker, but being explicit documents intent).
**Incorrect:**
```sql
-- Runs as the function owner - bypasses RLS
create function match_documents(query_embedding vector(1536), match_count int)
returns setof documents language sql security definer as $$
select * from documents order by embedding <=> query_embedding limit match_count;
$$;
```
**Correct:**
```sql
-- Respects caller's RLS policies
create function match_documents(query_embedding vector(1536), match_count int)
returns setof documents language sql stable security invoker as $$
select * from documents order by embedding <=> query_embedding limit match_count;
$$;
```
## 2. Ordering by Calculated Column
Ordering by a calculated similarity column bypasses the index.
**Incorrect:**
```sql
-- Index not used
select *, 1 - (embedding <=> query) as similarity
from documents
order by similarity desc;
```
**Correct:**
```sql
-- Order by distance operator directly (uses index)
select *, 1 - (embedding <=> query) as similarity
from documents
order by embedding <=> query
limit 10;
```
## Complete match_documents Function
```sql
create or replace function match_documents (
query_embedding extensions.vector(1536),
match_threshold float default 0.78,
match_count int default 10
)
returns table (id bigint, content text, similarity float)
language sql stable security invoker
as $$
select id, content, 1 - (embedding <=> query_embedding) as similarity
from documents
where 1 - (embedding <=> query_embedding) > match_threshold
order by embedding <=> query_embedding
limit least(match_count, 200);
$$;
```
## Calling from supabase-js
```typescript
const { data } = await supabase.rpc('match_documents', {
query_embedding: embedding,
match_threshold: 0.78,
match_count: 10,
})
```
## Related
- [search-hybrid.md](search-hybrid.md) - Combine with full-text search
- [Docs](https://supabase.com/docs/guides/ai/semantic-search) - Semantic search guide


@@ -0,0 +1,69 @@
---
title: Enable pgvector for Vector Storage
impact: HIGH
impactDescription: Foundation for all vector search and AI features in Supabase
tags: pgvector, vector, halfvec, dimensions, extension, embeddings
---
## Enable pgvector for Vector Storage
pgvector stores high-dimensional vectors (embeddings) in Postgres for similarity search.
## 1. Dimension Mismatch
Embedding dimensions must match between column definition and inserted data.
**Incorrect:**
```sql
-- Column is 1536, inserting 384-dim vector
create table docs (embedding extensions.vector(1536));
insert into docs (embedding) values ('[0.1, ...]'::vector(384)); -- ERROR
```
**Correct:**
```sql
-- Match dimensions to your embedding model (gte-small = 384)
create table docs (embedding extensions.vector(384));
insert into docs (embedding) values ('[0.1, ...]'::vector(384));
```
## 2. Comparing Different Embedding Models
Comparing embeddings from different models produces meaningless results.
**Incorrect:**
```sql
-- Mixing OpenAI (1536-dim) with gte-small (384-dim) embeddings
-- Even if dimensions matched, semantics differ
select * from docs order by openai_embedding <=> gte_embedding;
```
**Correct:**
```sql
-- Use one model consistently per column
create table docs (
embedding extensions.vector(1536) -- OpenAI only
);
```
## Quick Reference
| Type | Max Indexed Dims | Use Case |
|------|------------------|----------|
| `vector(n)` | 2,000 | Standard embeddings |
| `halfvec(n)` | 4,000 | Large models (3072 dims) |

| Model | Dimensions |
|-------|-----------|
| OpenAI text-embedding-3-small | 1536 |
| Supabase gte-small | 384 |
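The 2,000-dim index cap on `vector` matters for large models such as OpenAI text-embedding-3-large (3072 dims). A sketch using `halfvec` (assumes pgvector ≥ 0.7; table name is illustrative):

```sql
-- halfvec stores half-precision floats, raising the indexable limit to 4,000 dims
create table docs_large (
  id bigint primary key generated always as identity,
  embedding extensions.halfvec(3072)
);
create index on docs_large using hnsw (embedding halfvec_cosine_ops);
```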
## Related
- [index-hnsw.md](index-hnsw.md) - Create indexes for fast search
- [search-semantic.md](search-semantic.md) - Query vectors with match_documents
- [Docs](https://supabase.com/docs/guides/ai/vector-columns) - Vector columns guide