supabase-postgres-best-practices/skills/supabase/references/vectors-perf-tuning.md at 6bbc7d3fd28acc0de9c5d27fa81d4f6043b7d02b

mirror of https://github.com/supabase/agent-skills.git synced 2026-03-27 10:09:26 +08:00

Pedro Rodrigues 6bbc7d3fd2 refactor: flatten vectors references to root references directory

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-02-09 19:28:53 +00:00

title	impact	impactDescription	tags
Optimize Vector Search Performance	CRITICAL	2-10x latency reduction with proper tuning	performance, pre-warming, compute, batch, monitoring

Optimize Vector Search Performance

Vector search is RAM-bound. Proper tuning and compute sizing are critical.

1. Undersized Compute for Vector Workload

Vector indexes must fit in RAM for optimal performance.

Incorrect:

-- Free tier (0.5GB RAM) with 100K 1536-dim vectors
-- Symptoms: high disk reads, slow queries
select count(*) from documents;  -- Returns 100000

Correct:

-- Check buffer cache hit ratio for the table heap (index blocks are tracked separately)
select round(100.0 * heap_blks_hit / nullif(heap_blks_hit + heap_blks_read, 0), 2) as hit_ratio
from pg_statio_user_tables where relname = 'documents';
-- If < 95%, upgrade compute or reduce data
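The query above measures the table heap only. For vector search the index blocks arguably matter more, so a similar check against the index may help. The following is a minimal sketch, assuming the same `documents` table and the standard `pg_statio_user_indexes` statistics view; reusing the 95% threshold for indexes is an assumption carried over from the heap check, not a documented limit.

```sql
-- Buffer cache hit ratio per index on the documents table
select
  indexrelname as index_name,
  idx_blks_hit,
  idx_blks_read,
  round(100.0 * idx_blks_hit / nullif(idx_blks_hit + idx_blks_read, 0), 2) as index_hit_ratio
from pg_statio_user_indexes
where relname = 'documents';
```

A low ratio on the HNSW index combined with high `idx_blks_read` would suggest the index is being read from disk rather than from memory.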

2. Building Index During Peak Traffic

A non-concurrent index build blocks writes to the table until the build finishes.

Incorrect:

-- Blocks inserts, updates, and deletes on the table for the duration of the build
create index on documents using hnsw (embedding vector_cosine_ops);

Correct:

-- Does not block writes; the build takes longer and cannot run inside a transaction block
create index concurrently on documents using hnsw (embedding vector_cosine_ops);
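If a concurrent build fails or is cancelled, Postgres leaves an invalid index behind that is ignored by queries but still maintained on writes. A sketch for spotting this, assuming the `documents` table from the examples above:

```sql
-- List indexes on documents that are marked invalid
select indexrelid::regclass as index_name
from pg_index
where indrelid = 'documents'::regclass
  and not indisvalid;

-- If one appears, drop it and retry the build (index name assumed from the examples here)
-- drop index concurrently if exists documents_embedding_idx;
```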

Compute Sizing

Approximate capacity for 1536-dimension vectors with HNSW index:

Plan	RAM	Vectors (1536d)
Nano (Free)	0.5GB	Limited — index may swap
Micro	1GB	~15K
Small	2GB	~50K
Medium	4GB	~100K
Large	8GB	~225K

See the compute sizing guide for detailed benchmarks.
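As a rough cross-check of the table, the raw vector data alone can be estimated by hand: pgvector stores each `vector` value in 4 bytes per dimension plus 8 bytes of overhead. The sketch below compares that estimate for 100K 1536-dimension vectors with the actual on-disk size, assuming the `documents` table used throughout. The HNSW index adds its own graph overhead on top of the raw vectors, which this arithmetic does not capture.

```sql
select
  -- 100,000 vectors * (4 bytes * 1536 dimensions + 8 bytes overhead)
  pg_size_pretty(100000::bigint * (4 * 1536 + 8)) as raw_vectors_estimate,
  -- Actual size of the table including TOAST data and all indexes
  pg_size_pretty(pg_total_relation_size('documents')) as actual_table_and_indexes;
```

The estimate works out to roughly 615 million bytes, which already exceeds the 0.5GB of the smallest plan before any index is built.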

Index Pre-Warming

-- Load index into memory before production traffic (requires the pg_prewarm extension)
create extension if not exists pg_prewarm;
select pg_prewarm('documents_embedding_idx');

-- Run 10K-50K warm-up queries before benchmarking
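Pre-warming only helps if the index can stay resident. By default `pg_prewarm` loads blocks into shared buffers, so a quick comparison of the index size with `shared_buffers` can indicate whether that is realistic. A minimal sketch, assuming the index name `documents_embedding_idx` from the example above:

```sql
select
  pg_size_pretty(pg_relation_size('documents_embedding_idx')) as index_size,
  current_setting('shared_buffers') as shared_buffers;
```

If the index is larger than `shared_buffers`, part of it will likely be evicted again under load, and the OS page cache or more compute has to make up the difference.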

Index Build Settings

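-- Session-level settings for the connection that runs the build
set maintenance_work_mem = '4GB';           -- size to the instance; keep well below total RAM
set max_parallel_maintenance_workers = 4;   -- parallel workers for the build
set statement_timeout = '0';                -- disable the timeout so a long build is not cancelled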
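Long builds can be watched from a second session through the standard `pg_stat_progress_create_index` view (Postgres 12 and later). The following sketch reports the current phase and an approximate completion percentage; how evenly the percentage advances across phases is not guaranteed.

```sql
-- Run in a separate session while the index build is in progress
select
  phase,
  round(100.0 * blocks_done / nullif(blocks_total, 0), 1) as pct_blocks_done,
  tuples_done,
  tuples_total
from pg_stat_progress_create_index;
```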

Query Monitoring

-- Find slow vector queries (requires the pg_stat_statements extension)
select substring(query, 1, 80), calls, round(mean_exec_time::numeric, 2) as avg_ms
from pg_stat_statements
where query like '%<=>%' or query like '%<#>%' or query like '%<->%'
order by total_exec_time desc limit 10;
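Once a slow statement is identified, `explain (analyze, buffers)` can show whether it uses the vector index and how many blocks came from cache versus disk. The sketch below is illustrative only: the `id` column is hypothetical, and the query vector is borrowed from an existing row so the statement is self-contained.

```sql
explain (analyze, buffers)
select id                                   -- hypothetical primary key column
from documents
order by embedding <=> (select embedding from documents limit 1)
limit 10;
```

An `Index Scan` node on the HNSW index suggests the index is being used, while a `Seq Scan` followed by a sort suggests it is not. In the `Buffers` line, a high `read` count relative to `hit` points to the memory pressure described in the sections above.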

index-hnsw.md - HNSW parameters
Docs - Production guide
Docs - Compute sizing
