supabase-postgres-best-prac…/skills/supabase/references/vectors-perf-tuning.md
2026-02-09 19:28:53 +00:00

title: Optimize Vector Search Performance
impact: CRITICAL
impactDescription: 2-10x latency reduction with proper tuning
tags: performance, pre-warming, compute, batch, monitoring

Optimize Vector Search Performance

Vector search is RAM-bound. Proper tuning and compute sizing are critical.

1. Undersized Compute for Vector Workload

Vector indexes must fit in RAM for optimal performance.

Incorrect:

-- Free tier (0.5GB RAM) with 100K 1536-dim vectors
-- Symptoms: high disk reads, slow queries
select count(*) from documents;  -- Returns 100000

Correct:

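-- Check buffer cache hit ratio for the table heap and for its indexes
select
  round(100.0 * heap_blks_hit / nullif(heap_blks_hit + heap_blks_read, 0), 2) as heap_hit_ratio,
  round(100.0 * idx_blks_hit / nullif(idx_blks_hit + idx_blks_read, 0), 2) as idx_hit_ratio
from pg_statio_user_tables where relname = 'documents';
-- If either ratio is < 95%, upgrade compute or reduce data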
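
To judge whether the index can plausibly stay in memory, it may help to compare its on-disk size with the configured buffer cache. This is a minimal sketch, assuming the index carries the default name documents_embedding_idx (the name used in the pre-warming section); Postgres also benefits from the operating system's page cache, so shared_buffers is only part of the picture.

```sql
-- Size of the vector index alone, and of the table including all indexes and TOAST
select
  pg_size_pretty(pg_relation_size('documents_embedding_idx')) as index_size,
  pg_size_pretty(pg_total_relation_size('documents')) as table_total_size;

-- Memory Postgres dedicates to its own buffer cache
show shared_buffers;
```

If index_size approaches or exceeds the instance's RAM, a low hit ratio is the expected outcome rather than a tuning problem.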

2. Building Index During Peak Traffic

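A non-concurrent index build blocks inserts, updates, and deletes on the table until it finishes. Reads continue, but a long HNSW build can stall writes for its whole duration.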

Incorrect:

-- Blocks all writes to the table until the build finishes
create index on documents using hnsw (embedding vector_cosine_ops);

Correct:

-- Does not block writes; slower to build and cannot run inside a transaction block
create index concurrently on documents using hnsw (embedding vector_cosine_ops);
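
A concurrent build can be watched from another session, and it is worth checking afterwards that it produced a valid index, because an interrupted concurrent build leaves an invalid one behind. A sketch using standard Postgres catalog views (pg_stat_progress_create_index needs PostgreSQL 12 or later):

```sql
-- Progress of index builds currently running
select pid, phase, blocks_done, blocks_total, tuples_done, tuples_total
from pg_stat_progress_create_index;

-- After the build: list the table's indexes and whether each is valid
select indexrelid::regclass as index_name, indisvalid
from pg_index
where indrelid = 'documents'::regclass;
-- If indisvalid is false, drop that index and rebuild it
```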

Compute Sizing

Approximate capacity for 1536-dimension vectors with HNSW index:

| Plan | RAM | Vectors (1536d) |
|------|-----|-----------------|
| Nano (Free) | 0.5GB | Limited (index may swap) |
| Micro | 1GB | ~15K |
| Small | 2GB | ~50K |
| Medium | 4GB | ~100K |
| Large | 8GB | ~225K |

See the compute sizing guide for detailed benchmarks.
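
For a rough sense of scale, the raw storage of the vectors can be computed directly: pgvector stores each vector as 4 bytes per dimension plus 8 bytes of overhead. The sketch below evaluates that for the row counts in the table above. Treat the result as a lower bound only, since the HNSW graph, the table heap, and the rest of the working set all need memory on top of it.

```sql
-- Raw vector storage: n * (4 bytes * 1536 dimensions + 8 bytes per vector)
select
  n as vectors,
  pg_size_pretty(n * (4 * 1536 + 8)) as raw_vector_size
from (values (15000::bigint), (50000), (100000), (225000)) as t(n);
```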

Index Pre-Warming

-- Load index into memory before production traffic (requires the pg_prewarm extension)
create extension if not exists pg_prewarm;
select pg_prewarm('documents_embedding_idx');

-- Run 10K-50K warm-up queries before benchmarking
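
To confirm that pre-warming had the intended effect, one option is to count how many of the index's blocks currently sit in shared buffers. This sketch assumes the pg_buffercache extension is available on the instance and that the index is named documents_embedding_idx.

```sql
create extension if not exists pg_buffercache;

-- Number of 8kB blocks of the index currently held in shared buffers
select count(*) as cached_blocks
from pg_buffercache
where relfilenode = pg_relation_filenode('documents_embedding_idx')
  and reldatabase = (select oid from pg_database where datname = current_database());

-- Total number of blocks in the index, for comparison
select pg_relation_size('documents_embedding_idx') / current_setting('block_size')::int as index_blocks;
```

If cached_blocks is far below index_blocks shortly after pre-warming, the index is probably larger than the buffer cache can hold.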

Index Build Settings

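-- Memory available to the build; size this to the instance's RAM
set maintenance_work_mem = '4GB';
-- Parallel workers used for the build
set max_parallel_maintenance_workers = 4;
-- Disable the statement timeout so a long build is not cancelled
set statement_timeout = '0';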
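
These settings apply to the current session only, so they need to be issued on the same connection that runs the build. A sketch of the full sequence, reusing the values above as placeholders (4GB is only sensible on an instance with considerably more RAM than that, and parallel HNSW builds assume a recent pgvector version):

```sql
-- Run outside a transaction block; create index concurrently refuses to run inside one
set maintenance_work_mem = '4GB';
set max_parallel_maintenance_workers = 4;
set statement_timeout = '0';

create index concurrently on documents using hnsw (embedding vector_cosine_ops);

-- Restore the defaults for the rest of the session
reset maintenance_work_mem;
reset max_parallel_maintenance_workers;
reset statement_timeout;
```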

Query Monitoring

-- Find the vector queries with the highest total execution time (requires pg_stat_statements)
select substring(query, 1, 80) as query, calls, round(mean_exec_time::numeric, 2) as avg_ms
from pg_stat_statements
where query like '%<=>%' or query like '%<#>%' or query like '%<->%'
order by total_exec_time desc limit 10;
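
Once a slow query has been identified, its plan shows whether the HNSW index is used and how much of the work came from disk. A sketch, assuming the documents table has an id column (hypothetical) and borrowing an existing row's embedding as the query vector:

```sql
explain (analyze, buffers)
select id
from documents
order by embedding <=> (select embedding from documents limit 1)
limit 10;
```

In the output, an index scan on documents_embedding_idx indicates the index is in use, and the Buffers line separates shared hit (served from cache) from shared read (fetched from disk).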