Mirror of https://github.com/supabase/agent-skills.git, synced 2026-01-26 19:09:51 +08:00
feat: Add initial PostgreSQL best practices rules (#1)
* Add 30 PostgreSQL best practices rules

  Rules organized in 8 categories:
  - Query Performance (5): indexes, partial indexes, composite, covering, types
  - Connection Management (4): pooling, limits, idle timeout, prepared statements
  - Schema Design (4): data types, primary keys, foreign key indexes, partitioning
  - Concurrency & Locking (4): short transactions, SKIP LOCKED, advisory, deadlocks
  - Security (3): RLS basics, RLS performance, privileges
  - Data Access Patterns (4): N+1 queries, pagination, upsert, batch inserts
  - Monitoring (3): EXPLAIN ANALYZE, pg_stat_statements, VACUUM/ANALYZE
  - Advanced Features (3): JSONB indexing, full-text search, CTE materialization

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Update skills/postgresql-best-practices/rules/schema-primary-keys.md

  Co-authored-by: samrose <samuel.rose@gmail.com>

* Update skills/postgresql-best-practices/rules/lock-deadlock-prevention.md

  Co-authored-by: samrose <samuel.rose@gmail.com>

* resolve merge conflicts from postgres team suggestions
* Delete GETTING_STARTED.md
* Restore all 30 rule files that were lost during rebase
* update agents.md
* remove postgres 11 mention from advanced cte optimization
* update agents.md
* replace advanced cte with check constraints
* replace check constraints with schema lowercase identifiers

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: samrose <samuel.rose@gmail.com>
# Getting Started - Postgres Team

Quick guide to start adding PostgreSQL best practice rules.

## Setup

```bash
cd packages/postgresql-best-practices-build
npm install
```

## Add a Rule

1. Copy template:

```bash
cp skills/postgresql-best-practices/rules/_template.md \
  skills/postgresql-best-practices/rules/query-your-rule.md
```

2. Edit the file with your rule content

3. Validate & build:

```bash
cd packages/postgresql-best-practices-build
npm run validate
npm run build
```

4. Check `skills/postgresql-best-practices/AGENTS.md` for output

## File Prefixes → Sections

| Prefix | Section |
|--------|---------|
| `query-` | 1. Query Performance (CRITICAL) |
| `conn-` | 2. Connection Management (CRITICAL) |
| `schema-` | 3. Schema Design (HIGH) |
| `lock-` | 4. Concurrency & Locking (MEDIUM-HIGH) |
| `security-` | 5. Security & RLS (MEDIUM-HIGH) |
| `data-` | 6. Data Access Patterns (MEDIUM) |
| `monitor-` | 7. Monitoring & Diagnostics (LOW-MEDIUM) |
| `advanced-` | 8. Advanced Features (LOW) |

## Rule Structure

````markdown
---
title: Action-Oriented Title
impact: CRITICAL|HIGH|MEDIUM-HIGH|MEDIUM|LOW-MEDIUM|LOW
impactDescription: 10x faster queries
tags: indexes, performance
---

## Title

Brief explanation.

**Incorrect (why it's bad):**
```sql
-- Bad pattern
```

**Correct (why it's better):**
```sql
-- Good pattern
```

**Supabase Note:** Optional platform guidance.

Reference: [Link](url)
````

## Key Files

| File | Purpose |
|------|---------|
| `rules/_template.md` | Copy this to create new rules |
| `rules/_contributing.md` | Writing guidelines |
| `rules/_sections.md` | Section definitions (editable) |
| `AGENTS.md` | Generated output (don't edit directly) |

## Questions?

- Writing guidelines: `rules/_contributing.md`
- Full contributor guide: `skills/postgresql-best-practices/README.md`
---
title: Use tsvector for Full-Text Search
impact: MEDIUM
impactDescription: 100x faster than LIKE, with ranking support
tags: full-text-search, tsvector, gin, search
---

## Use tsvector for Full-Text Search

LIKE with a leading wildcard can't use a B-tree index, so every row is scanned. Full-text search with tsvector is orders of magnitude faster.

**Incorrect (LIKE pattern matching):**

```sql
-- Cannot use index, scans all rows
select * from articles where content like '%postgresql%';

-- Case-insensitive makes it worse
select * from articles where lower(content) like '%postgresql%';
```

**Correct (full-text search with tsvector):**

```sql
-- Add tsvector column and index
alter table articles add column search_vector tsvector
  generated always as (to_tsvector('english', coalesce(title,'') || ' ' || coalesce(content,''))) stored;

create index articles_search_idx on articles using gin (search_vector);

-- Fast full-text search
select * from articles
where search_vector @@ to_tsquery('english', 'postgresql & performance');

-- With ranking
select *, ts_rank(search_vector, query) as rank
from articles, to_tsquery('english', 'postgresql') query
where search_vector @@ query
order by rank desc;
```

Search multiple terms:

```sql
-- AND: both terms required
to_tsquery('postgresql & performance')

-- OR: either term
to_tsquery('postgresql | mysql')

-- Prefix matching
to_tsquery('post:*')
```
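
For search strings typed by end users, `websearch_to_tsquery` accepts free-form input without erroring on unbalanced operators; a short sketch against the same `articles` table:

```sql
-- Understands quoted phrases and - for negation, safe on raw user input
select * from articles
where search_vector @@ websearch_to_tsquery('english', '"postgresql performance" -mysql');
```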

Reference: [Full Text Search](https://supabase.com/docs/guides/database/full-text-search)

---
title: Index JSONB Columns for Efficient Querying
impact: MEDIUM
impactDescription: 10-100x faster JSONB queries with proper indexing
tags: jsonb, gin, indexes, json
---

## Index JSONB Columns for Efficient Querying

JSONB queries without indexes scan the entire table. Use GIN indexes for containment queries.

**Incorrect (no index on JSONB):**

```sql
create table products (
  id bigint primary key,
  attributes jsonb
);

-- Full table scan for every query
select * from products where attributes @> '{"color": "red"}';
select * from products where attributes->>'brand' = 'Nike';
```

**Correct (GIN index for JSONB):**

```sql
-- GIN index for containment operators (@>, ?, ?&, ?|)
create index products_attrs_gin on products using gin (attributes);

-- Now containment queries use the index
select * from products where attributes @> '{"color": "red"}';

-- For specific key lookups, use expression index
create index products_brand_idx on products ((attributes->>'brand'));
select * from products where attributes->>'brand' = 'Nike';
```

Choose the right operator class:

```sql
-- jsonb_ops (default): supports all operators, larger index
create index idx1 on products using gin (attributes);

-- jsonb_path_ops: only @> operator, but 2-3x smaller index
create index idx2 on products using gin (attributes jsonb_path_ops);
```
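
The `->>` operator returns text, so range filters on numeric values need a cast on both sides; a sketch of an expression index for that case (the `price` key is illustrative):

```sql
-- Cast to numeric in the index and in the query the same way
create index products_price_idx on products (((attributes->>'price')::numeric));

select * from products
where (attributes->>'price')::numeric between 10 and 50;
```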

Reference: [JSONB Indexes](https://www.postgresql.org/docs/current/datatype-json.html#JSON-INDEXING)

---
title: Configure Idle Connection Timeouts
impact: HIGH
impactDescription: Reclaim 30-50% of connection slots from idle clients
tags: connections, timeout, idle, resource-management
---

## Configure Idle Connection Timeouts

Idle connections waste resources. Configure timeouts to automatically reclaim them.

**Incorrect (connections held indefinitely):**

```sql
-- No timeout configured
show idle_in_transaction_session_timeout; -- 0 (disabled)

-- Connections stay open forever, even when idle
select pid, state, state_change, query
from pg_stat_activity
where state = 'idle in transaction';
-- Shows transactions idle for hours, holding locks
```

**Correct (automatic cleanup of idle connections):**

```sql
-- Terminate connections idle in transaction after 30 seconds
alter system set idle_in_transaction_session_timeout = '30s';

-- Terminate completely idle connections after 10 minutes
alter system set idle_session_timeout = '10min';

-- Reload configuration
select pg_reload_conf();
```

For pooled connections, configure at the pooler level:

```ini
# pgbouncer.ini
server_idle_timeout = 60
client_idle_timeout = 300
```
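
For a one-off cleanup of sessions that are already stuck (requires superuser or the pg_signal_backend role; the 30-minute cutoff is an arbitrary example):

```sql
-- Kill sessions idle in transaction for over 30 minutes
select pg_terminate_backend(pid)
from pg_stat_activity
where state = 'idle in transaction'
  and state_change < now() - interval '30 minutes'
  and pid <> pg_backend_pid();
```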

Reference: [Connection Timeouts](https://www.postgresql.org/docs/current/runtime-config-client.html#GUC-IDLE-IN-TRANSACTION-SESSION-TIMEOUT)

---
title: Set Appropriate Connection Limits
impact: CRITICAL
impactDescription: Prevent database crashes and memory exhaustion
tags: connections, max-connections, limits, stability
---

## Set Appropriate Connection Limits

Too many connections exhaust memory and degrade performance. Set limits based on available resources.

**Incorrect (unlimited or excessive connections):**

```sql
-- Default max_connections = 100, but often increased blindly
show max_connections; -- 500 (way too high for 4GB RAM)

-- Each connection uses 1-3MB RAM
-- 500 connections * 2MB = 1GB just for connections!
-- Out of memory errors under load
```

**Correct (calculate based on resources):**

```sql
-- Formula: max_connections = (RAM in MB / 5MB per connection) - reserved
-- For 4GB RAM: (4096 / 5) - 10 = ~800 theoretical max
-- But practically, 100-200 is better for query performance

-- Recommended settings for 4GB RAM
alter system set max_connections = 100;

-- Also set work_mem appropriately
-- work_mem * max_connections should not exceed 25% of RAM
alter system set work_mem = '8MB'; -- 8MB * 100 = 800MB max
```

Monitor connection usage:

```sql
select count(*), state from pg_stat_activity group by state;
```
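
A rough utilization check against the configured ceiling; alerting somewhere above 80% is a common rule of thumb, not a Postgres default:

```sql
-- Percentage of max_connections currently in use
select count(*) * 100.0 / current_setting('max_connections')::int as pct_used
from pg_stat_activity;
```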

Reference: [Database Connections](https://supabase.com/docs/guides/platform/performance#connection-management)

---
title: Use Connection Pooling for All Applications
impact: CRITICAL
impactDescription: Handle 10-100x more concurrent users
tags: connection-pooling, pgbouncer, performance, scalability
---

## Use Connection Pooling for All Applications

Postgres connections are expensive (1-3MB RAM each). Without pooling, applications exhaust connections under load.

**Incorrect (new connection per request):**

```sql
-- Each request creates a new connection
-- Application code: db.connect() per request
-- Result: 500 concurrent users = 500 connections = crashed database

-- Check current connections
select count(*) from pg_stat_activity; -- 487 connections!
```

**Correct (connection pooling):**

```sql
-- Use a pooler like PgBouncer between app and database
-- Application connects to pooler, pooler reuses a small pool to Postgres

-- Configure pool_size based on: (CPU cores * 2) + spindle_count
-- Example for 4 cores: pool_size = 10

-- Result: 500 concurrent users share 10 actual connections
select count(*) from pg_stat_activity; -- 10 connections
```

Pool modes:

- **Transaction mode**: connection returned after each transaction (best for most apps)
- **Session mode**: connection held for entire session (needed for prepared statements, temp tables)
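
To confirm pooling is actually in effect, watch the backend count under load; it should hover near the configured pool size rather than tracking concurrent users. A sketch grouped by application:

```sql
-- Backend count should stay near pool_size, not user count
select usename, application_name, count(*)
from pg_stat_activity
group by usename, application_name
order by count(*) desc;
```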

Reference: [Connection Pooling](https://supabase.com/docs/guides/database/connecting-to-postgres#connection-pooler)

---
title: Use Prepared Statements Correctly with Pooling
impact: HIGH
impactDescription: Avoid prepared statement conflicts in pooled environments
tags: prepared-statements, connection-pooling, transaction-mode
---

## Use Prepared Statements Correctly with Pooling

Prepared statements are tied to individual database connections. In transaction-mode pooling, connections are shared, causing conflicts.

**Incorrect (named prepared statements with transaction pooling):**

```sql
-- Named prepared statement
prepare get_user as select * from users where id = $1;

-- In transaction mode pooling, next request may get different connection
execute get_user(123);
-- ERROR: prepared statement "get_user" does not exist
```

**Correct (use unnamed statements or session mode):**

```sql
-- Option 1: Use unnamed prepared statements (most ORMs do this automatically)
-- The query is prepared and executed in a single protocol message

-- Option 2: Deallocate after use in transaction mode
prepare get_user as select * from users where id = $1;
execute get_user(123);
deallocate get_user;

-- Option 3: Use session mode pooling (port 5432 vs 6543)
-- Connection is held for entire session, prepared statements persist
```

Check your driver settings:

```sql
-- Many drivers use prepared statements by default
-- Node.js pg: { prepare: false } to disable
-- JDBC: prepareThreshold=0 to disable
```
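
When debugging the "does not exist" error above, the `pg_prepared_statements` view shows which named statements exist on the connection the current session happens to hold:

```sql
-- Statements prepared on this session's connection only
select name, statement from pg_prepared_statements;
```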

Reference: [Prepared Statements with Pooling](https://supabase.com/docs/guides/database/connecting-to-postgres#connection-pool-modes)

---
title: Batch INSERT Statements for Bulk Data
impact: MEDIUM
impactDescription: 10-50x faster bulk inserts
tags: batch, insert, bulk, performance, copy
---

## Batch INSERT Statements for Bulk Data

Individual INSERT statements have high overhead. Batch multiple rows in single statements or use COPY.

**Incorrect (individual inserts):**

```sql
-- Each insert is a separate transaction and round trip
insert into events (user_id, action) values (1, 'click');
insert into events (user_id, action) values (1, 'view');
insert into events (user_id, action) values (2, 'click');
-- ... 1000 more individual inserts

-- 1000 inserts = 1000 round trips = slow
```

**Correct (batch insert):**

```sql
-- Multiple rows in single statement
insert into events (user_id, action) values
  (1, 'click'),
  (1, 'view'),
  (2, 'click'),
  -- ... up to ~1000 rows per batch
  (999, 'view');

-- One round trip for 1000 rows
```

For large imports, use COPY:

```sql
-- COPY is fastest for bulk loading
copy events (user_id, action, created_at)
from '/path/to/data.csv'
with (format csv, header true);

-- Or from stdin in application
copy events (user_id, action) from stdin with (format csv);
1,click
1,view
2,click
\.
```
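
When the rows come from application parameters rather than a file, a common middle ground is one parameterized statement over unnested arrays; a sketch assuming two parallel arrays of equal length:

```sql
-- One statement, two array parameters ($1 = user ids, $2 = actions)
insert into events (user_id, action)
select * from unnest($1::bigint[], $2::text[]);
```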

Reference: [COPY](https://www.postgresql.org/docs/current/sql-copy.html)

---
title: Eliminate N+1 Queries with Batch Loading
impact: MEDIUM-HIGH
impactDescription: 10-100x fewer database round trips
tags: n-plus-one, batch, performance, queries
---

## Eliminate N+1 Queries with Batch Loading

N+1 queries execute one query per item in a loop. Batch them into a single query using arrays or JOINs.

**Incorrect (N+1 queries):**

```sql
-- First query: get all users
select id from users where active = true; -- Returns 100 IDs

-- Then N queries, one per user
select * from orders where user_id = 1;
select * from orders where user_id = 2;
select * from orders where user_id = 3;
-- ... 97 more queries!

-- Total: 101 round trips to database
```

**Correct (single batch query):**

```sql
-- Collect IDs and query once with ANY
select * from orders where user_id = any(array[1, 2, 3, ...]);

-- Or use JOIN instead of loop
select u.id, u.name, o.*
from users u
left join orders o on o.user_id = u.id
where u.active = true;

-- Total: 1 round trip
```

Application pattern:

```sql
-- Instead of looping in application code:
-- for user in users: db.query("SELECT * FROM orders WHERE user_id = $1", user.id)

-- Pass array parameter:
select * from orders where user_id = any($1::bigint[]);
-- Application passes: [1, 2, 3, 4, 5, ...]
```
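
When the caller wants each user with their orders nested rather than a flat JOIN result, the same single round trip can aggregate the child rows to JSON; a hedged sketch using the tables above:

```sql
-- One row per user, orders nested as a JSON array ('[]' when there are none)
select u.id, u.name,
       coalesce(json_agg(o) filter (where o.id is not null), '[]'::json) as orders
from users u
left join orders o on o.user_id = u.id
where u.active = true
group by u.id, u.name;
```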

Reference: [N+1 Query Problem](https://supabase.com/docs/guides/database/query-optimization)

---
title: Use Cursor-Based Pagination Instead of OFFSET
impact: MEDIUM-HIGH
impactDescription: Consistent O(1) performance regardless of page depth
tags: pagination, cursor, keyset, offset, performance
---

## Use Cursor-Based Pagination Instead of OFFSET

OFFSET-based pagination scans all skipped rows, getting slower on deeper pages. Cursor pagination is O(1).

**Incorrect (OFFSET pagination):**

```sql
-- Page 1: scans 20 rows
select * from products order by id limit 20 offset 0;

-- Page 100: scans 2000 rows to skip 1980
select * from products order by id limit 20 offset 1980;

-- Page 10000: scans 200,000 rows!
select * from products order by id limit 20 offset 199980;
```

**Correct (cursor/keyset pagination):**

```sql
-- Page 1: get first 20
select * from products order by id limit 20;
-- Application stores last_id = 20

-- Page 2: start after last ID
select * from products where id > 20 order by id limit 20;
-- Uses index, always fast regardless of page depth

-- Page 10000: same speed as page 1
select * from products where id > 199980 order by id limit 20;
```

For multi-column sorting:

```sql
-- Cursor must include all sort columns
select * from products
where (created_at, id) > ('2024-01-15 10:00:00', 12345)
order by created_at, id
limit 20;
```
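
Keyset pagination is only fast when the sort columns are backed by a matching index; for the multi-column example above that means:

```sql
-- Composite index matching the ORDER BY makes the row comparison an index seek
create index products_created_id_idx on products (created_at, id);
```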

Reference: [Pagination](https://supabase.com/docs/guides/database/pagination)

---
title: Use UPSERT for Insert-or-Update Operations
impact: MEDIUM
impactDescription: Atomic operation, eliminates race conditions
tags: upsert, on-conflict, insert, update
---

## Use UPSERT for Insert-or-Update Operations

Using separate SELECT-then-INSERT/UPDATE creates race conditions. Use INSERT ... ON CONFLICT for atomic upserts.

**Incorrect (check-then-insert race condition):**

```sql
-- Race condition: two requests check simultaneously
select * from settings where user_id = 123 and key = 'theme';
-- Both find nothing

-- Both try to insert
insert into settings (user_id, key, value) values (123, 'theme', 'dark');
-- One succeeds, one fails with duplicate key error!
```

**Correct (atomic UPSERT):**

```sql
-- Single atomic operation
insert into settings (user_id, key, value)
values (123, 'theme', 'dark')
on conflict (user_id, key)
do update set value = excluded.value, updated_at = now();

-- Returns the inserted/updated row
insert into settings (user_id, key, value)
values (123, 'theme', 'dark')
on conflict (user_id, key)
do update set value = excluded.value
returning *;
```

Insert-or-ignore pattern:

```sql
-- Insert only if not exists (no update)
insert into page_views (page_id, user_id)
values (1, 123)
on conflict (page_id, user_id) do nothing;
```
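
ON CONFLICT DO UPDATE rewrites the row (and fires triggers) even when nothing changed; an optional WHERE clause on the update skips those no-op writes:

```sql
-- Only write when the value actually changes
insert into settings (user_id, key, value)
values (123, 'theme', 'dark')
on conflict (user_id, key)
do update set value = excluded.value
where settings.value is distinct from excluded.value;
```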

Reference: [INSERT ON CONFLICT](https://www.postgresql.org/docs/current/sql-insert.html#SQL-ON-CONFLICT)

---
title: Use Advisory Locks for Application-Level Locking
impact: MEDIUM
impactDescription: Efficient coordination without row-level lock overhead
tags: advisory-locks, coordination, application-locks
---

## Use Advisory Locks for Application-Level Locking

Advisory locks provide application-level coordination without requiring database rows to lock.

**Incorrect (creating rows just for locking):**

```sql
-- Creating dummy rows to lock on
create table resource_locks (
  resource_name text primary key
);

insert into resource_locks values ('report_generator');

-- Lock by selecting the row
select * from resource_locks where resource_name = 'report_generator' for update;
```

**Correct (advisory locks):**

```sql
-- Session-level advisory lock (released on disconnect or unlock)
select pg_advisory_lock(hashtext('report_generator'));
-- ... do exclusive work ...
select pg_advisory_unlock(hashtext('report_generator'));

-- Transaction-level lock (released on commit/rollback)
begin;
select pg_advisory_xact_lock(hashtext('daily_report'));
-- ... do work ...
commit; -- Lock automatically released
```

Try-lock for non-blocking operations:

```sql
-- Returns immediately with true/false instead of waiting
select pg_try_advisory_lock(hashtext('resource_name'));

-- Application pattern:
-- if the call returned true: do the work, then release with
--   select pg_advisory_unlock(hashtext('resource_name'));
-- if it returned false: skip or retry later
```
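
`hashtext()` maps every name into one shared key space, so unrelated subsystems can collide. The two-argument lock functions take a (class, object) pair that acts as a crude namespace; the class numbers below are an arbitrary application convention, not a Postgres concept:

```sql
-- class 1 = reports, class 2 = billing (app-level convention)
select pg_try_advisory_xact_lock(1, 42);
select pg_try_advisory_xact_lock(2, 42); -- independent of the lock above
```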

Reference: [Advisory Locks](https://www.postgresql.org/docs/current/explicit-locking.html#ADVISORY-LOCKS)

---
title: Prevent Deadlocks with Consistent Lock Ordering
impact: MEDIUM-HIGH
impactDescription: Eliminate deadlock errors, improve reliability
tags: deadlocks, locking, transactions, ordering
---

## Prevent Deadlocks with Consistent Lock Ordering

Deadlocks occur when transactions lock resources in different orders. Always acquire locks in a consistent order.

**Incorrect (inconsistent lock ordering):**

```sql
-- Transaction A                      -- Transaction B
begin;                                begin;
update accounts                       update accounts
set balance = balance - 100           set balance = balance - 50
where id = 1;                         where id = 2;  -- B locks row 2

update accounts                       update accounts
set balance = balance + 100           set balance = balance + 50
where id = 2;  -- A waits for B       where id = 1;  -- B waits for A

-- DEADLOCK! Both waiting for each other
```

**Correct (lock rows in consistent order first):**

```sql
-- Explicitly acquire locks in ID order before updating
begin;
select * from accounts where id in (1, 2) order by id for update;

-- Now perform updates in any order - locks already held
update accounts set balance = balance - 100 where id = 1;
update accounts set balance = balance + 100 where id = 2;
commit;
```

Alternative: use a single statement to update atomically:

```sql
-- Single statement acquires all locks atomically
begin;
update accounts
set balance = balance + case id
  when 1 then -100
  when 2 then 100
end
where id in (1, 2);
commit;
```

Detect deadlocks in logs:

```sql
-- Check for recent deadlocks
select * from pg_stat_database where deadlocks > 0;

-- Enable deadlock logging
set log_lock_waits = on;
set deadlock_timeout = '1s';
```

Reference: [Deadlocks](https://www.postgresql.org/docs/current/explicit-locking.html#LOCKING-DEADLOCKS)

---
title: Keep Transactions Short to Reduce Lock Contention
impact: MEDIUM-HIGH
impactDescription: 3-5x throughput improvement, fewer deadlocks
tags: transactions, locking, contention, performance
---

## Keep Transactions Short to Reduce Lock Contention

Long-running transactions hold locks that block other queries. Keep transactions as short as possible.

**Incorrect (long transaction with external calls):**

```sql
begin;
select * from orders where id = 1 for update; -- Lock acquired

-- Application makes HTTP call to payment API (2-5 seconds)
-- Other queries on this row are blocked!

update orders set status = 'paid' where id = 1;
commit; -- Lock held for entire duration
```

**Correct (minimal transaction scope):**

```sql
-- Validate data and call APIs outside transaction
-- Application: response = await paymentAPI.charge(...)

-- Only hold lock for the actual update
begin;
update orders
set status = 'paid', payment_id = $1
where id = $2 and status = 'pending'
returning *;
commit; -- Lock held for milliseconds
```

Use `statement_timeout` to prevent runaway transactions:

```sql
-- Abort queries running longer than 30 seconds (session level)
set statement_timeout = '30s';

-- Or for the current transaction only
set local statement_timeout = '5s';
```
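
To find transactions that are already running long, and the locks they may be holding, a quick check against pg_stat_activity:

```sql
-- Oldest open transactions first
select pid, now() - xact_start as xact_age, state, query
from pg_stat_activity
where xact_start is not null
order by xact_start
limit 10;
```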

Reference: [Transaction Management](https://www.postgresql.org/docs/current/tutorial-transactions.html)

---
title: Use SKIP LOCKED for Non-Blocking Queue Processing
impact: MEDIUM-HIGH
impactDescription: 10x throughput for worker queues
tags: skip-locked, queue, workers, concurrency
---

## Use SKIP LOCKED for Non-Blocking Queue Processing

When multiple workers process a queue, SKIP LOCKED allows workers to process different rows without waiting.

**Incorrect (workers block each other):**

```sql
-- Worker 1 and Worker 2 both try to get next job
begin;
select * from jobs where status = 'pending' order by created_at limit 1 for update;
-- Worker 2 waits for Worker 1's lock to release!
```

**Correct (SKIP LOCKED for parallel processing):**

```sql
-- Each worker skips locked rows and gets the next available
begin;
select * from jobs
where status = 'pending'
order by created_at
limit 1
for update skip locked;

-- Worker 1 gets job 1, Worker 2 gets job 2 (no waiting)

update jobs set status = 'processing' where id = $1;
commit;
```

Complete queue pattern:

```sql
-- Atomic claim-and-update in one statement
update jobs
set status = 'processing', worker_id = $1, started_at = now()
where id = (
  select id from jobs
  where status = 'pending'
  order by created_at
  limit 1
  for update skip locked
)
returning *;
```
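
The claimed job still needs a terminal update once the worker finishes; a sketch assuming a `finished_at` column exists alongside `started_at`:

```sql
-- On success
update jobs set status = 'done', finished_at = now() where id = $1;

-- On failure, release the job so another worker can claim it
update jobs set status = 'pending', worker_id = null where id = $1;
```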

Reference: [SELECT FOR UPDATE SKIP LOCKED](https://www.postgresql.org/docs/current/sql-select.html#SQL-FOR-UPDATE-SHARE)

---
title: Use EXPLAIN ANALYZE to Diagnose Slow Queries
impact: LOW-MEDIUM
impactDescription: Identify exact bottlenecks in query execution
tags: explain, analyze, diagnostics, query-plan
---

## Use EXPLAIN ANALYZE to Diagnose Slow Queries

EXPLAIN ANALYZE executes the query and shows actual timings, revealing the true performance bottlenecks.

**Incorrect (guessing at performance issues):**

```sql
-- Query is slow, but why?
select * from orders where customer_id = 123 and status = 'pending';
-- "It must be missing an index" - but which one?
```

**Correct (use EXPLAIN ANALYZE):**

```sql
explain (analyze, buffers, format text)
select * from orders where customer_id = 123 and status = 'pending';

-- Output reveals the issue:
-- Seq Scan on orders (cost=0.00..25000.00 rows=50 width=100) (actual time=0.015..450.123 rows=50 loops=1)
--   Filter: ((customer_id = 123) AND (status = 'pending'::text))
--   Rows Removed by Filter: 999950
--   Buffers: shared hit=5000 read=15000
-- Planning Time: 0.150 ms
-- Execution Time: 450.500 ms
```

Key things to look for:

```sql
-- Seq Scan on large tables = missing index
-- Rows Removed by Filter = poor selectivity or missing index
-- Buffers: read >> hit = data not cached, needs more memory
-- Nested Loop with high loops = consider different join strategy
-- Sort Method: external merge = work_mem too low
```
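
To capture plans for slow statements in production without re-running them by hand, the auto_explain contrib module (where available; LOAD typically needs superuser) logs plans automatically. A session-level sketch:

```sql
-- Log the plan of any statement in this session slower than 100ms
load 'auto_explain';
set auto_explain.log_min_duration = '100ms';
set auto_explain.log_analyze = on;
```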

Reference: [EXPLAIN](https://supabase.com/docs/guides/database/inspect)

---
title: Enable pg_stat_statements for Query Analysis
impact: LOW-MEDIUM
impactDescription: Identify top resource-consuming queries
tags: pg-stat-statements, monitoring, statistics, performance
---

## Enable pg_stat_statements for Query Analysis

pg_stat_statements tracks execution statistics for all queries, helping identify slow and frequent queries.

**Incorrect (no visibility into query patterns):**

```sql
-- Database is slow, but which queries are the problem?
-- No way to know without pg_stat_statements
```

**Correct (enable and query pg_stat_statements):**

```sql
-- Enable the extension
create extension if not exists pg_stat_statements;

-- Find slowest queries by total time
select
  calls,
  round(total_exec_time::numeric, 2) as total_time_ms,
  round(mean_exec_time::numeric, 2) as mean_time_ms,
  query
from pg_stat_statements
order by total_exec_time desc
limit 10;

-- Find most frequent queries
select calls, query
from pg_stat_statements
order by calls desc
limit 10;

-- Reset statistics after optimization
select pg_stat_statements_reset();
```

Key metrics to monitor:

```sql
-- Queries with high mean time (candidates for optimization)
select query, mean_exec_time, calls
from pg_stat_statements
where mean_exec_time > 100 -- > 100ms average
order by mean_exec_time desc;
```
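
pg_stat_statements also records block I/O per query, which separates "slow because of disk reads" from "slow because of computation"; a sketch:

```sql
-- Queries doing the most disk reads, with their cache hit rate
select query, shared_blks_read,
       round(100.0 * shared_blks_hit / nullif(shared_blks_hit + shared_blks_read, 0), 1) as hit_pct
from pg_stat_statements
order by shared_blks_read desc
limit 10;
```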

Reference: [pg_stat_statements](https://supabase.com/docs/guides/database/extensions/pg_stat_statements)

---
title: Maintain Table Statistics with VACUUM and ANALYZE
impact: MEDIUM
impactDescription: 2-10x better query plans with accurate statistics
tags: vacuum, analyze, statistics, maintenance, autovacuum
---

## Maintain Table Statistics with VACUUM and ANALYZE

Outdated statistics cause the query planner to make poor decisions. VACUUM reclaims space, ANALYZE updates statistics.

**Incorrect (stale statistics):**

```sql
-- Table has 1M rows but stats say 1000
-- Query planner chooses wrong strategy
explain select * from orders where status = 'pending';
-- Shows: Seq Scan (because stats show small table)
-- Actually: Index Scan would be much faster
```

**Correct (maintain fresh statistics):**

```sql
-- Manually analyze after large data changes
analyze orders;

-- Analyze specific columns used in WHERE clauses
analyze orders (status, created_at);

-- Check when tables were last analyzed
select
  relname,
  last_vacuum,
  last_autovacuum,
  last_analyze,
  last_autoanalyze
from pg_stat_user_tables
order by last_analyze nulls first;
```

Autovacuum tuning for busy tables:

```sql
-- Increase frequency for high-churn tables
alter table orders set (
  autovacuum_vacuum_scale_factor = 0.05,  -- Vacuum at 5% dead tuples (default 20%)
  autovacuum_analyze_scale_factor = 0.02  -- Analyze at 2% changes (default 10%)
);

-- Check autovacuum status
select * from pg_stat_progress_vacuum;
```
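
To decide which tables need this tuning in the first place, check dead tuple counts:

```sql
-- Tables with the most dead tuples awaiting vacuum
select relname, n_live_tup, n_dead_tup,
       round(100.0 * n_dead_tup / nullif(n_live_tup + n_dead_tup, 0), 1) as dead_pct
from pg_stat_user_tables
order by n_dead_tup desc
limit 10;
```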

Reference: [VACUUM](https://supabase.com/docs/guides/database/database-size#vacuum-operations)

---
title: Create Composite Indexes for Multi-Column Queries
impact: HIGH
impactDescription: 5-10x faster multi-column queries
tags: indexes, composite-index, multi-column, query-optimization
---

## Create Composite Indexes for Multi-Column Queries

When queries filter on multiple columns, a composite index is more efficient than separate single-column indexes.

**Incorrect (separate indexes require bitmap scan):**

```sql
-- Two separate indexes
create index orders_status_idx on orders (status);
create index orders_created_idx on orders (created_at);

-- Query must combine both indexes (slower)
select * from orders where status = 'pending' and created_at > '2024-01-01';
```

**Correct (composite index):**

```sql
-- Single composite index (leftmost column first for equality checks)
create index orders_status_created_idx on orders (status, created_at);

-- Query uses one efficient index scan
select * from orders where status = 'pending' and created_at > '2024-01-01';
```

**Column order matters** - place equality columns first, range columns last:

```sql
-- Good: status (=) before created_at (>)
create index idx on orders (status, created_at);

-- Works for: WHERE status = 'pending'
-- Works for: WHERE status = 'pending' AND created_at > '2024-01-01'
-- Does NOT work for: WHERE created_at > '2024-01-01' (leftmost prefix rule)
```

Reference: [Multicolumn Indexes](https://www.postgresql.org/docs/current/indexes-multicolumn.html)

---
title: Use Covering Indexes to Avoid Table Lookups
impact: MEDIUM-HIGH
impactDescription: 2-5x faster queries by eliminating heap fetches
tags: indexes, covering-index, include, index-only-scan
---

## Use Covering Indexes to Avoid Table Lookups

Covering indexes include all columns needed by a query, enabling index-only scans that skip the table entirely.

**Incorrect (index scan + heap fetch):**

```sql
create index users_email_idx on users (email);

-- Must fetch name and created_at from table heap
select email, name, created_at from users where email = 'user@example.com';
```

**Correct (index-only scan with INCLUDE):**

```sql
-- Include non-searchable columns in the index
create index users_email_idx on users (email) include (name, created_at);

-- All columns served from index, no table access needed
select email, name, created_at from users where email = 'user@example.com';
```

Use INCLUDE for columns you SELECT but don't filter on:

```sql
-- Searching by status, but also need customer_id and total
create index orders_status_idx on orders (status) include (customer_id, total);

select status, customer_id, total from orders where status = 'shipped';
```
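
Index-only scans also depend on the visibility map, so it is worth confirming with EXPLAIN that the plan and heap fetches look right:

```sql
explain (analyze)
select status, customer_id, total from orders where status = 'shipped';
-- Expect: Index Only Scan using orders_status_idx ... Heap Fetches: 0
-- A large "Heap Fetches" count means the visibility map is stale; VACUUM the table
```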

Reference: [Index-Only Scans](https://www.postgresql.org/docs/current/indexes-index-only-scans.html)

---
title: Choose the Right Index Type for Your Data
impact: HIGH
impactDescription: 10-100x improvement with correct index type
tags: indexes, btree, gin, brin, hash, index-types
---

## Choose the Right Index Type for Your Data

Different index types excel at different query patterns. The default B-tree isn't always optimal.

**Incorrect (B-tree for JSONB containment):**

```sql
-- B-tree cannot optimize containment operators
create index products_attrs_idx on products (attributes);
select * from products where attributes @> '{"color": "red"}';
-- Full table scan - B-tree doesn't support @> operator
```

**Correct (GIN for JSONB):**

```sql
-- GIN supports @>, ?, ?&, ?| operators
create index products_attrs_idx on products using gin (attributes);
select * from products where attributes @> '{"color": "red"}';
```

Index type guide:

```sql
-- B-tree (default): =, <, >, BETWEEN, IN, IS NULL
create index users_created_idx on users (created_at);

-- GIN: arrays, JSONB, full-text search
create index posts_tags_idx on posts using gin (tags);

-- BRIN: large time-series tables (10-100x smaller)
create index events_time_idx on events using brin (created_at);

-- Hash: equality-only (slightly faster than B-tree for =)
create index sessions_token_idx on sessions using hash (token);
```

Reference: [Index Types](https://www.postgresql.org/docs/current/indexes-types.html)

---
title: Add Indexes on WHERE and JOIN Columns
impact: CRITICAL
impactDescription: 100-1000x faster queries on large tables
tags: indexes, performance, sequential-scan, query-optimization
---

## Add Indexes on WHERE and JOIN Columns

Queries filtering or joining on unindexed columns cause full table scans that get slower as the table grows.

**Incorrect (sequential scan on large table):**

```sql
-- No index on customer_id causes full table scan
select * from orders where customer_id = 123;

-- EXPLAIN shows: Seq Scan on orders (cost=0.00..25000.00 rows=100 width=85)
```

**Correct (index scan):**

```sql
-- Create index on frequently filtered column
create index orders_customer_id_idx on orders (customer_id);

select * from orders where customer_id = 123;

-- EXPLAIN shows: Index Scan using orders_customer_id_idx (cost=0.42..8.44 rows=100 width=85)
```

For JOIN columns, always index the foreign key side:

```sql
-- Index the referencing column
create index orders_customer_id_idx on orders (customer_id);

select c.name, o.total
from customers c
join orders o on o.customer_id = c.id;
```
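
To find tables that are being sequentially scanned in practice (candidates for new indexes), the statistics views help:

```sql
-- Tables read most often without an index
select relname, seq_scan, seq_tup_read, idx_scan
from pg_stat_user_tables
order by seq_tup_read desc
limit 10;
```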

Reference: [Query Optimization](https://supabase.com/docs/guides/database/query-optimization)

---
title: Use Partial Indexes for Filtered Queries
impact: HIGH
impactDescription: 5-20x smaller indexes, faster writes and queries
tags: indexes, partial-index, query-optimization, storage
---

## Use Partial Indexes for Filtered Queries

Partial indexes only include rows matching a WHERE condition, making them smaller and faster when queries consistently filter on the same condition.

**Incorrect (full index includes irrelevant rows):**

```sql
-- Index includes all rows, even soft-deleted ones
create index users_email_idx on users (email);

-- Query always filters active users
select * from users where email = 'user@example.com' and deleted_at is null;
```

**Correct (partial index matches query filter):**

```sql
-- Index only includes active users
create index users_active_email_idx on users (email)
where deleted_at is null;

-- Query uses the smaller, faster index
select * from users where email = 'user@example.com' and deleted_at is null;
```

Common use cases for partial indexes:

```sql
-- Only pending orders (status rarely changes once completed)
create index orders_pending_idx on orders (created_at)
where status = 'pending';

-- Only non-null values
create index products_sku_idx on products (sku)
where sku is not null;
```

Reference: [Partial Indexes](https://www.postgresql.org/docs/current/indexes-partial.html)

---
title: Choose Appropriate Data Types
impact: HIGH
impactDescription: 50% storage reduction, faster comparisons
tags: data-types, schema, storage, performance
---

## Choose Appropriate Data Types

Using the right data types reduces storage, improves query performance, and prevents bugs.

**Incorrect (wrong data types):**

```sql
create table users (
  id int,                -- Will overflow at 2.1 billion
  email varchar(255),    -- Unnecessary length limit
  created_at timestamp,  -- Missing timezone info
  is_active varchar(5),  -- String for boolean
  price varchar(20)      -- String for numeric
);
```

**Correct (appropriate data types):**

```sql
create table users (
  id bigint generated always as identity primary key,  -- 9 quintillion max
  email text,                      -- No artificial limit, same performance as varchar
  created_at timestamptz,          -- Always store timezone-aware timestamps
  is_active boolean default true,  -- 1 byte vs variable string length
  price numeric(10,2)              -- Exact decimal arithmetic
);
```

Key guidelines (see the constraint sketch below for the last one):

```sql
-- IDs: use bigint, not int (future-proofing)
-- Strings: use text, not varchar(n) unless constraint needed
-- Time: use timestamptz, not timestamp
-- Money: use numeric, not float (precision matters)
-- Enums: use text with check constraint or create enum type
```
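
The text-plus-check-constraint approach for enum-like values looks like this in practice (the status values are illustrative):

```sql
-- text + check constraint: easy to extend later, unlike a hard-coded enum type
create table orders (
  id bigint generated always as identity primary key,
  status text not null default 'pending'
    check (status in ('pending', 'paid', 'shipped', 'cancelled'))
);
```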

Reference: [Data Types](https://www.postgresql.org/docs/current/datatype.html)

---
title: Index Foreign Key Columns
impact: HIGH
impactDescription: 10-100x faster JOINs and CASCADE operations
tags: foreign-key, indexes, joins, schema
---

## Index Foreign Key Columns

Postgres does not automatically index foreign key columns. Missing indexes cause slow JOINs and CASCADE operations.

**Incorrect (unindexed foreign key):**

```sql
create table orders (
  id bigint generated always as identity primary key,
  customer_id bigint references customers(id) on delete cascade,
  total numeric(10,2)
);

-- No index on customer_id!
-- JOINs and ON DELETE CASCADE both require full table scan
select * from orders where customer_id = 123;  -- Seq Scan
delete from customers where id = 123;          -- Locks table, scans all orders
```

**Correct (indexed foreign key):**

```sql
create table orders (
  id bigint generated always as identity primary key,
  customer_id bigint references customers(id) on delete cascade,
  total numeric(10,2)
);

-- Always index the FK column
create index orders_customer_id_idx on orders (customer_id);

-- Now JOINs and cascades are fast
select * from orders where customer_id = 123;  -- Index Scan
delete from customers where id = 123;          -- Uses index, fast cascade
```

Find missing FK indexes:

```sql
select
  conrelid::regclass as table_name,
  a.attname as fk_column
from pg_constraint c
join pg_attribute a on a.attrelid = c.conrelid and a.attnum = any(c.conkey)
where c.contype = 'f'
and not exists (
  select 1 from pg_index i
  where i.indrelid = c.conrelid and a.attnum = any(i.indkey)
);
```

Reference: [Foreign Keys](https://www.postgresql.org/docs/current/ddl-constraints.html#DDL-CONSTRAINTS-FK)

---
title: Use Lowercase Identifiers for Compatibility
impact: MEDIUM
impactDescription: Avoid case-sensitivity bugs with tools, ORMs, and AI assistants
tags: naming, identifiers, case-sensitivity, schema, conventions
---

## Use Lowercase Identifiers for Compatibility

PostgreSQL folds unquoted identifiers to lowercase. Quoted mixed-case identifiers require quotes forever and cause issues with tools, ORMs, and AI assistants that may not recognize them.

**Incorrect (mixed-case identifiers):**

```sql
-- Quoted identifiers preserve case but require quotes everywhere
CREATE TABLE "Users" (
  "userId" bigint PRIMARY KEY,
  "firstName" text,
  "lastName" text
);

-- Must always quote or queries fail
SELECT "firstName" FROM "Users" WHERE "userId" = 1;

-- This fails - Users becomes users without quotes
SELECT firstName FROM Users;
-- ERROR: relation "users" does not exist
```

**Correct (lowercase snake_case):**

```sql
-- Unquoted lowercase identifiers are portable and tool-friendly
CREATE TABLE users (
  user_id bigint PRIMARY KEY,
  first_name text,
  last_name text
);

-- Works without quotes, recognized by all tools
SELECT first_name FROM users WHERE user_id = 1;
```

Common sources of mixed-case identifiers:

```sql
-- ORMs often generate quoted camelCase - configure them to use snake_case
-- Migrations from other databases may preserve original casing
-- Some GUI tools quote identifiers by default - disable this

-- If stuck with mixed-case, create views as a compatibility layer
CREATE VIEW users AS SELECT "userId" AS user_id, "firstName" AS first_name FROM "Users";
```

Reference: [Identifiers and Key Words](https://www.postgresql.org/docs/current/sql-syntax-lexical.html#SQL-SYNTAX-IDENTIFIERS)

---
title: Partition Large Tables for Better Performance
impact: MEDIUM-HIGH
impactDescription: 5-20x faster queries and maintenance on large tables
tags: partitioning, large-tables, time-series, performance
---

## Partition Large Tables for Better Performance

Partitioning splits a large table into smaller pieces, improving query performance and maintenance operations.

**Incorrect (single large table):**

```sql
create table events (
  id bigint generated always as identity,
  created_at timestamptz,
  data jsonb
);

-- 500M rows, queries scan everything
select * from events where created_at > '2024-01-01';  -- Slow
vacuum events;  -- Takes hours on a table this size
```

**Correct (partitioned by time range):**

```sql
create table events (
  id bigint generated always as identity,
  created_at timestamptz not null,
  data jsonb
) partition by range (created_at);

-- Create partitions for each month
create table events_2024_01 partition of events
  for values from ('2024-01-01') to ('2024-02-01');

create table events_2024_02 partition of events
  for values from ('2024-02-01') to ('2024-03-01');

-- Queries only scan relevant partitions
select * from events where created_at > '2024-01-15';  -- Only scans events_2024_01+

-- Drop old data instantly
drop table events_2023_01;  -- Instant vs DELETE taking hours
```

When to partition:

- Tables > 100M rows
- Time-series data with date-based queries
- Need to efficiently drop old data
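
New partitions must exist before matching rows arrive, or inserts fail; a default partition is a common safety net while monthly partitions are created ahead of time:

```sql
-- Catches rows that match no explicit partition instead of erroring
create table events_default partition of events default;
```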

Reference: [Table Partitioning](https://www.postgresql.org/docs/current/ddl-partitioning.html)

---
title: Select Optimal Primary Key Strategy
impact: HIGH
impactDescription: Better index locality, reduced fragmentation
tags: primary-key, identity, uuid, serial, schema
---

## Select Optimal Primary Key Strategy

Primary key choice affects insert performance, index size, and replication efficiency.

**Incorrect (problematic PK choices):**

```sql
-- serial works, but identity is the SQL-standard approach
create table users (
  id serial primary key  -- Works, but IDENTITY is recommended
);

-- Random UUIDs (v4) cause index fragmentation
create table orders (
  id uuid default gen_random_uuid() primary key  -- UUIDv4 = random = scattered inserts
);
```

**Correct (optimal PK strategies):**

```sql
-- Use IDENTITY for sequential IDs (SQL-standard, best for most cases)
create table users (
  id bigint generated always as identity primary key
);

-- For distributed systems needing UUIDs, use UUIDv7 (time-ordered)
-- Requires pg_uuidv7 extension: create extension pg_uuidv7;
create table orders (
  id uuid default uuid_generate_v7() primary key  -- Time-ordered, no fragmentation
);

-- Alternative: time-prefixed IDs for sortable, distributed IDs (no extension needed)
create table events (
  id text default concat(
    to_char(now() at time zone 'utc', 'YYYYMMDDHH24MISSMS'),
    gen_random_uuid()::text
  ) primary key
);
```

Guidelines:

- Single database: `bigint identity` (sequential, 8 bytes, SQL-standard)
- Distributed/exposed IDs: UUIDv7 (requires pg_uuidv7) or ULID (time-ordered, no fragmentation)
- `serial` works but `identity` is SQL-standard and preferred for new applications
- Avoid random UUIDs (v4) as primary keys on large tables (causes index fragmentation)

Reference: [Identity Columns](https://www.postgresql.org/docs/current/sql-createtable.html#SQL-CREATETABLE-PARMS-GENERATED-IDENTITY)

---
title: Apply Principle of Least Privilege
impact: MEDIUM
impactDescription: Reduced attack surface, better audit trail
tags: privileges, security, roles, permissions
---

## Apply Principle of Least Privilege

Grant only the minimum permissions required. Never use superuser for application queries.

**Incorrect (overly broad permissions):**

```sql
-- Application uses superuser connection
-- Or grants ALL to application role
grant all privileges on all tables in schema public to app_user;
grant all privileges on all sequences in schema public to app_user;

-- Any SQL injection becomes catastrophic
-- drop table users; cascades to everything
```

**Correct (minimal, specific grants):**

```sql
-- Create role with no default privileges
create role app_readonly nologin;

-- Grant only SELECT on specific tables
grant usage on schema public to app_readonly;
grant select on public.products, public.categories to app_readonly;

-- Create role for writes with limited scope
create role app_writer nologin;
grant usage on schema public to app_writer;
grant select, insert, update on public.orders to app_writer;
grant usage on sequence orders_id_seq to app_writer;
-- No DELETE permission

-- Login role inherits from these
create role app_user login password 'xxx';
grant app_writer to app_user;
```

Revoke public defaults:

```sql
-- Revoke default public access
revoke all on schema public from public;
revoke all on all tables in schema public from public;
```
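
Grants only cover existing tables; default privileges avoid re-granting after every migration. They apply to tables created later by the role that runs the statement:

```sql
-- Future tables created by the current role get SELECT automatically
alter default privileges in schema public
grant select on tables to app_readonly;
```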

Reference: [Roles and Privileges](https://supabase.com/blog/postgres-roles-and-privileges)

---
title: Enable Row Level Security for Multi-Tenant Data
impact: MEDIUM-HIGH
impactDescription: Database-enforced tenant isolation, prevent data leaks
tags: rls, row-level-security, multi-tenant, security
---

## Enable Row Level Security for Multi-Tenant Data

Row Level Security (RLS) enforces data access at the database level, ensuring users only see their own data.

**Incorrect (application-level filtering only):**

```sql
-- Relying only on application to filter
select * from orders where user_id = $current_user_id;

-- Bug or bypass means all data is exposed!
select * from orders; -- Returns ALL orders
```

**Correct (database-enforced RLS):**

```sql
-- Enable RLS on the table
alter table orders enable row level security;

-- Create policy for users to see only their orders
create policy orders_user_policy on orders
for all
using (user_id = current_setting('app.current_user_id')::bigint);

-- Force RLS even for table owners
alter table orders force row level security;

-- Set user context and query
set app.current_user_id = '123';
select * from orders; -- Only returns orders for user 123
```

Policy for authenticated role:

```sql
create policy orders_user_policy on orders
for all
to authenticated
using (user_id = auth.uid());
```
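
To audit which policies are actually in force on a table, query the pg_policies view:

```sql
-- One row per policy, with the USING expression
select schemaname, tablename, policyname, cmd, qual
from pg_policies
where tablename = 'orders';
```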

Reference: [Row Level Security](https://supabase.com/docs/guides/database/postgres/row-level-security)

---
title: Optimize RLS Policies for Performance
impact: HIGH
impactDescription: 5-10x faster RLS queries with proper patterns
tags: rls, performance, security, optimization
---

## Optimize RLS Policies for Performance

Poorly written RLS policies can cause severe performance issues. Use subqueries and indexes strategically.

**Incorrect (function called for every row):**

```sql
create policy orders_policy on orders
using (auth.uid() = user_id); -- auth.uid() called per row!

-- With 1M rows, auth.uid() is called 1M times
```

**Correct (wrap functions in SELECT):**

```sql
create policy orders_policy on orders
using ((select auth.uid()) = user_id); -- Called once, cached

-- 100x+ faster on large tables
```

Use security definer functions for complex checks:

```sql
-- Create helper function (runs as definer, bypasses RLS)
create or replace function is_team_member(team_id bigint)
returns boolean
language sql
security definer
set search_path = ''
as $$
  select exists (
    select 1 from public.team_members
    where team_id = $1 and user_id = (select auth.uid())
  );
$$;

-- Use in policy (indexed lookup, not per-row check)
create policy team_orders_policy on orders
using ((select is_team_member(team_id)));
```

Always add indexes on columns used in RLS policies:

```sql
create index orders_user_id_idx on orders (user_id);
```

Reference: [RLS Performance](https://supabase.com/docs/guides/database/postgres/row-level-security#rls-performance-recommendations)