supabase-postgres-best-practices

mirror of https://github.com/supabase/agent-skills.git synced 2026-03-27 10:09:26 +08:00

Author	SHA1	Message	Date
Pedro Rodrigues	0894f5683e	clean supabase project and use braintrust datasets	2026-02-25 20:20:36 +00:00
Pedro Rodrigues	34e807a3f6	replace vitest with braintrust assertions	2026-02-25 19:50:54 +00:00
Pedro Rodrigues	e65642b752	remove some braintrust headers	2026-02-25 19:11:56 +00:00
Pedro Rodrigues	9b08864e94	feat(evals): replace mock CLIs with real Supabase instance per eval run. Start a shared local Supabase stack once before all scenarios and reset the database (drop/recreate public schema + clear migration history) between each run. This lets agents apply migrations via `supabase db push` against a real Postgres instance instead of mock shell scripts. Changes: add supabase-setup.ts (startSupabase / stopSupabase / resetDB / getKeys); update runner.ts to start/stop Supabase and inject keys into process.env; update agent.ts to point MCP config at the local Supabase HTTP endpoint; update preflight.ts to check supabase CLI availability and Docker socket; update scaffold.ts to seed workspace with supabase/config.toml; add passThreshold support (test.ts / results.ts / types.ts) for partial pass; delete mock shell scripts (mocks/docker, mocks/psql, mocks/supabase); update Dockerfile/docker-compose to mount Docker socket for supabase CLI. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-25 14:39:54 +00:00
Pedro Rodrigues	2da5cae2ac	feat(evals): enrich Braintrust upload with granular scores and tracing. Add per-test pass/fail parsing from vitest verbose output, thread prompt content and individual test results through the runner, and rewrite uploadToBraintrust with experiment naming (model-variant-timestamp), granular scores (pass, test_pass_rate, per-test), rich metadata, and tool-call tracing via experiment.traced(). Also document --force flag for cached mise tasks and add Braintrust env vars to AGENTS.md. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-24 13:26:48 +00:00
Pedro Rodrigues	baf94b04e3	load skills through skills CLI	2026-02-20 17:41:41 +00:00
Pedro Rodrigues	e03bc99ebb	two more scenarios and claude code cli is now a dependency	2026-02-20 15:02:59 +00:00
Pedro Rodrigues	e06a567846	workflow evals with one scenario	2026-02-19 17:06:17 +00:00

8 Commits
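
Commit 9b08864e94 describes resetting the database between eval runs (drop/recreate the public schema and clear migration history) and a passThreshold for partial passes. The following is a minimal TypeScript sketch of what those two steps might look like, not the repository's actual code. The function names `buildResetSql` and `meetsThreshold` are hypothetical, the migration-history table name `supabase_migrations.schema_migrations` is an assumption about where the Supabase CLI records applied migrations, and the threshold semantics (fraction of passed tests, defaulting to 1) are assumed.

```typescript
// Hypothetical helper: builds the SQL statements a reset step might run
// between eval scenarios. It only produces strings; executing them against
// the local Postgres instance is left to the caller.
function buildResetSql(schema: string = "public"): string[] {
  // Quote the identifier so unusual schema names cannot break the statement.
  const ident = `"${schema.replace(/"/g, '""')}"`;
  return [
    `DROP SCHEMA IF EXISTS ${ident} CASCADE;`,
    `CREATE SCHEMA ${ident};`,
    // Assumption: migration history lives in this table. The statement
    // fails if the table does not exist yet (no migration ever applied).
    `TRUNCATE TABLE supabase_migrations.schema_migrations;`,
  ];
}

// Hypothetical partial-pass check: a scenario passes when the fraction of
// passing tests reaches passThreshold (assumed default: all tests must pass).
function meetsThreshold(
  passed: number,
  total: number,
  passThreshold: number = 1,
): boolean {
  if (total === 0) return false; // no tests ran, so nothing can be claimed
  return passed / total >= passThreshold;
}
```

A real reset would probably also need to restore the grants on the recreated schema for the roles Supabase uses, since dropping a schema discards its privileges; that step is omitted here.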