supabase-postgres-best-prac…/packages/evals/docker-compose.yml
Pedro Rodrigues 9b08864e94 feat(evals): replace mock CLIs with real Supabase instance per eval run
Start a shared local Supabase stack once before all scenarios and reset
the database (drop/recreate public schema + clear migration history) between
each run. This lets agents apply migrations via `supabase db push` against a
real Postgres instance instead of mock shell scripts.
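
As a rough illustration of the reset step described above, the sketch below builds the SQL a reset might run between scenarios. `buildResetStatements` is a hypothetical helper, not the actual `resetDB` from supabase-setup.ts, and it assumes migration history lives in `supabase_migrations.schema_migrations` (where the Supabase CLI records applied migrations by default).

```typescript
// Hypothetical sketch: the real resetDB in supabase-setup.ts may differ.
function buildResetStatements(schema: string = "public"): string[] {
  // Guard against interpolating anything other than a plain identifier.
  if (!/^[a-z_][a-z0-9_]*$/.test(schema)) {
    throw new Error(`Unsafe schema name: ${schema}`);
  }
  return [
    // Drop everything the previous scenario created in the schema.
    `DROP SCHEMA IF EXISTS ${schema} CASCADE;`,
    // Recreate it empty so the next scenario starts from a clean slate.
    `CREATE SCHEMA ${schema};`,
    // Assumption: applied migrations are tracked in this table. Clearing it
    // lets `supabase db push` re-apply migrations on the next run. This
    // statement fails if the table does not exist yet.
    `TRUNCATE TABLE supabase_migrations.schema_migrations;`,
  ];
}
```

The statements would be executed in order against the local Postgres instance. A real reset would likely also need to restore grants on the recreated schema, since dropping a schema drops its privileges with it.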

- Add supabase-setup.ts: startSupabase / stopSupabase / resetDB / getKeys
- Update runner.ts to start/stop Supabase and inject keys into process.env
- Update agent.ts to point MCP config at the local Supabase HTTP endpoint
- Update preflight.ts to check supabase CLI availability and Docker socket
- Update scaffold.ts to seed workspace with supabase/config.toml
- Add passThreshold support (test.ts / results.ts / types.ts) for partial pass
- Delete mock shell scripts (mocks/docker, mocks/psql, mocks/supabase)
- Update Dockerfile/docker-compose to mount Docker socket for supabase CLI
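
The partial-pass idea behind `passThreshold` can be sketched as follows. This assumes the threshold is a fraction between 0 and 1 of tests that must pass; the actual semantics in test.ts / results.ts / types.ts are not shown here, and `TestOutcome` and `meetsThreshold` are hypothetical names.

```typescript
// Hypothetical shape of a single test result within a scenario.
interface TestOutcome {
  name: string;
  passed: boolean;
}

// Returns true when the share of passing tests reaches the threshold.
// Assumption: passThreshold is a fraction (0..1); the default of 1
// means every test must pass.
function meetsThreshold(outcomes: TestOutcome[], passThreshold: number = 1): boolean {
  if (outcomes.length === 0) {
    return false; // nothing ran, so nothing can be said to have passed
  }
  const passedCount = outcomes.filter((o) => o.passed).length;
  return passedCount / outcomes.length >= passThreshold;
}
```

With two of three tests passing, `meetsThreshold(outcomes, 0.6)` would accept the run while the default threshold would reject it.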

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-25 14:39:54 +00:00

24 lines
907 B
YAML

services:
  evals:
    build:
      context: ../..
      dockerfile: packages/evals/Dockerfile
      args:
        # Match the host's docker group GID so the node user can reach the socket.
        # Override with: DOCKER_GID=$(getent group docker | cut -d: -f3) docker compose up
        DOCKER_GID: "${DOCKER_GID:-999}"
    environment:
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
      - EVAL_MODEL=${EVAL_MODEL:-}
      - EVAL_SCENARIO=${EVAL_SCENARIO:-}
      - EVAL_BASELINE=${EVAL_BASELINE:-}
      - EVAL_SKILL=${EVAL_SKILL:-}
      - BRAINTRUST_UPLOAD=${BRAINTRUST_UPLOAD:-}
      - BRAINTRUST_API_KEY=${BRAINTRUST_API_KEY:-}
      - BRAINTRUST_PROJECT_ID=${BRAINTRUST_PROJECT_ID:-}
      - EVAL_RESULTS_DIR=/app/results
    volumes:
      - ./results:/app/results
      # Mount the host Docker socket so the supabase CLI can manage containers.
      - /var/run/docker.sock:/var/run/docker.sock
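
The `${VAR:-default}` interpolation used in the compose file above falls back to the default when the variable is unset or empty. A minimal sketch of that rule, with `withDefault` and `resolveComposeEnv` as hypothetical names that are not part of the repo:

```typescript
// Mirrors the `${VAR:-default}` rule: use the fallback when the
// variable is unset OR set to an empty string.
function withDefault(value: string | undefined, fallback: string): string {
  return value === undefined || value === "" ? fallback : value;
}

// Hypothetical helper resolving two of the variables from the compose file.
function resolveComposeEnv(env: Record<string, string | undefined>): {
  DOCKER_GID: string;
  EVAL_MODEL: string;
} {
  return {
    // "${DOCKER_GID:-999}": falls back to 999 when the host GID is not given.
    DOCKER_GID: withDefault(env.DOCKER_GID, "999"),
    // "${EVAL_MODEL:-}": falls back to an empty string.
    EVAL_MODEL: withDefault(env.EVAL_MODEL, ""),
  };
}
```

On a host where the docker group has GID 1001, passing `DOCKER_GID=1001` overrides the 999 fallback, which is what the `getent` one-liner in the compose comments supplies.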