mirror of
https://github.com/supabase/agent-skills.git
synced 2026-03-27 10:09:26 +08:00
feat(evals): enrich Braintrust upload with granular scores and tracing
Add per-test pass/fail parsing from vitest verbose output, thread prompt content and individual test results through the runner, and rewrite uploadToBraintrust with experiment naming (model-variant-timestamp), granular scores (pass, test_pass_rate, per-test), rich metadata, and tool-call tracing via experiment.traced().

Also document the --force flag for cached mise tasks and add Braintrust env vars to AGENTS.md.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
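As a rough illustration of the naming and scoring described in the commit message, the sketch below derives a model-variant-timestamp experiment name and a score map from per-test results. It is not the code from this commit: the `TestResult` shape, the `experimentName` and `buildScores` helpers, the timestamp formatting, and the `test: ` key prefix are all assumptions made for illustration, and no Braintrust API is called.

```typescript
// Hypothetical shape for one parsed test result; the repository's
// actual types may differ.
interface TestResult {
  name: string;
  passed: boolean;
}

// Build an experiment name in the model-variant-timestamp form.
// The timestamp format here is an assumption, not taken from the commit.
function experimentName(model: string, variant: string, at: Date): string {
  // Replace ":" and "." so the ISO timestamp contains only dashes.
  const stamp = at.toISOString().replace(/[:.]/g, "-");
  return `${model}-${variant}-${stamp}`;
}

// Derive granular scores: overall pass, pass rate, and one score per test.
function buildScores(tests: TestResult[]): Record<string, number> {
  const passedCount = tests.filter((t) => t.passed).length;
  const scores: Record<string, number> = {
    // An empty run is treated as a failure in this sketch.
    pass: tests.length > 0 && passedCount === tests.length ? 1 : 0,
    test_pass_rate: tests.length > 0 ? passedCount / tests.length : 0,
  };
  for (const t of tests) {
    // The "test: " key prefix is a hypothetical naming choice.
    scores[`test: ${t.name}`] = t.passed ? 1 : 0;
  }
  return scores;
}
```

Keeping the scores numeric (0 or 1 per test, a fraction for the rate) means a single map can carry both the coarse pass flag and the per-test detail.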
@@ -1,5 +1,5 @@
-import { mkdirSync, readdirSync, statSync, writeFileSync } from "node:fs";
-import { join, resolve } from "node:path";
+import { readdirSync, statSync } from "node:fs";
+import { join } from "node:path";
 import type { EvalRunResult } from "../types.js";
 
 /**