Production-grade eval orchestration with a dashboard, dataset versioning, and OpenTelemetry tracing. Useful once eval volume outgrows a CI YAML file.
Braintrust
Hosted eval + tracing platform for LLM apps.
FREEMIUM · Web · API
Model support
BYO key / model
- Claude
- GPT
- Gemini
- Custom
Where it runs
- Web
- API
Tags
- #eval
- #tracing
- #datasets
- #production
Related in Eval
Promptfoo
Eval · OPEN SOURCE · Vetted
Open-source LLM eval CLI. Rubric scoring + golden sets.
YAML-driven eval harness. Pair a prompt with a golden set, define rubrics, and run across multiple models in CI. Strong for catching prompt regressions before they hit production.
- eval
- ci
- rubric
- open-source
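
The workflow described above (a prompt paired with a golden set, rubric checks, a run across several models) can be sketched in plain Python. This is a rough illustration of the idea only, not Promptfoo's config format or API: the stub models, `GOLDEN_SET`, `RUBRICS`, `run_eval`, and `regressions` are all hypothetical names made up for this example.

```python
from typing import Callable, Dict, List

# Hypothetical stand-ins for real model clients: each maps a prompt to a reply.
Model = Callable[[str], str]


def echo_model(prompt: str) -> str:
    # Stub that repeats the prompt in upper case.
    return prompt.upper()


def terse_model(prompt: str) -> str:
    # Stub that ignores the prompt entirely.
    return "ok"


# Prompt under test, with a placeholder filled from each golden-set case.
PROMPT_TEMPLATE = "Answer the customer question about: {input}"

# Golden set (assumed shape): an input plus a term a good answer must mention.
GOLDEN_SET: List[Dict[str, str]] = [
    {"input": "refund policy", "expect": "refund"},
    {"input": "shipping time", "expect": "shipping"},
]


def contains_expected(output: str, case: Dict[str, str]) -> bool:
    # Rubric 1: the answer mentions the expected term (case-insensitive).
    return case["expect"].lower() in output.lower()


def under_length(output: str, case: Dict[str, str]) -> bool:
    # Rubric 2: the answer stays within 200 characters.
    return len(output) <= 200


RUBRICS = [contains_expected, under_length]


def run_eval(models: Dict[str, Model]) -> Dict[str, float]:
    # Return the fraction of golden-set cases passing every rubric, per model.
    scores: Dict[str, float] = {}
    for name, model in models.items():
        passed = 0
        for case in GOLDEN_SET:
            output = model(PROMPT_TEMPLATE.format(input=case["input"]))
            if all(rubric(output, case) for rubric in RUBRICS):
                passed += 1
        scores[name] = passed / len(GOLDEN_SET)
    return scores


def regressions(scores: Dict[str, float], threshold: float = 1.0) -> List[str]:
    # Names of models whose pass rate falls below the threshold.
    return sorted(name for name, score in scores.items() if score < threshold)


if __name__ == "__main__":
    results = run_eval({"echo": echo_model, "terse": terse_model})
    print(results)
    print(regressions(results))
```

In CI, a non-empty `regressions` list would be the signal to fail the build.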