Promptfoo

Open-source LLM eval CLI. Rubric scoring + golden sets.

Open source · CLI · macOS · Windows · Linux · Vetted

YAML-driven eval harness. Pair a prompt with a golden set, define rubrics, and run the suite across multiple models in CI. Well suited to catching prompt regressions before they reach production.
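
As a rough illustration of the workflow described above, a config might look like the sketch below. The top-level keys (prompts, providers, tests) and the assertion types follow Promptfoo's documented config format as best understood here; the prompt text, the model IDs, and the single golden-set row are assumptions made up for this example, so check them against the current docs before use.

```yaml
# promptfooconfig.yaml
# Illustrative sketch only: prompt text, model IDs, and test rows are assumptions.
description: Support-reply regression suite

# The prompt under test; {{question}} is filled from each test's vars.
prompts:
  - "Answer the customer question politely and concisely: {{question}}"

# Each provider is run against every test (bring your own API keys).
providers:
  - openai:gpt-4o-mini
  - anthropic:messages:claude-3-5-sonnet-20241022

# Golden set: inputs paired with the checks the output must pass.
tests:
  - vars:
      question: How do I reset my password?
    assert:
      # Deterministic check: case-insensitive substring match.
      - type: icontains
        value: password
      # Rubric check: graded by a model against this plain-language criterion.
      - type: llm-rubric
        value: Gives clear reset steps and does not ask for the current password
```

With a file like this in the working directory, the suite is typically run with `npx promptfoo@latest eval`, which is the command a CI job would call.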

Model support

BYO key / model

  • Claude
  • GPT
  • Gemini
  • Local

Where it runs

  • CLI
  • macOS
  • Windows
  • Linux

Tags

  • #eval
  • #ci
  • #rubric
  • #open-source

Open Promptfoo · GitHub · Docs

Related in Eval

  • Braintrust
    Eval · Freemium

    Hosted eval + tracing platform for LLM apps.

    Production-grade eval orchestration with a dashboard, dataset versioning, and OpenTelemetry tracing. Useful once eval volume outgrows a CI YAML file.

    • eval
    • tracing
    • datasets
    • production