Eval · Exploding Gradients

Ragas

Open-source evaluation toolkit for RAG and LLM applications.

Free · Open source · CLI · API

Open-source (Apache-2.0) Python framework for evaluating retrieval-augmented generation and LLM apps. Provides reference-free metrics — faithfulness, answer relevancy, context precision/recall — plus knowledge-graph-based synthetic test generation. Integrates with LangChain, LlamaIndex, and CI pipelines.
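As a rough illustration of what a ranking-aware metric such as context precision measures, the toy sketch below averages precision@k over the ranks that hold a relevant chunk. It is not Ragas's implementation: Ragas decides relevance with an LLM judge, whereas this sketch assumes the relevance labels are already known. The function name `context_precision_sketch` and the sample labels are hypothetical.

```python
def context_precision_sketch(relevance: list[bool]) -> float:
    """Average of precision@k, taken over the ranks that hold a relevant chunk.

    `relevance[i]` says whether the chunk retrieved at rank i+1 is relevant.
    In a real evaluation these labels would come from a judge model; here
    they are supplied by hand (an assumption of this sketch).
    """
    hits = 0
    total = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            total += hits / rank  # precision@rank at this relevant position
    return total / hits if hits else 0.0


# Relevant chunks ranked first score higher than the same chunks ranked last.
good_ranking = context_precision_sketch([True, True, False])
poor_ranking = context_precision_sketch([False, True, True])
```

The score rewards retrievers that place relevant chunks near the top, which is why two result sets with the same chunks can score differently.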

Model support

BYO key / model

Bring any provider/model for the LLM-as-judge metrics.
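The bring-your-own-model idea can be pictured as a metric that accepts any judge callable. The sketch below is a minimal illustration under that assumption, not the Ragas API: `Judge`, `faithfulness_sketch`, and `keyword_judge` are hypothetical names, and the keyword matcher is an offline stand-in for a call to whichever provider's model you choose.

```python
from typing import Callable

# A judge receives (claim, context) and returns True when the context supports the claim.
# In practice this would wrap a call to an LLM provider (assumption of this sketch).
Judge = Callable[[str, str], bool]


def faithfulness_sketch(claims: list[str], context: str, judge: Judge) -> float:
    """Fraction of answer claims that the judge marks as supported by the context."""
    if not claims:
        return 0.0
    supported = sum(1 for claim in claims if judge(claim, context))
    return supported / len(claims)


def keyword_judge(claim: str, context: str) -> bool:
    """Offline stand-in for an LLM judge: supported if the claim text appears verbatim."""
    return claim.lower() in context.lower()


context = "Ragas is released under the Apache-2.0 license."
claims = ["Apache-2.0 license", "written in Rust"]
score = faithfulness_sketch(claims, context, keyword_judge)
```

Because the metric depends only on the `Judge` signature, swapping providers means swapping one function while the scoring logic stays unchanged.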

Where it runs

Ragas

BYO key / model

DeepEval

Patronus AI

Braintrust

Promptfoo
