Eval · Exploding Gradients
Open-source evaluation toolkit for RAG and LLM applications.
Open-source (Apache-2.0) Python framework for evaluating retrieval-augmented generation and LLM apps. Provides metrics such as faithfulness, answer relevancy, and context precision/recall, several of which are reference-free, plus knowledge-graph-based synthetic test generation. Integrates with LangChain, LlamaIndex, and CI pipelines.
Model support
Use any LLM provider or model as the judge for the LLM-as-judge metrics.
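To illustrate what a pluggable judge means for an LLM-as-judge metric, here is a minimal sketch of a faithfulness-style score: the fraction of claims in an answer that the retrieved context supports. This is not the toolkit's own API. The names `Judge`, `faithfulness_score`, and `keyword_judge` are hypothetical, and the keyword matcher is a toy stand-in for a real model call.

```python
from typing import Callable, Sequence

# Hypothetical judge signature: (claim, context) -> True if the context supports the claim.
# In practice this would wrap a call to whichever LLM provider you choose.
Judge = Callable[[str, str], bool]


def faithfulness_score(claims: Sequence[str], context: str, judge: Judge) -> float:
    """Return the share of claims that the judge marks as supported by the context."""
    if not claims:
        return 0.0  # assumption: an answer with no claims scores 0
    supported = sum(1 for claim in claims if judge(claim, context))
    return supported / len(claims)


def keyword_judge(claim: str, context: str) -> bool:
    """Toy stand-in for an LLM: supported only if every claim word appears in the context."""
    context_words = set(context.lower().split())
    return all(word in context_words for word in claim.lower().split())


context = "paris is the capital of france"
claims = ["paris is the capital", "paris is in germany"]
score = faithfulness_score(claims, context, keyword_judge)
print(score)
```

Because the judge is just a callable, swapping providers or models only means passing a different function; the scoring logic stays unchanged.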
Where it runs
Tags