Start free. Scale when you need to. No usage surprises.
One eval = one scenario run against your agent. If you have 20 scenarios and run your eval suite once, that's 20 evals.
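As a rough illustration of that arithmetic, here is a small sketch. The function and constant names are made up for this example and are not part of any product API; the 50 evals/month figure is the free-plan quota quoted on this page.

```python
# Hypothetical helpers for estimating eval usage; names are illustrative only.

FREE_TIER_EVALS_PER_MONTH = 50  # free-plan quota quoted on this page


def evals_used(scenarios: int, suite_runs: int) -> int:
    """One eval = one scenario run, so usage is scenarios times suite runs."""
    return scenarios * suite_runs


def full_runs_within(quota: int, scenarios: int) -> int:
    """How many complete suite runs fit inside a monthly eval quota."""
    return quota // scenarios


if __name__ == "__main__":
    print(evals_used(20, 1))                                # 20
    print(full_runs_within(FREE_TIER_EVALS_PER_MONTH, 20))  # 2
```

With 20 scenarios, the free quota covers two complete suite runs per month, with 10 evals left over.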
Any framework — LangChain, LlamaIndex, CrewAI, AutoGen, or fully custom agents in Python or Node. If your agent has an HTTP endpoint or can be called as a subprocess, it works.
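To make "can be called as a subprocess" concrete, here is a minimal sketch of what invoking an agent that way can look like. It assumes the agent reads a scenario from stdin and writes its reply to stdout; that contract, the `call_agent` helper, and the stand-in agent are assumptions for illustration, not the runner's documented protocol.

```python
import subprocess
import sys


def call_agent(command: list[str], scenario_input: str, timeout_s: float = 30.0) -> str:
    """Run the agent as a child process: scenario in on stdin, reply out on stdout.

    Assumed contract for illustration only; the real runner may differ.
    """
    result = subprocess.run(
        command,
        input=scenario_input,
        capture_output=True,
        text=True,
        timeout=timeout_s,
        check=True,  # raise if the agent exits with a non-zero status
    )
    return result.stdout.strip()


# Stand-in "agent": a one-line Python process that upper-cases its input.
STAND_IN_AGENT = [
    sys.executable,
    "-c",
    "import sys; print(sys.stdin.read().strip().upper())",
]

if __name__ == "__main__":
    print(call_agent(STAND_IN_AGENT, "hello"))
```

Any executable can be substituted for the stand-in command, such as a Node script or a compiled binary, as long as it follows the same stdin/stdout convention.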
On the Pro plan, eval outputs are retained for 90 days to power regression diffs and score history. You can delete your data at any time. Enterprise customers can choose private cloud deployment.
Yes. The CLI runner can operate fully locally: evals run on your machine, and scores stay on your machine. The cloud dashboard is optional.
14 days free, no credit card required. If you haven't set up CI integration by day 14, we'll extend the trial, because we want you to see the value before you pay.
50 evals/month, no credit card. Upgrade when you're shipping to production.
Get started free