Build, Evaluate, and Improve AI That Works in Production
The most field-tested evals course available. We’ve refined it over a year of cohorts with 4,500+ engineers and PMs from teams like OpenAI, Google, Meta, and Amazon.
See next cohort dates & enroll: AI Evals For Engineers & PMs · $3,150
Is this you?
- You’re spot-checking AI outputs by hand, and you know it won’t scale.
- You’re scared to ship a prompt or model change because you don’t trust your metrics.
- Leadership keeps asking “is this actually better?” and you don’t have an answer you trust.
- You’ve hired smart people and bought tools, and you’re still guessing whether your AI is getting better or worse.
If so, you’re in the right place. We teach a system for fixing these problems.
The course: AI Evals For Engineers & PMs
Work the full loop hands-on: build a real AI agent, find where it breaks, and improve it with evals you can trust. Over the cohort you will:
- Replace random spot-checking with a repeatable way to read traces and find failures.
- Design and validate LLM-as-judge and code-based evaluators that match expert judgment.
- Wire evals into CI/CD so prompt, model, and tool changes get checked before they ship.
- Run experiments that raise accuracy and cut cost, and prove which change moved the metric.
You keep lifetime access to all recordings and materials, a 150+ page course reader, 10+ hours of office hours, a private Discord community, and a seat in every future cohort.
Backed by the Maven Guarantee: a full refund through the first two weeks of the cohort, no questions asked.
Already taken the course and shipping AI in production? Join our AI Engineering Community.