Research
Notes on AI evals, LLMs, and building AI that works in production.
| Date | Title |
|---|---|
| May 21, 2026 | Exploring Agent-Assisted Qualitative Analysis |
| Mar 2, 2026 | Evals Skills for Coding Agents |
| Jan 15, 2026 | LLM Evals: Everything You Need to Know |
| Dec 1, 2025 | On the Consumption of AI-Generated Content at Scale |
| Oct 1, 2025 | Selecting The Right AI Evals Tool |
| Sep 5, 2025 | In Defense of AI Evals, for Everyone |
| Jun 23, 2025 | Inspect AI, An OSS Python Library For LLM Evals |
| Jun 16, 2025 | Writing in the Age of LLMs |
| Oct 29, 2024 | Using LLM-as-a-Judge For Evaluation: A Complete Guide |
| Jul 29, 2024 | An Open Course on LLMs, Led by Practitioners |
| Jul 1, 2024 | Data Flywheels for LLM Applications |
| Jun 24, 2024 | Short Musings on AI Engineering and “Failed AI Projects” |
| Jun 1, 2024 | What We’ve Learned From A Year of Building with LLMs |
| Apr 8, 2024 | Comparing LLMs on “Real-World” Retrieval |
| Mar 29, 2024 | Your AI Product Needs Evals |