#solution-architecture
1 post filed under “solution-architecture”
The AI Evals Rebuild: How to Actually Test AI Systems
With the problems in AI evaluation exposed, here's the radical solution: throw out benchmarks and test under real production conditions.