5 posts filed under “deployment”
AI evaluations work great in single-turn labs but crumble in the multi-turn conversations that define real AI usage.
Most AI evals companies built PLG products that can't see how companies actually deploy AI, leading to evaluations that are dangerously wrong.
"We can't deploy this to production. It touches payment processing." The security team was right to be cautious.
Security at AI Speed: Rethinking Review Processes for Velocity. "We can't deploy daily. What about our security review process?" The CISO's concern was valid.
_This is part 1 of a series on building production-ready infrastructure. Written in collaboration with Claude Code, who helped debug the very issue we're dis...