About

I’m building EvalOps, a system for making AI work legible enough to trust. The part I care about is not the demo. It is the operating loop after the demo: catching regressions, learning from feedback, reviewing agent work, and shipping behavior changes with evidence.

My background across security, infrastructure, product, and startups shapes how I think about trust, failure modes, and production reality. Before this I co-founded ThreatKey and worked at Vanta, Carta, and Snap.

The current thesis: agents will matter most when they can be measured, corrected, delegated to, and interrupted without drama. That means evals, memory, provenance, review, and taste are not supporting details. They are the product.

Jonathan Haas

Founder & CEO, EvalOps

2025 - Present

Building the operating layer for accountable AI work: evaluation infrastructure, agent reliability, regression detection, and evidence-based behavior changes.

Senior Staff Security Engineer, Writer

2025 - Present

Building Cerebro, an open-source operations data platform for cloud, SaaS, and security posture management. It combines a policy engine, multi-cloud scanning, AI-powered investigation, and compliance automation.

github.com/writer/cerebro

Senior Product Manager, Vanta

2024 - 2025

Joined via ThreatKey. Led security integrations across cloud, code, and infrastructure platforms. Worked on partnerships with Wiz, CrowdStrike, GitHub, GitLab, and others.

Co-founder & CEO, ThreatKey

2020 - 2024

Built a SaaS security posture management platform. Identified misconfigurations and vulnerabilities across cloud infrastructure and business tools (AWS, GCP, Google Workspace, Microsoft 365). Self-service onboarding let customers connect integrations and surface security findings in under a minute.

Lead, Security Operations, Carta

2020 - 2021

Built security operations from the ground up. Implemented incident response protocols. Left to go full-time on ThreatKey.

Prior to that, I held security engineering roles at Lockheed Martin, DoorDash, and Snapchat (2016–2020), and internships from 2013 to 2016. I've been writing code since I should have been playing N64.
