Jonathan Haas
I build agentic systems and evaluation infrastructure for LLM products. Previously co-founded ThreatKey and worked at Vanta, Carta, and Snap.
Usually in San Francisco. Always down for a walk. Find me by email, GitHub, or X.
Now: trying to make AI systems feel less like demos and more like accountable coworkers.
Currently thinking about
- Agents and evals (14 posts)
How to make autonomous work legible enough to trust.
- Product judgment (13 posts)
Why the tiny product decisions are usually the real product.
- Founder lessons (13 posts)
A map of founder traps that mostly look reasonable while you are inside them.
- Security and systems (12 posts)
Risk is usually a systems-design problem wearing a policy costume.
- Personal systems (11 posts)
Software gets more interesting when it is allowed to fit one person exactly.
Best entry points
- If you want the agent systems thread (1 min)
Orchestrating AI Coding Agents: What I Learned Running Three Autonomous Sessions at Once
- If you want the product taste thread (1 min)
Somebody Gave a Shit: The Quiet Power of Product Detail
- If you want the personal tools thread (1 min)
The Rise of Single-Serving Software
- If you want the founder judgment thread (1 min)
The Three Types of Startup Advice (And Why They're All Wrong)
Recent writing
- Orchestrating AI Coding Agents: What I Learned Running Three Autonomous Sessions at Once
I ran three concurrent AI coding agents across four repos. They shipped 20+ PRs, wrote 100+ posts, and handled real review and CI work.
- Building Kestrel: A Context-Aware AI Desktop Assistant in One Session
How I built a full LittleBird clone with screen context reading, meeting recording, arena mode, and MCP tool support, going from scratch to a packaged .app in a single coding session.
- DiffScope: What Happens When You Give a Code Review Agent Real Context
Most AI review tools see a diff. DiffScope sees the diff, the callers, the type hierarchy, the team history, and knows when to shut up. Here is how.
- The 10-Minute AI POC That Becomes a 10-Month Nightmare
Five lines of Python and an API key produce a working demo. The gap between that demo and a production system contains failure modes the prototype...
- Why Your AI Strategy is Actually a Spreadsheet Strategy
Most enterprise AI transformations are solving problems that spreadsheets handle at 1/50th the cost. The misalignment is driven by career incentives,...
- The AI Agent Gold Rush: Why Everyone's Building Picks and Shovels
Most AI agent infrastructure is premature. The agents themselves barely work. The industry is selling Formula 1 equipment to people still learning to...