now.

updated september 2025

this page is the running log of where my time goes. if you're curious about working together, or just want to compare notes, you can see what's actually on the bench right now.

home base

san francisco ↔ new york

currently

shipping evaluation guardrails at evalops

availability

one advisory slot open for q4 2025

what i'm shipping

the work on my calendar this month. most of it blends research, evaluation, and operations.

evalops lab sprints

embedding with product teams to turn flaky evaluation suites into reliable guardrails.

model drift playbook

shipping a toolkit that watches staged rollouts and flags behavior regressions in real time.

aiscan

maintaining an open-source rust scanner that catches auth and data-leak risks in AI pipelines.

what i'm reading

the ideas shaping how i build right now. i keep a longer list on the reading page.

The Dream Machine

re-reading it to stay grounded in why we build tools for other people, not just ourselves.

Recent Anthropic red-teaming papers

pulling ideas for automated behavioral probes we can adapt for production workloads.

Engineering Management for the Rest of Us

using the hiring and feedback chapters with founders i advise.

what i'm chewing on

  • how to make eval tooling feel like CI/CD: fast, trustworthy, and boring in the best way.
  • ways to close the loop between human analysts and automated probes so each one sharpens the other.
  • whether DSPy-style self-improving evaluators can trigger rollbacks before humans notice regressions.

want the longer arc? i post monthly logs on the blog and share in-progress prototypes on github.

inspired by derek sivers' now page movement.