Jonathan Haas
March 24, 2026 · 4 min read

Building Kestrel: A Context-Aware AI Desktop Assistant in One Session

How I built a full LittleBird clone with screen context reading, meeting recording, arena mode, and MCP tool support — from scratch to packaged .app in a single coding session.

#ai #electron #macos #developer-experience #building-in-public

Filed under Agents and evals. The AI work I keep returning to: orchestration, feedback loops, measurable behavior, and where autonomy breaks down.

I wanted a personal AI assistant that knew what I was working on. Not one that required me to paste context into a chat window. One that could read my screen, understand what app I had open, and give contextually relevant answers.

LittleBird does this. It costs $20/month and sends everything to their servers. I wanted the same thing, local-first, with my own API keys.

So I built Kestrel.

What It Does

Kestrel is an Electron app that runs a native Swift helper binary. The Swift binary uses macOS Accessibility APIs to read the UI hierarchy of whatever app is in the foreground — window titles, text content, browser URLs. This context is injected into every AI conversation as a system prompt.

When you ask "what am I working on?", it already knows. It can see your terminal output, your browser tabs, your editor content. No copy-pasting required.
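The injection step is simple in principle: turn the captured accessibility snapshot into a system prompt prepended to every conversation. A minimal sketch in TypeScript — the shape of `ScreenContext` and the function name are illustrative, not Kestrel's actual code:

```typescript
// Illustrative shape of a context snapshot from the Swift helper.
interface ScreenContext {
  app: string;          // frontmost application name
  windowTitle: string;  // title of the focused window
  url?: string;         // present when the frontmost app is a browser
  visibleText: string;  // text extracted from the Accessibility tree
}

// Build the system prompt that gets prepended to every AI conversation.
function buildSystemPrompt(ctx: ScreenContext): string {
  const lines = [
    `The user is currently in ${ctx.app} ("${ctx.windowTitle}").`,
  ];
  if (ctx.url) lines.push(`The active browser tab is ${ctx.url}.`);
  lines.push("Visible on screen:", ctx.visibleText);
  return lines.join("\n");
}
```

The model never sees a "paste your context here" step — it just receives the snapshot as part of the prompt.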

Beyond context-aware chat, it does:

  • Meeting recording — CoreAudio process taps capture system audio, AVAudioEngine captures your mic, both resampled to 16kHz WAV and sent to Whisper for transcription. AI-generated summaries and action items afterward.
  • Auto-detect meetings — Polls CoreAudio's kAudioProcessPropertyIsRunningInput to detect when Zoom, Meet, or Teams grab the microphone. Starts recording automatically, stops with a 30-second grace period.
  • Arena mode — Send the same prompt to 2-4 models simultaneously and compare responses side by side.
  • MCP integration — Claude Desktop-compatible config format. Connect any MCP server, tools get injected into the AI's system prompt and executed via a tool-calling loop.
  • Journal — AI-generated daily entries from context snapshots saved throughout the day.
  • Quick access overlay — Cmd+Shift+Space slides in a panel from the right edge.
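The MCP tool-calling loop mentioned above follows a common pattern: call the model, execute any tools it requests, feed the results back, and repeat until the model answers in plain text. A minimal sketch with invented types (the real MCP client API and message shapes differ):

```typescript
// Hypothetical shapes for illustration — not the actual MCP wire types.
interface ToolCall { name: string; args: Record<string, unknown> }
interface ModelTurn { text: string; toolCalls: ToolCall[] }

type Model = (messages: string[]) => ModelTurn;
type ToolRunner = (call: ToolCall) => string;

// Keep calling the model until it stops requesting tools (with a step cap).
function runToolLoop(model: Model, runTool: ToolRunner, prompt: string, maxSteps = 5): string {
  const messages = [prompt];
  for (let step = 0; step < maxSteps; step++) {
    const turn = model(messages);
    if (turn.toolCalls.length === 0) return turn.text; // final answer
    for (const call of turn.toolCalls) {
      messages.push(`tool ${call.name} returned: ${runTool(call)}`);
    }
  }
  return "tool loop exceeded max steps";
}
```

The step cap matters: a model that keeps requesting tools would otherwise loop forever.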

Architecture

The interesting technical decisions:

Split agent architecture. The AI pipeline has two agents: an Executor that handles tool calls, context fetching, and API requests, and a Presenter that handles streaming to the UI, database persistence, and user-facing formatting. They have separate concerns and separate wide events for observability. The Executor never touches the renderer. The Presenter never calls an API.
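The boundary can be sketched in a few lines of TypeScript. These interfaces are illustrative stand-ins, not Kestrel's actual classes — the point is that each side's dependencies make the other side's job impossible:

```typescript
interface ExecutorResult { reply: string; toolCallCount: number }

// Executor: owns API requests and tool calls. It never touches the renderer.
class Executor {
  constructor(private callApi: (prompt: string) => string) {}
  run(prompt: string): ExecutorResult {
    return { reply: this.callApi(prompt), toolCallCount: 0 };
  }
}

// Presenter: owns persistence and user-facing formatting. It never calls an API.
class Presenter {
  public log: string[] = [];
  present(result: ExecutorResult): string {
    this.log.push(result.reply);         // stand-in for database persistence
    return `Assistant: ${result.reply}`; // user-facing formatting
  }
}
```

Because the Executor only receives a `callApi` function and the Presenter only receives results, neither can reach across the boundary, and each side can emit its own wide events independently.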

Native Swift CLI over stdin/stdout. ContextKit is a Swift Package Manager project that communicates with Electron via JSON-RPC 2.0 over NDJSON. This is the same protocol as MCP and LSP. The binary runs with dispatchMain() keeping the main RunLoop alive for AudioQueue callbacks, while the JSON-RPC reader runs on a background thread.
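NDJSON framing is the simplest part of the protocol: one JSON object per line, with the receiver buffering partial lines until a newline arrives. A sketch of the Electron side (field names follow the JSON-RPC 2.0 spec; the decode helper is illustrative):

```typescript
interface RpcRequest { jsonrpc: "2.0"; id: number; method: string; params?: unknown }

// One JSON object per line — the NDJSON framing rule.
function encode(req: RpcRequest): string {
  return JSON.stringify(req) + "\n";
}

// Split a stdout chunk into complete messages, carrying any partial
// trailing line forward as the new buffer.
function decodeChunk(buffer: string, chunk: string): { messages: RpcRequest[]; rest: string } {
  const lines = (buffer + chunk).split("\n");
  const rest = lines.pop() ?? ""; // incomplete tail, if any
  const messages = lines.filter(l => l.length > 0).map(l => JSON.parse(l) as RpcRequest);
  return { messages, rest };
}
```

Because stdout chunks can split a message anywhere, the `rest` buffer is what makes this robust — the same reason MCP and LSP clients all carry a read buffer.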

CoreAudio process taps. For meeting recording, the app creates a CATapDescription and an aggregate audio device via AudioHardwareCreateAggregateDevice. This captures all system audio without requiring Screen Recording permission — only the purple audio indicator dot appears. Microphone capture uses AudioQueue on a dedicated thread with its own CFRunLoop.

CFRunLoopPerformBlock for AX calls. Accessibility API calls hang on background threads in packaged macOS apps. The fix: dispatch them to the main CFRunLoop via CFRunLoopPerformBlock, which works with RunLoop.main.run() unlike DispatchQueue.main.async (which requires dispatchMain() or NSApplication).

Wide events. Every operation emits a structured event — chat messages, tool calls, meeting detections, context captures. These go into a SQLite table and an in-memory ring buffer with real-time analytics (event counts, error rates, avg durations). There's an Event Log viewer in Settings that shows live events.
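The ring-buffer half of this is a small amount of code. A minimal sketch — event shape, capacity, and rollup fields are illustrative, not the actual schema:

```typescript
interface WideEvent { name: string; durationMs: number; error?: boolean }

// In-memory ring buffer with simple real-time rollups.
class EventRing {
  private events: WideEvent[] = [];
  constructor(private capacity = 1000) {}

  emit(e: WideEvent): void {
    this.events.push(e);
    if (this.events.length > this.capacity) this.events.shift(); // drop oldest
  }

  // Count, error rate, and average duration for one event name.
  stats(name: string) {
    const hits = this.events.filter(e => e.name === name);
    const n = hits.length || 1; // avoid divide-by-zero for unseen names
    const errors = hits.filter(e => e.error).length;
    const avg = hits.reduce((s, e) => s + e.durationMs, 0) / n;
    return { count: hits.length, errorRate: errors / n, avgDurationMs: avg };
  }
}
```

SQLite gets every event for history; the ring buffer exists so the Event Log viewer can show live analytics without querying the database on every render.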

The Hardest Bugs

Preload .js vs .mjs. electron-vite compiles preload scripts to .mjs but the window configs referenced .js. This meant window.api was undefined in every renderer, and every IPC call silently failed. Chat didn't work. Context didn't work. Nothing worked. The fix was three characters across three files.
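The lesson in miniature: derive the preload path from what the bundler actually emits instead of hard-coding the extension. A hypothetical helper (not Kestrel's actual fix, which was editing the three window configs):

```typescript
import { join } from "path";

// electron-vite emits ESM preload scripts with an .mjs extension,
// so map the TypeScript source name to the compiled artifact.
function preloadPath(outDir: string, sourceName: string): string {
  return join(outDir, sourceName.replace(/\.ts$/, ".mjs"));
}
```

Pointing `webPreferences.preload` at a file that doesn't exist fails silently — Electron just gives you a renderer with no bridge, which is why the symptom was `window.api === undefined` rather than an error.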

AVAudioConverter.reset(). The audio converter produced output on the first call, then returned empty buffers forever after. The inputBlock returns .endOfStream, which leaves the converter in a terminal state. One converter.reset() call before each conversion fixed everything. Meeting recordings went from 7KB (empty) to 158KB (5 seconds of real audio).

AX calls from background threads. Works in dev mode, hangs in the packaged app. Three approaches tried: DispatchQueue.main.async (deadlocks — GCD main queue not processed by CFRunLoop), direct background calls (hang in packaged binary), CFRunLoopPerformBlock (works). This took four iterations to get right.

Stack

| Layer    | Choice                                    |
| -------- | ----------------------------------------- |
| Runtime  | Electron 34, React 19, TypeScript         |
| Build    | electron-vite 5, Vite 6                   |
| UI       | shadcn/ui, Tailwind CSS v4                |
| State    | MobX                                      |
| Database | SQLite (better-sqlite3 + Drizzle ORM)     |
| AI       | OpenRouter (all models), OpenAI Whisper   |
| Context  | Native Swift CLI via JSON-RPC             |
| Audio    | CoreAudio taps, AudioQueue, AVAudioConverter |
| Tools    | Model Context Protocol (MCP)              |

What I'd Do Differently

Start with the packaged app from day one. Half the bugs were dev-mode-vs-production differences that only surfaced late. The preload extension, the AX thread safety, the native module bundling — all of these would have been caught earlier.

Also: code signing. Every time you copy a new unsigned app to /Applications, macOS resets the Accessibility permission. During development I must have re-granted it fifty times. A proper Developer ID certificate would fix this permanently.

Try It

Kestrel is open source at github.com/haasonsaas/kestrel. It's a personal tool — no auth, no backend, no subscriptions. Bring your own OpenRouter API key.

git clone https://github.com/haasonsaas/kestrel
cd kestrel
npm install
npm run ContextKit:build
npm run dev

A kestrel is a falcon that hovers in place, scanning the ground below. That's what this app does — hovers over your work, watching what you're doing, ready to help when you ask.


More in Agents and evals

Previous on this shelf: The Evaluation Infrastructure We Need: Why AI Testing is Fundamentally Broken

Next on this shelf: AI Code Review Is Reasoning, Not Pattern Matching


This connects to

The Rise of Single-Serving Software

The product philosophy underneath small personal tools.

Scaling the Me Component: How I Built an AI That Thinks Like Me

Why useful assistants need local taste and memory.

OCode: Why I Built My Own Claude Code (and Why You Might Too)

Another example of building the tool around your own workflow.
