Jonathan Haas

Building Kestrel: A Context-Aware AI Desktop Assistant in One Session

March 24, 2026 · 4 min read

How I built a full LittleBird clone with screen context reading, meeting recording, arena mode, and MCP tool support — from scratch to packaged .app in a single coding session.

#ai #electron #macos #developer-experience #building-in-public

I wanted a personal AI assistant that knew what I was working on. Not one that required me to paste context into a chat window. One that could read my screen, understand what app I had open, and give contextually relevant answers.

LittleBird does this. It costs $20/month and sends everything to their servers. I wanted the same thing, local-first, with my own API keys.

So I built Kestrel.

What It Does

Kestrel is an Electron app that runs a native Swift helper binary. The Swift binary uses macOS Accessibility APIs to read the UI hierarchy of whatever app is in the foreground — window titles, text content, browser URLs. This context is injected into every AI conversation as a system prompt.

When you ask "what am I working on?", it already knows. It can see your terminal output, your browser tabs, your editor content. No copy-pasting required.
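
As a sketch of how that injection might look on the Electron side (the shapes and names here are my own illustration, not Kestrel's actual API):

```typescript
// Hypothetical sketch: fold a foreground-app context snapshot into the
// system prompt before each chat turn. Field names are illustrative.
interface ScreenContext {
  appName: string;      // frontmost app, e.g. "Terminal"
  windowTitle: string;  // AX window title
  textContent: string;  // visible text pulled from the AX hierarchy
  browserUrl?: string;  // set when the frontmost app is a browser
}

function buildSystemPrompt(base: string, ctx: ScreenContext): string {
  const lines = [
    base,
    "",
    "## Current screen context",
    `App: ${ctx.appName}`,
    `Window: ${ctx.windowTitle}`,
  ];
  if (ctx.browserUrl) lines.push(`URL: ${ctx.browserUrl}`);
  // Truncate so a giant editor buffer doesn't blow the context window.
  lines.push("", ctx.textContent.slice(0, 4000));
  return lines.join("\n");
}
```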

Beyond context-aware chat, it does:

  • Meeting recording — CoreAudio process taps capture system audio, AVAudioEngine captures your mic, both resampled to 16kHz WAV and sent to Whisper for transcription. AI-generated summaries and action items afterward.
  • Auto-detect meetings — Polls CoreAudio's kAudioProcessPropertyIsRunningInput to detect when Zoom, Meet, or Teams grab the microphone. Starts recording automatically, stops with a 30-second grace period.
  • Arena mode — Send the same prompt to 2-4 models simultaneously and compare responses side by side.
  • MCP integration — Claude Desktop-compatible config format. Connect any MCP server, tools get injected into the AI's system prompt and executed via a tool-calling loop.
  • Journal — AI-generated daily entries from context snapshots saved throughout the day.
  • Quick access overlay — Cmd+Shift+Space slides in a panel from the right edge.
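
The tool-calling loop behind the MCP integration can be sketched as a small driver: the model either answers or requests a tool, and tool results are fed back until it produces a final answer. The turn shapes and names below are illustrative, not Kestrel's actual code:

```typescript
// Minimal tool-calling loop sketch. A real implementation would carry
// structured chat messages; strings keep the shape of the loop visible.
type ModelTurn =
  | { kind: "answer"; text: string }
  | { kind: "tool_call"; tool: string; args: unknown };

type Model = (messages: string[]) => ModelTurn;
type ToolMap = Record<string, (args: unknown) => string>;

function runToolLoop(model: Model, tools: ToolMap, prompt: string, maxSteps = 8): string {
  const messages = [prompt];
  for (let i = 0; i < maxSteps; i++) {
    const turn = model(messages);
    if (turn.kind === "answer") return turn.text;
    const tool = tools[turn.tool];
    const result = tool ? tool(turn.args) : `error: unknown tool ${turn.tool}`;
    // Feed the tool result back as another message and loop again.
    messages.push(`tool:${turn.tool} -> ${result}`);
  }
  throw new Error("tool loop did not converge");
}
```

The `maxSteps` cap matters: a model that keeps requesting tools would otherwise loop forever.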

Architecture

The interesting technical decisions:

Split agent architecture. The AI pipeline has two agents: an Executor that handles tool calls, context fetching, and API requests, and a Presenter that handles streaming to the UI, database persistence, and user-facing formatting. They have separate concerns and separate wide events for observability. The Executor never touches the renderer. The Presenter never calls an API.
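
A minimal sketch of that split, with hypothetical interfaces (only the separation of concerns comes from the post, not these exact signatures):

```typescript
// Executor owns model/tool calls; Presenter owns UI streaming and persistence.
interface Executor {
  // Talks to the model API and runs tools; never touches the renderer.
  run(prompt: string, onToken: (t: string) => void): Promise<string>;
}

interface Presenter {
  // Streams to the UI and persists; never calls a model API itself.
  stream(token: string): void;
  persist(finalText: string): void;
}

// The only place the two meet: the Executor pushes tokens through a
// callback, and the Presenter decides what to do with them.
async function chatTurn(executor: Executor, presenter: Presenter, prompt: string) {
  const full = await executor.run(prompt, (t) => presenter.stream(t));
  presenter.persist(full);
  return full;
}
```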

Native Swift CLI over stdin/stdout. ContextKit is a Swift Package Manager project that communicates with Electron via JSON-RPC 2.0 over NDJSON. This is the same protocol as MCP and LSP. The binary runs with dispatchMain() keeping the main RunLoop alive for AudioQueue callbacks, while the JSON-RPC reader runs on a background thread.
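
The Electron side of NDJSON framing can be sketched like this. The JSON-RPC field names follow the spec; the chunk-splitting logic is my own illustration, since stdout chunks from a child process can split mid-line:

```typescript
// One JSON object per line ("newline-delimited JSON"), per the framing
// MCP and LSP-over-stdio both use in spirit.
type RpcMessage = {
  jsonrpc: "2.0";
  id?: number;
  method?: string;
  params?: unknown;
  result?: unknown;
};

function encodeFrame(msg: RpcMessage): string {
  return JSON.stringify(msg) + "\n";
}

// Stateful decoder: buffers partial lines across stdout chunks.
function makeDecoder(onMessage: (m: RpcMessage) => void) {
  let buffer = "";
  return (chunk: string) => {
    buffer += chunk;
    let idx: number;
    while ((idx = buffer.indexOf("\n")) >= 0) {
      const line = buffer.slice(0, idx);
      buffer = buffer.slice(idx + 1);
      if (line.trim()) onMessage(JSON.parse(line) as RpcMessage);
    }
  };
}
```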

CoreAudio process taps. For meeting recording, the app creates a CATapDescription and an aggregate audio device via AudioHardwareCreateAggregateDevice. This captures all system audio without requiring Screen Recording permission — only the purple audio indicator dot appears. Microphone capture uses AudioQueue on a dedicated thread with its own CFRunLoop.

CFRunLoopPerformBlock for AX calls. Accessibility API calls hang on background threads in packaged macOS apps. The fix: dispatch them to the main CFRunLoop via CFRunLoopPerformBlock, which works with RunLoop.main.run() unlike DispatchQueue.main.async (which requires dispatchMain() or NSApplication).

Wide events. Every operation emits a structured event — chat messages, tool calls, meeting detections, context captures. These go into a SQLite table and an in-memory ring buffer with real-time analytics (event counts, error rates, avg durations). There's an Event Log viewer in Settings that shows live events.
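
A minimal sketch of the ring-buffer half of this, with hypothetical event shapes (the post only says events carry a type, errors, and durations):

```typescript
interface WideEvent {
  type: string;       // e.g. "chat.message", "tool.call"
  durationMs: number;
  error?: boolean;
}

// Fixed-capacity in-memory buffer: oldest events fall off the end,
// and analytics are computed over whatever is currently retained.
class EventRing {
  private events: WideEvent[] = [];
  constructor(private capacity: number) {}

  push(e: WideEvent) {
    this.events.push(e);
    if (this.events.length > this.capacity) this.events.shift();
  }

  stats() {
    const n = this.events.length;
    const errors = this.events.filter((e) => e.error).length;
    const avgMs = n ? this.events.reduce((s, e) => s + e.durationMs, 0) / n : 0;
    return { count: n, errorRate: n ? errors / n : 0, avgMs };
  }
}
```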

The Hardest Bugs

Preload .js vs .mjs. electron-vite compiles preload scripts to .mjs but the window configs referenced .js. This meant window.api was undefined in every renderer, and every IPC call silently failed. Chat didn't work. Context didn't work. Nothing worked. The fix was three characters across three files.
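
The shape of the fix, sketched against a typical electron-vite layout (the exact paths here are illustrative, not Kestrel's):

```typescript
import { join } from "node:path";
import { BrowserWindow } from "electron";

// electron-vite emits the compiled preload as index.mjs, but the window
// config pointed at index.js, so the preload never loaded and window.api
// was undefined in every renderer.
new BrowserWindow({
  webPreferences: {
    preload: join(__dirname, "../preload/index.mjs"), // was ".../index.js"
  },
});
```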

AVAudioConverter.reset(). The audio converter produced output on the first call, then returned empty buffers forever after. The inputBlock returns .endOfStream, which leaves the converter in a terminal state. One converter.reset() call before each conversion fixed everything. Meeting recordings went from 7KB (empty) to 158KB (5 seconds of real audio).

AX calls from background threads. This works in dev mode but hangs in the packaged app. I tried three approaches: DispatchQueue.main.async (deadlocks — the GCD main queue isn't serviced when only a CFRunLoop is running), calling directly from the background thread (hangs in the packaged binary), and CFRunLoopPerformBlock (works). It took four iterations to get right.

Stack

| Layer | Choice |
| --- | --- |
| Runtime | Electron 34, React 19, TypeScript |
| Build | electron-vite 5, Vite 6 |
| UI | shadcn/ui, Tailwind CSS v4 |
| State | MobX |
| Database | SQLite (better-sqlite3 + Drizzle ORM) |
| AI | OpenRouter (all models), OpenAI Whisper |
| Context | Native Swift CLI via JSON-RPC |
| Audio | CoreAudio taps, AudioQueue, AVAudioConverter |
| Tools | Model Context Protocol (MCP) |

What I'd Do Differently

Start with the packaged app from day one. Half the bugs were dev-mode-vs-production differences that only surfaced late. The preload extension, the AX thread safety, the native module bundling — all of these would have been caught earlier.

Also: code signing. Every time you copy a new unsigned app to /Applications, macOS resets the Accessibility permission. During development I must have re-granted it fifty times. A proper Developer ID certificate would fix this permanently.

Try It

Kestrel is open source at github.com/haasonsaas/kestrel. It's a personal tool — no auth, no backend, no subscriptions. Bring your own OpenRouter API key.

git clone https://github.com/haasonsaas/kestrel
cd kestrel
npm install
npm run ContextKit:build
npm run dev

A kestrel is a falcon that hovers in place, scanning the ground below. That's what this app does — hovers over your work, watching what you're doing, ready to help when you ask.
