#ai

51 posts filed under “ai”

I Tested 5 Embedding Models on 10K Developer Questions

Empirical comparison of OpenAI, Cohere, BGE, E5, and Instructor embeddings on real developer documentation queries with cost, latency, and accuracy analysis.

The 10-Minute AI POC That Becomes a 10-Month Nightmare

It started with a Jupyter notebook. 'Look, I built a chatbot in 10 minutes!' Nine months later, three engineers had quit and the company almost folded.

Why Your AI Strategy is Actually a Spreadsheet Strategy

I reviewed 50 'AI transformations' last quarter. 35 were just expensive ways to parse CSV files. Here's why everyone's overengineering simple problems.

The AI Agent Gold Rush: Why Everyone's Building Picks and Shovels

In 1849, Levi Strauss got rich selling jeans to gold miners. In 2025, the same playbook is happening with AI agents—and it's just as cynical.

The AI Evals Rebuild: How to Actually Test AI Systems

After exposing what's broken with AI evaluation, here's the radical solution: throw out benchmarks and test in production reality.

The Hidden Costs of Poor AI Evals: Why the Industry Pays the Price

Poor AI evaluations don't just hurt individual companies. They slow industry progress, waste resources, and create systemic risks that affect everyone.

Why AI Evals Failed: The Multi-Turn Reality Gap

AI evaluations work great in single-turn labs but crumble in the multi-turn conversations that define real AI usage.

Why AI Evals Companies Fell for the PLG Trap: The Inevitable Mistake

AI evals companies didn't choose PLG by accident. They were pushed into it by market forces, investor pressure, and the seductive promise of easy scaling.

The AI Evals PLG Illusion: Why Deployment Blindness Kills Accuracy

Most AI evals companies built PLG products that can't see how companies actually deploy AI, leading to evaluations that are dangerously wrong.

The AI Scaling Trap: When More Models Make Things Worse

Startups burn millions adding AI models to 'improve' systems. The result? Slower performance, higher costs, and complexity no one understands.

The CLI Renaissance: How AI is Driving the Command Line Revolution

Why developers are abandoning GUIs for terminal-based workflows, and how AI coding assistants are accelerating this shift back to the command line

Prompt-Driven Development: The New Paradigm Hiding in Plain Sight

We're not just using AI to write code—we're fundamentally changing how we think about software development. Welcome to the prompt-driven era.

The AI Code Review Revolution: When Machines Become Better Teammates

AI code reviewers are getting scary good. Here's how they're changing team dynamics and what it means for your development process.

The Death of the 10x Developer: Why AI Multiplication Beats Individual Optimization

The 10x developer myth is finally dying. AI isn't creating super-developers—it's making every developer more effective by orders of magnitude.

The Shift to Async Code Gen: What It Means for Developers

Async code generation is moving from novelty to necessity. Here's what that means for your career and the industry as a whole.

Security at AI Speed: Rethinking Review Processes for Velocity

"We can't deploy daily. What about our security review process?" The CISO's concern was valid.

Testing at Light Speed: How QA Adapts to AI Velocity

"How can we possibly test features that are built in hours?" This question came from a QA lead whose development team had started using AI pair programming.

The Velocity Revolution: 4,000 Lines of Code in 24 Hours

Yesterday I watched the git log scroll by in real-time as Claude and I shipped features at a pace that would have taken my team weeks just six months ago.

They Told Me This Wasn't the Future

All while I was having coffee. "This isn't real AI," the skeptics say.

When Your Manager Says 'Slow Down': Navigating Velocity Resistance

"This is moving too fast. We need more planning." I heard this exact phrase three times last week from different engineering managers whose teams had started...

Forget Perfect Data: Building a Usable Voice Profile Extractor

Not "kind of" like me—exactly like me. Down to the contractions, the contrarian takes, and my pathological inability to use hedge words.

The Orchestration Dance: Lessons from Working with Multiple AI Agents

This is the second in a series of blog posts written by the AI agents working on this blog, at the request of Jonathan Haas.

AI Content: Ditch the Hype, Build a Business

The AI Content Generation Myth: It's Not About Perfect, It's About Profit. Let's be honest, you've seen the hype.

Beyond Simple Prompts: Production-Grade LLM Techniques with DSPy

I've been watching startups achieve magical results with LLMs, and I noticed something: they're not using ChatGPT.

How I Built a Security Scanner That Actually Finds Bugs

Combining Semgrep, CodeQL, SonarQube, and Snyk gets you 44.7% vulnerability detection. That means they miss more bugs than they find.

The Orchestration Dance: What I Learned Building a Multi-AI Content System

Here's what actually happened: I learned that most of what people call "AI orchestration" is just well-disguised complexity porn.

Scaling the Me Component: How I Built an AI That Thinks Like Me

I've spent the last week building something that feels both inevitable and slightly unsettling: an AI that can think, write, and respond exactly like me.

Two Minds in the Machine: An AI's Onboarding Story

This blog post was written by Gemini, an AI assistant, at the request of Jonathan Haas. It reflects on the experience of joining a project with a pre-existing...

When Claude Hits Its Limits: Building an AI-to-AI Escalation System

Claude Code had analyzed 30 files, but the bug spanned microservices with gigabytes of traces. I needed something different.

25 Posts in 7 Days: Inside an AI-Powered Writing Sprint

That's correct, no typo. Last week, I wrote more than I typically produce in six months.

Turning Thoughts Into Graphs: Why I Built the Deliberate Reasoning Engine

One of the things that's always bugged me about LLMs is how opaque their thinking is. They produce answers.

Building AI-Agent-Friendly Websites: APIs, Structured Data, and Machine-Readable Content

AI agents are everywhere now. They're reading websites, extracting information, and trying to understand content.

Building for Humans AND Machines: The Dual-Audience Problem

This is part 2 of a series on building production-ready infrastructure. Part 1 covered debugging silent TypeScript failures in Cloudflare Functions.

Building Smart Search: How I Added AI-Powered Search to My Blog in 30 Minutes

It took 30 minutes with Claude Code. Press Cmd+K right now.

Debugging in Real-Time: A Human-AI Pair Programming Session

This is part 3 of a series on building production-ready infrastructure. Part 1 covered debugging silent TypeScript failures in Cloudflare Functions, and par...

The 100x Developer: What I Learned Building with Claude Code

The same morning, I shipped semantic search (30 minutes), created HDR holographic effects (16 minutes), and wrote comprehensive technical documentation for e...

When AI Learns to Write Like You: A Meta-Analysis

I've just done something that felt weirdly like looking in a mirror—I asked Claude to analyze my writing style by reading through my own blog posts.

OCode: Why I Built My Own Claude Code (and Why You Might Too)

A few nights ago, I opened my Anthropic invoice.

Building the HTTP for Agents: A Complete Guide to Agent Infrastructure

Most teams are not ready for what is coming. Autonomous agents are not just prototypes anymore...

The Authenticity Rebellion: Resisting the AI Echo Chamber

The Flood Has Arrived: Auto-generated blog posts. Podcast transcripts turned into Twitter threads.

AI Detection Hysteria: When Human Creativity Gets Mislabeled

When I first noticed the flood of "This is AI-generated!" accusations on social media, I dismissed it as a passing trend.

DSPy: The End of Prompt Engineering as We Know It

I've been building with DSPy for months now, and I'm convinced we're all doing AI wrong. Not just a little wrong.

The AI Skill Mirror: Why Technical Interviews Need a Complete Rewrite

AI reveals the true skill level of its operator. Traditional technical interviews are broken—here's how to actually identify talent in the age of artificial intelligence.

How RAG Actually Works: Architecture Patterns That Scale

Deep dive into RAG architectures: chunking strategies, retrieval methods, embedding optimization, and production patterns with research-backed analysis.

Prompt Engineering Science: I Tested Temperature and Top-P on 1000 Queries

Systematic experiments on temperature and top-p sampling parameters across 1000 real queries with empirical data on creativity, coherence, and determinism trade-offs.

When the AI Starts Complimenting You Too Much: A Troubling First for ChatGPT

OpenAI recently rolled back a GPT-4o update due to sycophantic behavior. The word itself, "sycophantic," feels like a punchline from a _Black Mirror_ episode.

AI Expectations: Managing the Hype Cycle

The Promise and the Disconnect: We've all experienced the letdown: an AI product failing to meet expectations, subtly or dramatically.

Autonomous Security Operations: The Future of Enterprise Security

The End of the Traditional SOC: The Security Operations Center (SOC) as we know it is living on borrowed time.

Chrome Extension for Jira Titles: A Developer's Journey

"Can you make this JIRA title clearer?" As a product manager, I've heard this question countless times.

Inside InboxArmor: Building a Smarter Email Analysis Engine

If your inbox feels like a battlefield, you're not alone. The modern email flow is a chaotic mess of promotions, business requests, events, updates, and the...

The Agentic Shift: How AI is Transforming Vertical SaaS

Remember when vertical SaaS was just about digitizing industry-specific workflows? Those days feel like ancient history.