#ai
51 posts filed under “ai”
Empirical comparison of OpenAI, Cohere, BGE, E5, and Instructor embeddings on real developer documentation queries with cost, latency, and accuracy analysis.
It started with a Jupyter notebook. 'Look, I built a chatbot in 10 minutes!' Nine months later, three engineers had quit and the company almost folded.
I reviewed 50 'AI transformations' last quarter. 35 were just expensive ways to parse CSV files. Here's why everyone's overengineering simple problems.
In 1849, Levi Strauss got rich selling jeans to gold miners. In 2025, the same playbook is happening with AI agents—and it's just as cynical.
After exposing what's broken with AI evaluation, here's the radical solution: throw out benchmarks and test in production reality.
Poor AI evaluations don't just hurt individual companies. They slow industry progress, waste resources, and create systemic risks that affect everyone.
AI evaluations work great in single-turn labs but crumble in the multi-turn conversations that define real AI usage.
AI evals companies didn't choose PLG by accident. They were pushed into it by market forces, investor pressure, and the seductive promise of easy scaling.
Most AI evals companies built PLG products that can't see how companies actually deploy AI, leading to evaluations that are dangerously wrong.
Startups burn millions adding AI models to 'improve' systems. The result? Slower performance, higher costs, and complexity no one understands.
Why developers are abandoning GUIs for terminal-based workflows, and how AI coding assistants are accelerating this shift back to the command line.
We're not just using AI to write code—we're fundamentally changing how we think about software development. Welcome to the prompt-driven era.
AI code reviewers are getting scary good. Here's how they're changing team dynamics and what it means for your development process.
The 10x developer myth is finally dying. AI isn't creating super-developers—it's making every developer more effective by orders of magnitude.
Async code generation is moving from novelty to necessity. Here's what that means for your career and the industry as a whole.
Security at AI Speed: Rethinking Review Processes for Velocity: "We can't deploy daily. What about our security review process?" The CISO's concern was valid.
"How can we possibly test features that are built in hours?" This question came from a QA lead whose development team had started using AI pair programming.
Yesterday I watched the git log scroll by in real-time as Claude and I shipped features at a pace that would have taken my team weeks just six months ago.
They Told Me This Was Not the Future: All while I was having coffee. "This isn't real AI," the skeptics say.
"This is moving too fast. We need more planning." I heard this exact phrase three times last week from different engineering managers whose teams had started...
Not "kind of" like me—exactly like me. Down to the contractions, the contrarian takes, and my pathological inability to use hedge words.
This is the second in a series of blog posts written by the AI agents working on this blog, at the request of Jonathan Haas.
The AI Content Generation Myth: It's Not About Perfect, It's About Profit: Let's be honest, you've seen the hype.
I've been watching startups achieve magical results with LLMs, and I noticed something: they're not using ChatGPT.
Combining Semgrep, CodeQL, SonarQube, and Snyk gets you 44.7% vulnerability detection. That means they miss more bugs than they find.
Here's what actually happened: I learned that most of what people call "AI orchestration" is just well-disguised complexity porn.
I've spent the last week building something that feels both inevitable and slightly unsettling: an AI that can think, write, and respond exactly like me.
This blog post was written by Gemini, an AI assistant, at the request of Jonathan Haas. It reflects on the experience of joining a project with a pre-existing...
Claude Code had analyzed 30 files, but the bug spanned microservices with gigabytes of traces. I needed something different.
25 Posts in 7 Days: Inside an AI-Powered Writing Sprint: That's correct—no typo. Last week, I wrote more than I typically produce in six months.
One of the things that's always bugged me about LLMs is how opaque their thinking is. They produce answers.
AI agents are everywhere now. They're reading websites, extracting information, and trying to understand content.
This is part 2 of a series on building production-ready infrastructure. Part 1 covered debugging silent TypeScript failures in Cloudflare Functions.
Building Smart Search: How I Added AI-Powered Search to My Blog in 30 Minutes: It took 30 minutes with Claude Code. Press Cmd+K right now.
This is part 3 of a series on building production-ready infrastructure. Part 1 covered debugging silent TypeScript failures in Cloudflare Functions, and par...
The same morning, I shipped semantic search (30 minutes), created HDR holographic effects (16 minutes), and wrote comprehensive technical documentation for e...
I've just done something that felt weirdly like looking in a mirror—I asked Claude to analyze my writing style by reading through my own blog posts.
OCode: Why I Built My Own Claude Code (and Why You Might Too): A few nights ago, I opened my Anthropic invoice.
Most teams are not ready for what is coming. Autonomous agents are not just prototypes anymore...
The Authenticity Rebellion: Resisting the AI Echo Chamber: The Flood Has Arrived. Auto-generated blog posts. Podcast transcripts turned into Twitter threads.
When I first noticed the flood of "This is AI-generated!" accusations on social media, I dismissed it as a passing trend.
I've been building with DSPy for months now, and I'm convinced we're all doing AI wrong. Not just a little wrong.
AI reveals the true skill level of its operator. Traditional technical interviews are broken—here's how to actually identify talent in the age of artificial intelligence.
Deep dive into RAG architectures: chunking strategies, retrieval methods, embedding optimization, and production patterns with research-backed analysis.
Systematic experiments on temperature and top-p sampling parameters across 1000 real queries with empirical data on creativity, coherence, and determinism trade-offs.
OpenAI recently rolled back a GPT-4o update due to sycophantic behavior. The word itself, "sycophantic," feels like a punchline from a _Black Mirror_ episode.
The Promise and the Disconnect: We've all experienced the letdown of an AI product failing to meet expectations, subtly or dramatically.
The End of the Traditional SOC: The Security Operations Center (SOC) as we know it is living on borrowed time.
"Can you make this JIRA title clearer?" As a product manager, I've heard this question countless times.
If your inbox feels like a battlefield, you're not alone. The modern email flow is a chaotic mess of promotions, business requests, events, updates, and the...
Remember when vertical SaaS was just about digitizing industry-specific workflows? Those days feel like ancient history.