Notes from the on-call rotation.
Things I've learned, written down so I don't have to relearn them. Long-form, opinionated, no SEO bait.
2026
5 posts
Cloud cost in 2026: why your AI workloads are ten times your compute bill
Inference is the dominant line on the cloud invoice now. GPU egress is real. Reserved capacity for AI is harder to model than EC2 ever was. A worked example.
The 'ChatGPT is outdated' narrative is half right
GPT-4 is no longer SOTA on most evals. ChatGPT-the-product still wins distribution. These are different facts and the takes online keep conflating them.
Azure's AI Foundry, Copilot, and the OpenAI partnership in 2026
AI Foundry consolidated Microsoft's sprawling AI estate. The OpenAI partnership is more complicated than the press release suggests. Here is the operator's view.
AWS in 2026: Bedrock, Q, and the bet on inference-on-Graviton
Bedrock's catalog tripled, Nova landed, and AWS is betting that most inference does not need GPUs. A field report from production workloads.
Google's AI year: how Gemini and Vertex caught up — and what that means for your stack
Gemini 2 Pro and Flash, plus a serious Vertex AI Agent Builder, have moved Google from third place to a real choice. Here is where it actually fits.
2025
5 posts
DeepSeek V3: The First Open Model That Made Me Rethink My Stack
DeepSeek V3 dropped in late December 2024 with frontier-class benchmarks at a fraction of the training cost. It is the first open release that genuinely shifts the cost curve.
The 2024 Freelance Contractor Market: AI Hit, but Not the Way Twitter Said
Twitter said AI would gut the freelance market. Two years in, the picture is more interesting. Some segments crashed. Others quietly tripled.
Secrets Rotation Is a Habit, Not a Project
Every team I audit treats secrets rotation as a one-time project. Six months later their secrets are stale again. Here is how to make it a habit instead.
AWS re:Invent 2024: Two Real Things and a Lot of Noise
I sat through more re:Invent keynotes than I care to admit. Most of it was repackaging. Two announcements actually matter for the work I do.
Cursor vs Copilot, Late 2024: The Honest Comparison
I have used both daily for a year. Here is what each is actually good at, what each is bad at, and which I would pay for if I had to pick one.
2024
13 posts
AWS Savings Plans: The Right Way to Buy Commitment
Most teams buy Savings Plans wrong. They underbuy, overbuy, or buy the wrong type. Here is the framework I use with clients.
NAT Gateway Egress Is Eating Your AWS Bill
A client paid $14k a month in NAT Gateway data processing charges they did not know existed. Here is the math, the diagnosis, and the fix.
KubeCon 2024: The Boring Stuff Won, As It Should
Two KubeCons this year, Paris in March and Salt Lake City in November. The headline is that Kubernetes finished growing up.
RAG vs Fine-Tuning: The Adult Conversation Nobody Is Having
Half the AI projects I see are fine-tuning when they should be RAG-ing. The other half are RAG-ing when they should be fine-tuning. Here is the actual decision.
Terraform Modules: Three Patterns That Survive Contact With Reality
Terraform module design is where most platform teams accidentally build a worse Kubernetes. Here are the three patterns that actually hold up at scale.
Your Monorepo CI Is Slow Because You Cache Wrong
I see the same six caching mistakes in every monorepo CI I audit. Fix them and pipelines drop from 40 minutes to 8.
Stop Setting SLOs on Endpoints. Set Them on Journeys.
Most SLOs I see are bound to HTTP endpoints because that is what the dashboard makes easy. They are also useless. Here is how to design SLOs that mean something.
Kubernetes Upgrades Are a Discipline, Not a Project
Most teams I audit are two minor versions behind on k8s and treat each upgrade like a small migration. That is the wrong shape. Upgrades are a habit.
Claude 3.5 Sonnet Is the Coding Model I Wanted GPT-4 to Be
Anthropic shipped Claude 3.5 Sonnet in June 2024. After two months of daily use across three client projects, the verdict is in. It is the new default for code.
CrowdStrike Took Down Half the Planet. Your Runbook Should Have Caught It.
On 19 July 2024 a CrowdStrike Falcon update bricked 8.5 million Windows machines. The post-mortem is not about CrowdStrike. It is about how nobody held their vendor accountable.
Devin Was a Demo, Not a Product
Cognition launched Devin in March 2024 as the first AI software engineer. Four months in, the dust has settled. Here is what the autonomous agent hype actually delivered.
GPT-4o: The Multimodal Bet and What It Breaks in Your Stack
OpenAI shipped GPT-4o in May 2024. Native audio in, audio out. Half the price of GPT-4 Turbo. Here is what actually changes in production systems.
Llama 3 Is the Moment Open Weights Stopped Being a Toy
Meta dropped Llama 3 in April 2024. The 70B model is the first open-weights release I would actually deploy for a paying client.