May 16, 2026 · Edition #83

Your AI Agent Has 6 Months to Live.

It's not the model. It never is.

Klarna pulled its AI assistant.

Air Canada paid a customer for what its chatbot promised.

McDonald's killed the drive-thru AI after it added bacon to ice cream orders.

NYC's MyCity bot got caught telling business owners how to break the law.

The list goes on. The list is longer than anyone wants to admit on LinkedIn.

This week, Sinch put a number on it.

74%.

74% of enterprises have rolled back or shut down a live AI customer-communications agent. Not "tried it and decided not to deploy." Deployed. Took it to production. Then yanked it back out.

That's a graveyard.

Let me say what most AI consultants won't: the median enterprise AI agent in 2025 had a six-month half-life. The median enterprise AI agent in 2026 isn't doing much better. And the people running these deployments mostly know this. They just don't want to admit it on a quarterly earnings call.

So let's admit it here instead.

The Real Question Has Changed

Last year, the question was: "Should we deploy an AI agent?"

This year, the question is much more useful: "Why is the median enterprise agent dead within six months?"

The first question is a hype question. The second is an engineering question. And engineering questions, unlike hype questions, have answers.

Here's mine, after building or auditing 60+ production systems:

The model never changed. The system around it was never built.

That's it. That's the whole letter.

The Six-Month Death Curve

I've watched this play out enough times to draw the curve from memory.

Month 1: Demo Glow. The launch goes great. The CEO posts on LinkedIn. The board is delighted. Someone uses the word "transformative." Metrics look good because the volume is low and the cases are easy.

Month 2: First Weird Edge Case. A customer asks something the agent wasn't built for. The agent makes something up. Support escalates it. The team chuckles, files a ticket. "We'll add a guardrail."

Month 3: The Process Gap Exposed. Turns out the human workflow the agent was supposed to automate was never actually documented. People did it differently across regions. The agent picked one version and is now in a quiet war with the other three. Nobody owns this.

Month 4: Reviewer Fatigue. The team that was supposed to review the agent's outputs has quietly stopped reviewing them. Not because they're lazy. The volume scaled and they didn't. They started thumbs-up-ing everything by default. The "human in the loop" became a rubber stamp wearing a lanyard.

Month 5: Silent Error Compounds. The agent has been making the same subtle mistake for three months and nobody noticed because the reviewers stopped reviewing in Month 4. The error is now baked into 40,000 customer interactions. A regulator notices before the company does.

Month 6: Someone Senior Pulls the Plug. A VP gets a call they did not want. The agent is killed by end of week. The official line is "pausing for review." It is not paused. It is dead. The team that built it gets quietly reorged.

I've seen this exact movie about a dozen times. Different industries. Same six acts. Same final scene.

"OK, but our agent is different. We have GPT-5 / Claude / Gemini / whatever's newest or we have the XYZ agent architecture..."

Right. Let me make this as clear as I can:

The model, or tech, is not the problem.

The model is fine. The model has been fine since GPT-4. The model and the agent design are the engine. The engine works. What kills production agents is not the engine. It's that nobody built the chassis, the brakes, the steering, the dashboard, or the road.

We shipped the train before we laid the tracks. Then we blamed the train.

Let me explain.

The Six-Month Pre-Mortem

A post-mortem is what you write after your agent dies.

A pre-mortem is what you write before you deploy it, or more realistically, for most of you reading this, what you write right now about the agent you already deployed two months ago.

If you don't know what's going to kill it, it's already on the six-month clock and nobody told you.

The 74% number is not an accident. It is a pattern that companies keep refusing to learn from.

From the 60+ systems I've worked on, here are the five things that every surviving agent had, and that every dead one was missing at least two of. This is the audit. Run it Monday morning.

1. The process the agent is automating is actually documented.

Not "we have a Notion page somewhere." Not "Sarah knows it." Documented. Step by step. With the edge cases. With who decides what when there's ambiguity.

If you can't write the human version of the workflow on two pages, you cannot automate it. What you'll automate instead is your team's collective hallucination of a workflow that doesn't exist in any single human's head. The agent will then execute this fictional workflow with terrifying confidence and consistency, right up until it hits something real.

You cannot automate a mess. You can only scale it.

2. ORO is assigned: Operator, Reviewer, Owner.

Letter 80 ("Your Team Wants Your AI Project to Fail") built this out. Three named humans, in writing, attached to real calendars.

Operator runs the workflow day to day. Feeds inputs. Triggers the run. Monitors for drift.
Reviewer signs off before output leaves the building. AI drafts. Human signs.
Owner is the person whose quarterly review takes the hit when something blows up. Accountability stops here.

The Owner is the one that matters most for survival. When the agent does something stupid, the Owner's phone rings first. They have the authority to pause it, the budget to fix it, and the calendar slot to review its performance every single week.

If you cannot tell me the name of the Owner without checking Slack, your agent is already in trouble. Anonymous deployments are dead deployments.

3. Reviews are about decisions, not documents.

Most "AI governance" reviews are 40-page slide decks read by nobody, sitting on a SharePoint nobody opens. That's not oversight. That's oversight cosplay.

A real review is a 30-minute weekly meeting that produces exactly three things:

The decisions we made this week
The things we're watching
The things we'd kill the agent over

That's it. If your review doesn't produce decisions, it isn't a review. It's a calendar event.

4. The agent's context is curated, not stuffed.

The single most common mistake I see: teams dump their entire knowledge base into the agent's context window and call it "RAG."

It is not RAG. It is digital hoarding with extra steps.

A good agent has a small, specific, curated set of context. Not "everything we know." Just "what's needed for this specific task." Pruning is the work. Stuffing is laziness wearing the costume of thoroughness.

If you can't tell me what's in your agent's context and why each piece is there, your agent is operating in fog. The fog will eventually produce the wrong answer in the perfect tone.
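
One way to make "what's in the context and why" checkable is a small manifest that gets audited before every deploy. This is a rough sketch, not a prescribed implementation: the `ContextItem` fields, the file names, and the 8,000-token budget are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextItem:
    source: str  # where the text comes from (hypothetical file names below)
    reason: str  # why this specific task needs it
    tokens: int  # rough size estimate


def audit_context(items: list[ContextItem], budget: int) -> list[str]:
    """Return problems with a context manifest; an empty list means it passes."""
    # Every piece of context must justify its presence.
    problems = [
        f"no reason given for {item.source}"
        for item in items
        if not item.reason.strip()
    ]
    # The whole set must fit a deliberately small budget.
    total = sum(item.tokens for item in items)
    if total > budget:
        problems.append(f"context is {total} tokens, over the {budget}-token budget")
    return problems


manifest = [
    ContextItem("refund_policy.md", "defines refund eligibility", 1200),
    ContextItem("full_wiki_export.txt", "", 90000),  # the hoarding case
]
for problem in audit_context(manifest, budget=8000):
    print(problem)
```

The point of the design is that an item with no written reason fails the audit, regardless of how useful it might turn out to be.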

5. Verification doesn't depend on Sarah on Friday afternoon.

This is the one nobody wants to say out loud, so I'll say it:

Most "human-in-the-loop" systems are actually "human-in-the-way-when-they-have-time" systems.

If your verification gate is "Sarah reviews the agent's outputs at the end of every week," congratulations: you have a verification gate only when Sarah isn't sick, on vacation, in a meeting, dealing with her actual job, or, eventually, fed up enough to just thumbs-up everything because it's 4:50 PM on Friday and her kid has soccer practice.

Real verification doesn't depend on Sarah's mood or her calendar. It depends on a system. That can be a deterministic check (does the output match a schema?), a second model checking the first (cheap and surprisingly effective), or a sampled review with a clear SLA and an owner. But "Sarah will catch it" is not a verification strategy. It is hope, written into a Confluence page.
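
Here is a minimal sketch of two of those options, the deterministic check and the sampled review. The output fields (`intent`, `amount`, `reply`), the 5% sample rate, and the function names are assumptions for illustration, not taken from any particular system.

```python
import hashlib
import json

# Hypothetical schema: field name -> required Python type.
REQUIRED_FIELDS = {"intent": str, "amount": float, "reply": str}


def schema_errors(raw_output: str) -> list[str]:
    """Deterministic gate: list the problems; an empty list means it passes."""
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError:
        return ["output is not valid JSON"]
    if not isinstance(data, dict):
        return ["output is not a JSON object"]
    errors = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in data:
            errors.append(f"missing field: {field}")
        elif not isinstance(data[field], expected):
            errors.append(f"wrong type for {field}")
    return errors


def needs_human_review(interaction_id: str, sample_rate: float = 0.05) -> bool:
    """Sampled review: hash the ID so the choice is repeatable, not mood-based."""
    digest = hashlib.sha256(interaction_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") / 2**32  # value in [0, 1)
    return bucket < sample_rate


print(schema_errors('{"intent": "refund"}'))
print(needs_human_review("ticket-1042"))
```

Neither function replaces a reviewer. The first blocks malformed output before anyone sees it, and the second decides which interactions go into the review queue, so the sample doesn't shrink when the reviewer gets busy.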

The Math, Brutally

If your agent is missing one of those five things, you're at risk.

Missing two? You're on the six-month clock. The countdown started the day you went live.

Missing three or more? I've audited many agents that are. Your agent is already dead. The team just hasn't realized it yet. There's a tombstone in the graveyard with your project name on it. Only the date is missing.

That's not me being dramatic. That's the pattern across 60+ systems. The ones that survived year one had all five. The ones that didn't, didn't.
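
The tiers above fit in a few lines of code. This is a rough sketch: the item labels and the function name are my shorthand for illustration, and an unanswered item is treated as a "No."

```python
AUDIT_ITEMS = (
    "process documented",
    "ORO assigned",
    "reviews produce decisions",
    "context curated",
    "verification is systematic",
)


def premortem_status(answers: dict[str, bool]) -> str:
    """Map yes/no audit answers to the risk tiers described above."""
    # Anything not explicitly answered "yes" counts as missing.
    missing = [item for item in AUDIT_ITEMS if not answers.get(item, False)]
    if not missing:
        return "all five in place"
    if len(missing) == 1:
        return "at risk"
    if len(missing) == 2:
        return "on the six-month clock"
    return "already dead"


answers = {
    "process documented": True,
    "ORO assigned": False,
    "reviews produce decisions": True,
    "context curated": False,
    "verification is systematic": True,
}
print(premortem_status(answers))
```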

The Uncomfortable Truth

The 74% didn't die because AI doesn't work.

AI works.

They died because companies shipped the train before laying the tracks. They confused velocity with progress. They confused a demo with a deployment. They confused "we have an agent in production" with "we have a production-grade agent."

And then they blamed the train.

I want you to notice something. Every one of the five items on the audit is boring.

Documented process. Named ORO. Real review meetings. Curated context. Verification that isn't Sarah-shaped.

None of this will trend on LinkedIn. None of this is what the "AI transformation" decks are selling. None of this is exciting.

Which is exactly why most companies skip it. And exactly why their agents die.

The boring stuff is the moat. The exciting stuff is the demo.

What To Do Monday Morning

If you have an agent in production right now, here is your 9 AM Monday assignment:

Open a doc. Title it: "Six-Month Pre-Mortem: [Agent Name]."
Score the agent honestly against the five items above. Yes / No. No "kind of." No "we're working on it."
For every "No," write one sentence: what would it take to turn this into a Yes?
Send the doc to exactly one person: the Owner.

If there is no Owner, you just found your first "No."

That's the audit. It takes about 45 minutes if you're honest with yourself, and three weeks if you're not.

The agents that survive 2026 will be the ones whose teams ran this audit before the universe ran it for them.

One Last Thing

This is Letter 83.

If you've been reading the past few months, you've seen the pieces of this argument scattered across the series: the intensification piece (73), the oversight cosplay piece (77), the "you cannot automate a mess" piece (78), the context engineering piece (79), the ORO piece (80).

This letter pulls them into one place.

Most AI advice is a tool. This is an Operating System.

Tools come and go. The model you're using today won't be the model you're using in eighteen months. The OS stays. The five checks above will still be the five checks in 2028. They were the five checks in 1998 when we called it "workflow automation." They'll be the five checks in 2032 when we call it something else.

Models change. Systems don't.

The technology is fine. The models are fine. The tools are fine. They've all been fine for eighteen months, and they will keep being fine.

What dies in Month 6 is not the agent.

What dies in Month 6 is the illusion that you could skip the boring work and still get the outcome.

You cannot.

The boring work IS the outcome.

Process first. Agent second. Owner named. Context curated. Verification real. Reviews about decisions.

Five things. Six months. Same model. Different ending.

AI is only as good as the human operating it. But the human is only as good as the system designed around them. And the system is only as good as the boring work you were willing to do before the agent was ever switched on :)

Build the tracks. Then run the train.

Have a great weekend.

— Charafeddine (CM)