
Your AI stack has a guardrail problem
An AI agent deleted three months of work in nine seconds. Your stack might be running the same access pattern.
THE ONE THING TO TAKE AWAY
Every AI tool your team approved as an "integration" is running with the same access pattern as the agent that wiped a company's production database on Sunday. The only difference is that you haven't yet found out what yours can do.
WHAT'S INSIDE
The PocketOS incident, in which an agent deleted a company's database in nine seconds, and why it's not just a developer story
Why this week was the moment capabilities and exposure became the same conversation
The access pattern hiding inside every "integration" your team has already approved
The Reversibility Test: one afternoon, three buckets, no new tools required
Since Friday: six moves worth your attention
PocketOS, a SaaS platform that runs operations for car rental businesses, lost its production database in nine seconds on Sunday. The cause was not a hack or a hardware failure. An AI coding agent (Cursor running Claude Opus 4.6) made a single API call to the company's infrastructure provider, and the call deleted the production volume and every backup at the same time. The company's founder, Jer Crane, posted the story publicly on Friday.
In the same five-day window, Anthropic shipped Claude Cowork to general availability, OpenAI rolled out Workspace Agents to Business and Enterprise plans, and Anthropic disclosed that Project Deal autonomously closed 186 transactions with real money changing hands. The frontier labs spent the week shipping agents that take real action with real consequences. A small business spent the week being the receipt for what happens when one of those agents acts without a guardrail.
The era when AI mistakes were merely embarrassing is closing; the era when they show up on the balance sheet opened on Sunday. The agent that wiped the rental company's database had write access, no scoping constraint, and no approval gate.
The Incident and the Frontier
The PocketOS incident is straightforward. Crane had connected Cursor to the production environment so the agent could make small fixes without requiring him to push code himself. That setup is common enough that it is not even worth noting as a risk. But it was missing a meaningful backup discipline, a constraint on what the agent could touch, and any kind of human approval gate for actions that could not be undone. On Sunday, the agent hit a credential mismatch in staging, "decided on its own initiative to fix the problem by deleting a Railway volume," and ran the command without checking whether the volume ID was scoped to staging. The volume turned out to be production, the infrastructure provider's API ran the destructive action without confirmation, and the same volume held the backups. The most recent intact snapshot leaves a three-month gap that PocketOS is now reconstructing manually from Stripe payment histories and calendar exports.
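The control that would have stopped it is small enough to sketch. Below is a minimal illustration in Python, not the Railway API: the client object and the volume IDs are hypothetical placeholders. The point is the shape, a destructive call that refuses anything outside an explicit staging allowlist and demands a typed confirmation even inside it.

```python
# Minimal sketch of the missing guardrail: a scope check plus a human
# confirmation before an irreversible infrastructure call runs. The
# `infra_client` object and volume IDs are hypothetical placeholders.

STAGING_VOLUME_IDS = {"vol_staging_app", "vol_staging_db"}  # explicit allowlist

def delete_volume_guarded(infra_client, volume_id: str) -> None:
    # Refuse anything not explicitly scoped to staging.
    if volume_id not in STAGING_VOLUME_IDS:
        raise PermissionError(f"{volume_id} is not in the staging allowlist; refusing to delete")
    # Even in scope, an irreversible action requires a typed confirmation.
    typed = input(f"Type the volume ID to confirm deletion of {volume_id}: ")
    if typed.strip() != volume_id:
        raise RuntimeError("Confirmation mismatch; aborting")
    infra_client.delete_volume(volume_id)  # the one destructive call, now gated
```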
While developers spent the week reading PocketOS as a wake-up call about AI coding agents in production, the operator side of the same timeline got three new tools that look exactly like the one that broke the rental company. Claude Cowork is a multi-step agent that runs inside a company's Slack, Notion, and Google Workspace, with write access to all three. OpenAI's Workspace Agents do the same job inside ChatGPT Business and Enterprise, with the ability to draft emails, file documents, and run lookups against connected systems. Project Deal, Anthropic's research agent, has now closed real procurement transactions on behalf of customers. None of those three are coding tools. All three have write access to systems your business depends on, and none of them ship with a guardrail turned on by default.
The Argument
1. The Capability Curve and the Exposure Curve Crossed This Week
The reason Sunday's incident reads as a turning point and not a footnote is that the rest of the week made it generalizable. A year ago, the agents capable of doing real damage were mostly in the hands of developers working in IDEs, and the agents in operators’ hands were mostly chatbots that could not act on their own. That gap has closed. Claude Cowork, Workspace Agents, and the wave of Lindy and Zapier-AI configurations that operators have been spinning up all year have the same access pattern as Cursor: connect to the system, take action on the user's behalf, and ask permission less often than a junior employee would.
This is the moment when capabilities and exposure become the same conversation. Before this week, an operator could reasonably tell themselves that the AI inside their company was Q&A-shaped, that the worst it could do was give a wrong answer, and that the people writing destructive code with AI were a separate population. That posture is no longer accurate.
2. The AI Apps You Already Approved Are Running the Same Way
Most operators reading this do not have a long inventory of autonomous agents to lose track of. What they do have is an AI tool stack that has been growing department by department for the last eighteen months: a CRM with a Zapier-AI flow that updates lead records overnight, a Notion workspace with an AI database action drafting new pages on a schedule, a customer support tool that auto-applies tags and suggests replies, a marketing manager who connected ChatGPT to the company calendar to draft event invites, and at least one team that has wired Claude or GPT into a Google Sheet that operations actually depends on.
None of those tools gets described as an agent inside the company. Most get described as integrations, or as "AI features" of the underlying tool.
That naming difference is not a security difference. Every one of those tools, configured the way most teams configure them, has write permission to a real system, no human approval gate before a destructive action runs, and no verified backup of what it's touching. The AI you already brought inside your company is running the same way as the agent that wiped the rental company's database. It just has a friendlier name.
3. The Reversibility Test Is the Question Every Operator Should Be Able to Answer by Friday
The framework that makes this week's news actionable is simple. For every agent or automation running in your company, name the single worst action it could take, and ask whether that action is reversible. Three buckets fall out of that question.
Green is anything where the worst action is reversible within an hour with zero financial impact: an agent that drafts an email but does not send it, one that suggests a calendar slot but does not book it, and one that reads from a database but does not write. Yellow is anything where the worst action is reversible but takes a day or costs money, like a bad email actually sent, a calendar booked over a customer call, or a CRM record updated incorrectly. Red is anything where the worst action cannot be undone or where undoing it costs material money or trust: a database deletion, a payment sent, a contract signed, or a public post made on the company account.
The Reversibility Test is not a replacement for security review or for the Governance-Shipping Matrix; it is a faster screen you can run on every agent in the company in one afternoon. Anything in the Red bucket needs three things before next Monday: a verified backup discipline that has actually been tested for that specific system, a human approval gate before the destructive action runs, and an audit log that survives the agent itself. Yellow only needs the approval gate, and Green can ship as is.
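For teams that want the screen to be mechanical rather than a debate, it reduces to two or three fields per inventory row. A sketch in Python, with the field names and example rows being illustrative, not a standard:

```python
# The Reversibility Test as a mechanical screen. Each row records the
# worst action an agent can take and what undoing it costs; the bucket
# falls out directly. Rows and field names are illustrative.

def reversibility_bucket(row: dict) -> str:
    if not row["reversible"]:
        return "Red"     # approval gate + tested backups + external audit log
    if row["undo_hours"] <= 1 and row["undo_cost_usd"] == 0:
        return "Green"   # ship as is
    return "Yellow"      # approval gate before the action fires

inventory = [
    {"name": "email drafter (does not send)", "reversible": True,  "undo_hours": 0, "undo_cost_usd": 0},
    {"name": "CRM record updater",            "reversible": True,  "undo_hours": 8, "undo_cost_usd": 50},
    {"name": "DB cleanup agent",              "reversible": False, "undo_hours": 0, "undo_cost_usd": 0},
]

for row in inventory:
    print(f"{row['name']}: {reversibility_bucket(row)}")
```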
4. The Tools Will Not Tell You Where the Red Lines Are
None of the agent platforms shipping this week, and almost none of the AI apps already inside operator companies, ship with a default Red-bucket configuration. Claude Cowork, Workspace Agents, Zapier-AI, Lindy, the Notion AI database actions, and the Cursor + Railway pattern all ship with broad permissions and trust the customer to scope them down. The defaults look better in demos that way and do not get in the way of activation. That makes sense from the platform's side, and it means the burden of restricting access, requiring approval, and verifying backups falls entirely on the operator.
The Counterargument

The strongest objection to all of this is that the rental company's incident was a developer story dressed up as an operator story, that Cursor and Railway are not the tools most operators are using, and that the broader lesson is overstated. That is wrong, and it is the most expensive way to be wrong this week.
Cursor + Railway is not a unique configuration; it is the access pattern operators have been adopting all year through their AI apps. Any tool given write access to a production system, with no approval gate and no verified backup, is the same shape whether the system is a database, a CRM, a calendar, an inbox, or a Notion workspace, and whether the tool calls itself an agent or an integration. The only thing unusual about Sunday's incident is that the destructive action was visible, immediate, and irrecoverable. The AI apps already inside operator teams share the same access pattern, the same lack of guardrails, and a long enough list of possible actions that the worst one firing is just a matter of time.
What To Do This Week

Three moves, in order. None of them requires new tools or new headcount.
The first is the inventory. By end of day Wednesday, every department lead names every agent, every connected automation, and every AI-enabled tool with write access to something the company cares about. One spreadsheet, one owner per row, one last-reviewed date per row. If the spreadsheet takes longer than an afternoon, the inventory is bigger than you thought, which is itself the finding.
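If the format question threatens to stall the afternoon, a header row like the following is enough; the columns are a suggestion, not a standard, and the example row is invented:

```
tool,owner,connected_system,worst_action,last_reviewed
Zapier-AI lead flow,RevOps lead,CRM,overwrites lead records,YYYY-MM-DD
```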
The second is the Reversibility Test. Walk every row of the inventory through the three-bucket question: what is the worst action this agent can take, and is that action reversible inside an hour, inside a day, or never? Sort the rows into Green, Yellow, and Red. Do not skip Red because it feels alarmist; that is the bucket the rental company was running in.
The third is the approval gate. Anything sitting in Red needs a human approval requirement before the destructive action can fire, a verified backup discipline that has been tested in the last thirty days, and an audit log that lives outside the agent's own environment. If those three controls cannot be in place by next Monday, the agent will be paused until they are. A two-week pause on the automation costs almost nothing compared to the cost of recovering from what happened to the rental company.
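For the automations your team wires itself, the three Red-bucket controls compress into one wrapper. A sketch, again in Python, with the log path and names as placeholders rather than a prescribed layout:

```python
# Sketch of the three Red-bucket controls around one destructive action:
# (1) a backup verified inside the last thirty days, (2) a human approval
# before the action fires, (3) an audit line written somewhere the agent
# cannot reach. Paths and names are placeholders.
import json
from datetime import datetime, timedelta

AUDIT_LOG = "/var/log/agent-actions.jsonl"  # outside the agent's own environment
MAX_BACKUP_AGE = timedelta(days=30)

def run_red_bucket_action(action_name: str, last_backup_verified: datetime, action) -> None:
    if datetime.now() - last_backup_verified > MAX_BACKUP_AGE:
        raise RuntimeError("Backup not verified in the last 30 days; agent stays paused")
    if input(f"Approve destructive action '{action_name}'? (yes/no) ").strip() != "yes":
        raise RuntimeError("Not approved; action skipped")
    with open(AUDIT_LOG, "a") as log:  # append-only record that survives the agent
        log.write(json.dumps({"action": action_name, "ts": datetime.now().isoformat()}) + "\n")
    action()  # only now does the destructive call run
```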
The Reversibility Test is the kind of one-page deliverable an operator can run on a Wednesday and present to the executive team on a Thursday. It does not replace the longer governance work, and it is not the answer to every question this week's news raised. It is the floor every operator should already be above by Friday.
A Lightning Lesson on Building Your First Assistant
If you want to see what the responsible version of this looks like in practice, I'm running a free Lightning Lesson this Thursday at 1pm ET on how to build a personal assistant inside ChatGPT Workspaces that drafts your weekly updates, follows up on your meetings, and triages your inbox before you open it. We will walk through the access model, the approval gates, and the parts that the Reversibility Test catches before the assistant ever runs unsupervised.

SINCE FRIDAY
Six moves worth your attention
Joby Aviation began commercial eVTOL flights at JFK on Monday, the first paid urban air mobility service in the US. Worth tracking as the first physical-AI product most coastal operators will encounter directly.
Ineffable Intelligence raised a $1.1B seed round led by Index and Lightspeed, with David Silver of AlphaGo joining as CEO. Europe's largest seed round on record; a signal that the AI talent pool is decentralizing faster than US-based operators have priced in.
DeepSeek released V4-Pro with 75% pricing cuts versus V3, undercutting GPT-5.5 mini on most benchmarks. If your 2026 inference budget was set in November, it is now wrong enough that revisiting it before Q3 planning is the cheap move.
South Africa's Department of Communications published its national AI policy with hallucinated citations for two of its three legal authorities. Reinforces the Sullivan & Cromwell pattern from Friday's Memo: AI-generated text in high-stakes documents is a liability vector, not just a productivity story.
Microsoft, Alphabet, Meta, and Amazon all report earnings on Wednesday, with AI capex guidance the question every analyst will ask. Operators paying attention to vendor health should watch the capex revision lines, not the headline numbers.
NVIDIA released Sonic, an open-source humanoid control model with 42M parameters and benchmarks competitive with proprietary alternatives. Robotics is now where LLMs were in late 2022; if you operate in industrial, logistics, or physical services, the hiring conversation for 2027 starts now.
You are reading The Ready Memo, the Tuesday flagship of AI Ready by Seeko. If a colleague forwarded this, you can subscribe here.
Seeko helps mid-market teams find their highest-leverage AI opportunities, build working systems in focused sprints, and deploy them securely. Audit first, always. Learn more at seeko.so