Hey there,

Terminate the wrong instance and you can recreate it in minutes. Delete the wrong volume and you might not notice until an audit. That gap is why storage cleanup stalls.

In today's Technical Notes:

  • Why deleting in production feels heavier than the command suggests

  • Three signals: a production database deleted by an AI agent, cleanup as code, and retention rethought

  • The one question worth asking before you delete anything


📰 TECHNICAL NOTES

Most teams agree that storage cleanup needs attention, and most have a rough sense of what should be done. The work slows the moment the consequences of deleting the wrong data become concrete enough to picture.

Without a clear owner, the task keeps getting pushed: it surfaces a few times, gets deferred, and then slides far enough down the list to fall off the schedule entirely.

Why deletion feels different from termination

Compute is mostly reversible, which is part of why it is easier to reason about. If you terminate an instance you later need, you can usually recreate it in a predictable state.

Storage offers no equivalent safety net. Removing the wrong volume or bucket does not roll back, and the effects can surface much later: sometimes in an audit, sometimes in a downstream system, sometimes in a team that was never part of the original change.

None of that shows up at the moment of deletion, which is part of what makes the action feel heavier than the single command suggests.

Those delayed consequences shape how the call gets made in the moment, particularly when it is unclear who will feel the impact first.

The constraints you can’t see at the point of action

Persistent data also reaches well beyond infrastructure, since it intersects with legal, regulatory, and contractual obligations that are not always written down where the engineer can see them.

Some data has to be retained and some has to be removed, and both rules can apply inside the same system depending on context. That turns the work into interpreting constraints more than acting on intent.

In a regulated environment, interpretation can shift by data type, by region, and by customer contract, none of which is obvious from the resource itself.
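One way to picture how both rules can apply inside the same system is a small lookup that gives a different answer depending on data type and region. This is only a sketch: the rule table, its day counts, and the names `RetentionRule` and `deletion_status` are assumptions made up for illustration, not real regulatory values.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class RetentionRule:
    min_days: Optional[int]  # must be kept at least this long (None = no floor)
    max_days: Optional[int]  # must be removed after this long (None = no ceiling)


# Hypothetical rules keyed by (data_type, region). Real values come from
# legal and contract review, not from this table.
RULES = {
    ("billing_record", "eu"): RetentionRule(min_days=365, max_days=None),
    ("support_chat", "eu"): RetentionRule(min_days=None, max_days=90),
}


def deletion_status(data_type: str, region: str, created: date, today: date) -> str:
    """Return 'must_keep', 'must_delete', 'may_delete', or 'needs_review'."""
    rule = RULES.get((data_type, region))
    if rule is None:
        # No written rule is the common case; it should not default to "safe".
        return "needs_review"
    age_days = (today - created).days
    if rule.min_days is not None and age_days < rule.min_days:
        return "must_keep"
    if rule.max_days is not None and age_days > rule.max_days:
        return "must_delete"
    return "may_delete"
```

The design choice worth noting is the `needs_review` branch: when nobody has written the rule down, the function refuses to guess rather than treating the data as safe to remove.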

Cost optimisation conversations tend to skip this layer, because responsibility sits with whoever runs the change, while the definition of "safe to delete" is spread across teams that rarely align early enough to make it explicit.

Moving the decision upstream

In the moment, the safe response is inspection before action, which means mapping dependencies, confirming what is still in use, and making the safe boundaries visible before anything is removed. It is rarely elegant work, but it lowers the odds of an irreversible mistake.
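Inspection before action can be as plain as a function that lists every known reason not to delete. The sketch below assumes a hypothetical `Volume` record that you would populate from your own inventory; the field names, the `owner` and `legal_hold` tags, and the 90-day idle threshold are all illustrative choices, not any provider's API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Volume:
    volume_id: str
    attached_to: Optional[str]     # instance ID, or None if detached
    last_read: Optional[datetime]  # None if no access data is available
    tags: dict = field(default_factory=dict)


def deletion_blockers(vol: Volume, now: datetime, idle_days: int = 90) -> list:
    """List reasons not to delete this volume yet (empty = no known blocker)."""
    blockers = []
    if vol.attached_to is not None:
        blockers.append(f"still attached to {vol.attached_to}")
    if vol.last_read is None:
        blockers.append("no access data, usage unknown")
    elif now - vol.last_read < timedelta(days=idle_days):
        blockers.append(f"read within the last {idle_days} days")
    if "owner" not in vol.tags:
        blockers.append("no owner tag, nobody to confirm with")
    if vol.tags.get("legal_hold") == "true":
        blockers.append("under legal hold")
    return blockers
```

An empty list only means no blocker was found among the checks you wrote down. It is a prompt to ask the owner, not permission to delete.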

The more durable fix lands earlier in the lifecycle, when retention, tagging, lifecycle policies, and decommissioning get defined at the point data is created, rather than when it becomes expensive to manage. That asks engineering, product, and legal to align ahead of an urgent situation, which is harder to prioritise than the operational problem already on fire.
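Defining retention at the point of creation can start with a check that rejects untagged storage requests. The following is a minimal sketch under stated assumptions: the tag names in `REQUIRED_TAGS` and both function names are hypothetical, and where the check runs (a CI step, an admission hook, a provisioning wrapper) is left to you.

```python
from datetime import date, timedelta

REQUIRED_TAGS = ("owner", "data_class", "retention_days")  # hypothetical schema


def validate_creation_tags(tags: dict) -> list:
    """Return a list of problems; an empty list means the request can proceed."""
    problems = [f"missing tag: {name}" for name in REQUIRED_TAGS if name not in tags]
    days = tags.get("retention_days")
    if days is not None and not str(days).isdigit():
        problems.append("retention_days must be a whole number of days")
    return problems


def review_date(created: date, tags: dict) -> date:
    """Date on which the resource is due for a delete-or-renew decision."""
    return created + timedelta(days=int(tags["retention_days"]))
```

With the tags in place from day one, the later cleanup question becomes a query for resources past their review date, with a named owner to ask.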

That work is easier to justify after an incident than before one, which is part of why it so often waits.

Storage anxiety is usually the signal that this alignment never happened, and the teams that handle it well tend to be explicit about what data is allowed to exist, how long it should persist, and who is accountable once it should no longer be there.

If you're dealing with this, or you've landed on an approach that works, reply to this email.

IN THE ECOSYSTEM

  • "Data Retention is Two Different Problems" - Alex Smolen argues that retention is two separate problems with different controls, and that the leverage lies in steering teams during design rather than after the system ships.

  • Cloud Custodian - Policy-as-code for tagging, lifecycle, and cleanup, with a dry-run mode that turns careful inspection into something enforced at creation rather than a manual scramble.

  • Replit's AI production database deletion - An AI agent ran a destructive command on production during a freeze and reported rollback as impossible, sharpening the question of who owns a delete.


UNTIL NEXT TIME

The discomfort around deleting the wrong thing rarely resolves on its own, and it tends to ease only once someone has written down what is allowed to exist and who owns the call to remove it.

Pick one storage resource you have been avoiding and find out who actually owns the decision to delete it. That answer usually tells you more than the cleanup would.


Jubril Oyetunji
CTO, EverythingDevOps
