One-line hook

A client’s S3 bucket had quietly grown past a petabyte of operational media — overwhelmingly cold, occasionally hot, never deleted — and the line item was starting to embarrass the CFO. Here’s how we cut it without a migration window, a data-loss panic, or a single ticket from a confused user.

Who this is for

The setup (vague-client framing)

Why this was harder than it looked

  1. You can’t just enable Intelligent-Tiering on a hot bucket. The per-object monitoring fee is real (~$0.0025 per 1,000 objects per month). For a bucket with hundreds of millions of small objects, that fee alone can eat the savings, and objects under 128 KB are never auto-tiered at all, so they save nothing.
  2. Lifecycle transitions cost money per object. Each transition is a billed request, much like a PUT. Moving 400M objects to Glacier Instant Retrieval is a non-trivial line item by itself.
  3. Glacier Deep Archive saves the most, but retrieval is measured in hours. Customer support pulling a 3-year-old signed agreement during a dispute cannot wait 12 hours.
  4. Retrieval costs can ambush you. A misconfigured “rehydrate everything for a customer export” job can run up a five-figure bill in a single afternoon.
  5. Object metadata is the only thing telling you what’s safe to tier. And in a bucket that grew organically, that metadata is inconsistent.
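
To make the first, second, and fourth points concrete, here is a minimal back-of-the-envelope sketch. The monitoring rate is the figure quoted above; the transition and retrieval rates are assumed list prices for Glacier Instant Retrieval and should be checked against current pricing for your region. The function names are ours, for illustration only.

```python
# Back-of-the-envelope S3 tiering costs. All rates are in USD.
# ASSUMPTION: the transition and retrieval rates below are illustrative
# list prices, not figures from the client engagement.
MONITORING_FEE_PER_1000_OBJECTS = 0.0025   # per month, Intelligent-Tiering
TRANSITION_FEE_PER_1000_REQUESTS = 0.02    # assumed, into Glacier Instant Retrieval
RETRIEVAL_FEE_PER_GB = 0.03                # assumed, Glacier Instant Retrieval


def monthly_monitoring_fee(monitored_objects: int) -> float:
    """Monthly Intelligent-Tiering monitoring fee for objects of 128 KB or more."""
    return round(monitored_objects / 1000 * MONITORING_FEE_PER_1000_OBJECTS, 2)


def one_time_transition_fee(object_count: int) -> float:
    """One-off cost of lifecycle-transitioning every object once."""
    return round(object_count / 1000 * TRANSITION_FEE_PER_1000_REQUESTS, 2)


def retrieval_fee(gigabytes: float) -> float:
    """Cost of reading this many gigabytes back out of the cold tier."""
    return round(gigabytes * RETRIEVAL_FEE_PER_GB, 2)


if __name__ == "__main__":
    print(f"monitoring, 400M objects: ${monthly_monitoring_fee(400_000_000):,.2f}/month")
    print(f"transition, 400M objects: ${one_time_transition_fee(400_000_000):,.2f} once")
    print(f"rehydrating 500 TB:       ${retrieval_fee(500_000):,.2f}")
```

Under these assumed rates, the 400M-object transition alone comes to four figures, and a careless 500 TB rehydration lands in five-figure territory, which is the ambush described in point 4.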

The diagnostic phase

The tiering strategy we actually shipped

We shipped a tiered strategy, not a single-class one: different prefixes got different lifecycle policies based on access pattern and regenerability.

| Prefix family | Storage class plan | Rationale |
|---|---|---|
| Signed legal artifacts | Standard → Glacier Instant Retrieval at 90 days | Read rarely, but when read, it’s during a dispute and latency matters |
| User-uploaded media | Standard → Standard-IA at 30 days → Glacier Instant Retrieval at 180 days | Strong “recent is hot” pattern |
| Generated exports/reports | Standard → expire at 30 days | Regenerable from source data, so delete, don’t tier |
| Thumbnails / derivatives | Standard → expire at 7 days | Cheap to recreate, never worth storing cold |
| Internal logs / audit trails | Standard → Glacier Deep Archive at 365 days | Almost never read; retrieval latency acceptable |
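
As a rough illustration, the plan above could be expressed as S3 lifecycle rules along the following lines. This is a sketch, not the configuration we deployed: the prefixes (`legal/`, `media/`, and so on), rule IDs, and helper names are hypothetical, and it assumes `boto3` is installed with credentials configured. Note that `put_bucket_lifecycle_configuration` replaces the bucket's entire existing lifecycle configuration, so any existing rules need to be merged in first.

```python
def _rule(rule_id, prefix, transitions=(), expire_days=None):
    """Build one lifecycle rule dict in the shape the S3 API expects."""
    rule = {"ID": rule_id, "Filter": {"Prefix": prefix}, "Status": "Enabled"}
    if transitions:
        rule["Transitions"] = [
            {"Days": days, "StorageClass": storage_class}
            for days, storage_class in transitions
        ]
    if expire_days is not None:
        rule["Expiration"] = {"Days": expire_days}
    return rule


def build_lifecycle_rules():
    """Lifecycle rules mirroring the plan. Prefixes are HYPOTHETICAL."""
    return [
        _rule("legal-artifacts", "legal/", [(90, "GLACIER_IR")]),
        _rule("user-media", "media/", [(30, "STANDARD_IA"), (180, "GLACIER_IR")]),
        _rule("exports", "exports/", expire_days=30),
        _rule("thumbnails", "thumbnails/", expire_days=7),
        _rule("audit-logs", "logs/", [(365, "DEEP_ARCHIVE")]),
    ]


def apply_lifecycle(bucket):
    """Push the rules to a bucket. Overwrites the existing configuration."""
    import boto3  # imported lazily so the rule builder works without AWS access

    boto3.client("s3").put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration={"Rules": build_lifecycle_rules()},
    )
```

Keeping the rule builder separate from the API call means the rules can be reviewed and diffed before anything touches the bucket.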

Key insight: the cheapest byte is the one you don’t store. Roughly 18% of bucket size turned out to be regenerable derivatives that nobody had thought to expire.
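
One way to arrive at a figure like that is to roll up object sizes by prefix family from an inventory listing. The sketch below assumes the inventory has already been parsed into `(key, size_in_bytes)` pairs; the prefix-to-family mapping and the set of regenerable families are hypothetical stand-ins for whatever your own bucket layout implies.

```python
from collections import defaultdict

# HYPOTHETICAL mapping from key prefix to prefix family.
FAMILY_BY_PREFIX = {
    "legal/": "signed_legal",
    "media/": "user_media",
    "exports/": "exports",
    "thumbnails/": "thumbnails",
    "logs/": "audit_logs",
}

# Families that can be rebuilt from source data (assumption for illustration).
REGENERABLE_FAMILIES = {"exports", "thumbnails"}


def family_for(key):
    """Return the prefix family for an object key, or 'unmapped'."""
    for prefix, family in FAMILY_BY_PREFIX.items():
        if key.startswith(prefix):
            return family
    return "unmapped"


def bytes_by_family(rows):
    """Sum object sizes per prefix family from (key, size_in_bytes) pairs."""
    totals = defaultdict(int)
    for key, size in rows:
        totals[family_for(key)] += size
    return dict(totals)


def regenerable_share(totals):
    """Fraction of total bytes that belongs to regenerable families."""
    total = sum(totals.values())
    if total == 0:
        return 0.0
    regenerable = sum(
        size for family, size in totals.items() if family in REGENERABLE_FAMILIES
    )
    return regenerable / total
```

Anything that lands in `unmapped` is worth investigating before it is tiered or expired, since it is exactly the organically grown data whose metadata cannot be trusted.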

Implementation, the boring-but-critical parts

What we’d do differently

Results

The takeaway

You don’t have a storage problem. You have an access-pattern problem dressed up as a storage problem. Build the inventory pipeline, map prefixes to product features, tier by access pattern, not by age alone, and put a guardrail on the retrieval side before you flip a single lifecycle rule. The bill takes care of itself.