
2022 re:Invent, Deep Dive on Amazon S3: Fundamentals of Durability, Performance, Cost, and Security

2026-02-05 · video
YouTube Link ↗


This AWS re:Invent session, presented by Oleg and Sally from the Amazon S3 team, takes a step back from feature announcements to focus on the four cross-cutting fundamental principles that underpin everything Amazon S3 does: durability, availability/performance, cost management, and security. These principles are relevant not just to S3 but to storage systems in general.

Durability: The 11 9s Promise

The session begins with durability, the core property ensuring that data written to storage can be retrieved intact. The presenters explain that S3's famous "11 9s" (99.999999999%) durability design comes from understanding annualized failure rates (AFR) of hardware and implementing robust countermeasures. Hard drives, still the workhorses of cloud storage, are mechanical devices prone to failure, making redundancy essential.

S3 uses erasure coding rather than simple replication. Objects are split into shards (data chunks plus parity shards) spread across multiple drives, racks, and availability zones. This approach allows data reconstruction from any sufficiently large subset of the shards, providing flexibility in routing around failures. A continuous repair mechanism monitors for failures and restores redundancy before data loss occurs. The key insight is treating durability as an ongoing threat model requiring constant iteration, not a one-time implementation.
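To make the idea concrete, here is a toy sketch of the encode/repair cycle using a single XOR parity shard, which can rebuild any one lost shard. This is only an illustration: production systems like S3 use stronger codes (e.g. Reed-Solomon variants) that tolerate multiple simultaneous losses, and the shard count and placement here are made up.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def encode(data: bytes, k: int) -> list:
    """Split data into k equal-size data shards plus one XOR parity shard."""
    shard_len = -(-len(data) // k)                 # ceiling division
    padded = data.ljust(k * shard_len, b"\x00")    # pad so shards are equal-length
    shards = [padded[i * shard_len:(i + 1) * shard_len] for i in range(k)]
    parity = shards[0]
    for s in shards[1:]:
        parity = xor_bytes(parity, s)
    return shards + [parity]

def reconstruct(shards: list, lost: int) -> bytes:
    """Rebuild the shard at index `lost` by XOR-ing all surviving shards
    (works for a data shard or the parity shard, but only one loss)."""
    survivors = [s for i, s in enumerate(shards) if i != lost]
    rebuilt = survivors[0]
    for s in survivors[1:]:
        rebuilt = xor_bytes(rebuilt, s)
    return rebuilt

# A "repair mechanism" in miniature: detect a lost shard, rebuild it,
# and restore full redundancy before a second failure can cause data loss.
shards = encode(b"0123456789ab", k=4)
shards[2] = reconstruct(shards, lost=2)
```

The repair loop is the important part: as long as reconstruction runs faster than drives fail, redundancy stays ahead of hardware decay.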

Importantly, customers can participate in this durability chain through checksums. S3 now supports multiple checksum algorithms with HTTP trailer support, allowing end-to-end data integrity verification from client to storage—what the presenters call a "durable chain of custody."
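As a sketch of the client side of that chain of custody: S3's additional-checksum feature carries the checksum in headers such as `x-amz-checksum-crc32` (the 4-byte big-endian CRC32, base64-encoded); the SDKs compute this automatically when you request a checksum algorithm, but it can be done by hand with the standard library. Header name and encoding are as I understand the documented feature; verify against the S3 API reference before relying on this.

```python
import base64
import zlib

def crc32_checksum_header(body: bytes) -> dict:
    """Format a CRC32 of the payload the way S3's x-amz-checksum-crc32
    header carries it: 4-byte big-endian checksum, base64-encoded."""
    crc = zlib.crc32(body) & 0xFFFFFFFF
    encoded = base64.b64encode(crc.to_bytes(4, "big")).decode("ascii")
    return {"x-amz-checksum-crc32": encoded}

# Computed client-side before upload; S3 verifies it on receipt and can
# return it again on GET, closing the end-to-end integrity loop.
header = crc32_checksum_header(b"hello")
```

With trailer support, the same checksum can be sent after a streamed body, so the client does not need the whole object in memory up front.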

Performance: Embrace Horizontal Scaling

The performance section reveals a counter-intuitive insight: hard drive IOPS per terabyte is actually declining as capacity increases. S3 handles this through massive horizontal scaling and workload decorrelation. By spreading individual objects across many drives, each holding only tiny fragments, no single drive becomes a bottleneck, and workload spikes from many customers smooth out into predictable aggregate throughput.

The practical advice: parallelize everything. Use multiple connections and endpoints, leverage multipart uploads for large objects, and use byte-range GETs to read objects in parallel chunks. The presenters also explain S3's prefix-based scaling: each prefix supports at least 3,500 PUT/COPY/POST/DELETE and 5,500 GET/HEAD requests per second, and S3 scales beyond that automatically, though sudden traffic spikes may see brief 503 (Slow Down) pushback while it does. Adding entropy to key naming helps distribute load across prefixes immediately.
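Both pieces of advice reduce to small helpers. A minimal sketch, assuming an MD5-based prefix scheme of my own choosing (any stable hash works) and inclusive HTTP byte ranges:

```python
import hashlib

def byte_ranges(object_size: int, part_size: int) -> list:
    """HTTP Range header values for fetching an object in parallel chunks,
    e.g. 'bytes=0-8388607'. Ranges are inclusive on both ends."""
    return [
        f"bytes={start}-{min(start + part_size, object_size) - 1}"
        for start in range(0, object_size, part_size)
    ]

def entropic_key(key: str, fanout: int = 256) -> str:
    """Prepend a short hash-derived prefix (hypothetical scheme) so that
    writes spread across up to `fanout` S3 prefixes immediately, instead
    of hot-spotting one date- or sequence-based prefix."""
    shard = int(hashlib.md5(key.encode()).hexdigest(), 16) % fanout
    return f"{shard:02x}/{key}"

# Each range can then be fetched on its own connection and reassembled.
ranges = byte_ranges(object_size=100 * 1024 * 1024, part_size=8 * 1024 * 1024)
spread_key = entropic_key("logs/2022/app.log")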

Cost: Right-Sizing Storage Classes

Sally covers S3's storage class spectrum—from S3 Standard for hot data to Glacier Deep Archive for long-term cold storage. The key optimization tools are S3 Lifecycle rules for predictable access patterns and S3 Intelligent-Tiering for unknown or changing patterns. Intelligent-Tiering automatically moves objects between access tiers (including optional archive tiers) based on actual usage, eliminating guesswork.
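As an illustration of lifecycle rules versus Intelligent-Tiering, here is a sketch of a lifecycle configuration in the shape boto3's `put_bucket_lifecycle_configuration` accepts. Rule IDs, prefixes, and day counts are hypothetical; check storage-class minimum-duration constraints before copying.

```python
# One rule for a predictable access pattern (age-based transitions plus
# expiration), one that hands unpredictable data to Intelligent-Tiering.
lifecycle_config = {
    "Rules": [
        {
            "ID": "archive-logs",                  # hypothetical rule name
            "Status": "Enabled",
            "Filter": {"Prefix": "logs/"},
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
            "Expiration": {"Days": 365},
        },
        {
            "ID": "unknown-access-pattern",
            "Status": "Enabled",
            "Filter": {"Prefix": "data/"},
            # Day 0: let Intelligent-Tiering decide from actual usage.
            "Transitions": [{"Days": 0, "StorageClass": "INTELLIGENT_TIERING"}],
        },
    ]
}
```

The first rule encodes a guess about access patterns; the second delegates that guess to S3, which is the talk's recommendation when patterns are unknown or changing.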

S3 Storage Lens provides the visibility needed for cost decisions, offering 60+ metrics with up to 15 months of historical data, CloudWatch integration, and recommendations for cost efficiency and data protection.

Security: Identity-Based and Resource-Based Controls

The final section covers access control through IAM policies (attached to identities) and bucket policies (attached to resources). Cross-account access requires both policies working together. Key features highlighted include:

  • Object Ownership (Bucket Owner Enforced): Ensures bucket owners automatically own all objects, eliminating complications from cross-account writes
  • S3 Access Points: Decompose complex bucket policies into manageable per-group policies
  • Block Public Access: Account-level protection overriding individual bucket settings
  • Access Analyzer: Dashboard for reviewing external sharing and public access
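The cross-account point above can be made concrete with the resource-based half of the pair: a bucket policy granting another account read access. Account IDs and the bucket name are hypothetical, and the grant only works if account 222222222222 also attaches a matching identity-based IAM policy.

```python
import json

# Resource-based policy attached to the bucket in the owning account.
# Note ListBucket targets the bucket ARN while GetObject targets objects.
bucket_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowPartnerReadOnly",
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::222222222222:root"},
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::example-bucket",
                "arn:aws:s3:::example-bucket/*",
            ],
        }
    ],
}

# put_bucket_policy takes the policy as a JSON string.
policy_document = json.dumps(bucket_policy)
```

With Bucket Owner Enforced object ownership, objects the partner account writes would also belong to the bucket owner, avoiding the classic cross-account ACL tangle.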

For encryption, SSE-KMS with customer-managed keys provides the most control and CloudTrail visibility. S3 Bucket Keys reduce KMS costs by up to 99% for high-volume KMS-encrypted workloads.
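Wired together, that recommendation looks like the following parameter set for boto3's `put_object`. Bucket, key, and KMS key ARN are hypothetical; the parameter names are the real ones from the S3 API as I understand it.

```python
# SSE-KMS with a customer-managed key plus S3 Bucket Keys: the bucket-level
# data key is reused across objects, so per-object KMS API calls (and their
# cost) drop sharply for high-volume workloads.
put_params = {
    "Bucket": "example-bucket",
    "Key": "reports/q4.csv",
    "Body": b"col1,col2\n1,2\n",
    "ServerSideEncryption": "aws:kms",
    "SSEKMSKeyId": "arn:aws:kms:us-east-1:111111111111:key/hypothetical-key-id",
    "BucketKeyEnabled": True,
}
# Usage (requires credentials, not run here):
#   boto3.client("s3").put_object(**put_params)
```

Because the key is customer-managed, every decrypt still shows up in CloudTrail, which is the visibility argument the presenters make for SSE-KMS over SSE-S3.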

Key Concepts

  • Erasure Coding: Data redundancy technique splitting objects into data shards plus parity shards, allowing reconstruction from any sufficiently large subset of the shards
  • Annualized Failure Rate (AFR): Metric for reasoning about hardware failure rates, typically a low single-digit percentage for modern drives
  • Repair Mechanism: Continuous monitoring and restoration process that keeps redundancy ahead of hardware decay
  • Threat Model for Durability: Treating durability threats (hardware failure, bugs, operational errors) like security threats requiring iterative mitigation
  • Checksums and Chain of Custody: End-to-end data integrity verification from client through network to storage
  • Workload Decorrelation: Aggregating many customers' workloads smooths spikes into predictable throughput
  • Prefix Scaling: S3's automatic horizontal scaling based on key prefixes, with ~3.5k PUT/5.5k GET per second limits that auto-expand
  • S3 Intelligent-Tiering: Automatic cost optimization by moving objects between access tiers based on usage patterns
  • Block Public Access: Account or bucket-level setting that overrides individual permissions to prevent accidental public exposure
  • S3 Bucket Keys: Feature reducing KMS API calls by up to 99% for SSE-KMS encrypted objects