Twenty years of Amazon S3 and building what’s next



Twenty years ago today, on March 14, 2006, Amazon Simple Storage Service (Amazon S3) quietly launched with a modest one-paragraph announcement on the What’s New page:

Amazon S3 is storage for the Internet. It’s designed to make web-scale computing easier for developers. Amazon S3 provides a simple web services interface that can be used to store and retrieve any amount of data, at any time, from anywhere on the web. It gives any developer access to the same highly scalable, reliable, fast, inexpensive data storage infrastructure that Amazon uses to run its own global network of web sites.

Even Jeff Barr’s blog post was only a few paragraphs, written before catching a plane to a developer event in California. No code examples. No demo. Very low fanfare. Nobody knew at the time that this launch would shape our entire industry.

The early days: Building blocks that just work

At its core, S3 introduced two straightforward primitives: PUT to store an object and GET to retrieve it later. But the real innovation was the philosophy behind it: create building blocks that handle the undifferentiated heavy lifting, which freed developers to focus on higher-level work.
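To make the two primitives concrete, here is a minimal sketch of PUT and GET semantics using an in-memory dictionary. It is a toy model, not the S3 API: the `ToyObjectStore` class, its method names, and the bucket and key values are all hypothetical and exist only for illustration.

```python
class ToyObjectStore:
    """In-memory stand-in for a bucket/key object store (not the S3 API)."""

    def __init__(self):
        self._objects = {}  # maps (bucket, key) -> bytes

    def put(self, bucket: str, key: str, body: bytes) -> None:
        # PUT stores the whole object under its key, replacing any earlier value.
        self._objects[(bucket, key)] = bytes(body)

    def get(self, bucket: str, key: str) -> bytes:
        # GET returns the object exactly as stored, or fails if the key is unknown.
        try:
            return self._objects[(bucket, key)]
        except KeyError:
            raise KeyError(f"no such key: {bucket}/{key}") from None


store = ToyObjectStore()
store.put("photos", "2006/launch.txt", b"hello, S3")
retrieved = store.get("photos", "2006/launch.txt")
```

Everything a real service adds (authentication, durability, scaling) sits behind this same small surface, which is the point of the building-block philosophy.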

From day one, S3 was guided by five fundamentals that remain unchanged today.

Security means your data is protected by default. Durability is designed for 11 nines (99.999999999%), and we operate S3 to be lossless. Availability is designed into every layer, with the assumption that failure is always present and must be handled. Performance is optimized to store virtually any amount of data without degradation. Elasticity means the system automatically grows and shrinks as you add and remove data, with no manual intervention required.
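To get a feel for what 11 nines means, here is a small worked calculation. It assumes the figure is read as a per-object durability over one year, and the count of 10 million stored objects is a hypothetical example, not a number from this post.

```python
from fractions import Fraction

# 99.999999999% written as an exact fraction (11 nines).
durability = Fraction(99_999_999_999, 100_000_000_000)

# Assumption: the design target applies per object, per year.
annual_loss_probability = 1 - durability

stored_objects = 10_000_000  # hypothetical number of objects you store
expected_losses_per_year = stored_objects * annual_loss_probability
years_per_expected_loss = 1 / expected_losses_per_year

print(f"expected losses per year: {float(expected_losses_per_year)}")
print(f"one expected loss every {years_per_expected_loss} years")
```

Under those assumptions the design target works out to one expected loss per 10,000 years for 10 million objects. It is a sizing target rather than a prediction, because the service is operated to be lossless.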

When we get these things right, the service becomes so straightforward that most of you never have to think about how complex these concepts are.

S3 today: Scale beyond imagination

Throughout 20 years, S3 has remained committed to its core fundamentals even as it’s grown to a scale that’s hard to comprehend.

When S3 first launched, it offered approximately one petabyte of total storage capacity across about 400 storage nodes in 15 racks spanning three data centers, with 15 Gbps of total bandwidth. We designed the system to store tens of billions of objects, with a maximum object size of 5 GB. The initial price was 15 cents per gigabyte.

S3 key metrics illustration

Today, S3 stores more than 500 trillion objects and serves more than 200 million requests per second globally across hundreds of exabytes of data in 123 Availability Zones in 39 AWS Regions, for millions of customers. The maximum object size has grown from 5 GB to 50 TB, a 10,000-fold increase. If you stacked all of the tens of millions of S3 hard drives on top of each other, they’d reach the International Space Station and almost back.

Even as S3 has grown to support this incredible scale, the price you pay has dropped. Today, AWS charges slightly over 2 cents per gigabyte. That’s a price reduction of approximately 85% since launch in 2006. In parallel, we’ve continued to introduce ways to further optimize storage spend with storage tiers. For example, our customers have collectively saved more than $6 billion in storage costs by using Amazon S3 Intelligent-Tiering as compared to Amazon S3 Standard.
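The growth and pricing figures in this section can be checked with a few lines of arithmetic. The launch price and object sizes come from the text. The current price of 2.3 cents per gigabyte is an assumption standing in for "a little more than 2 cents", and sizes use decimal units (1 GB = 10^9 bytes).

```python
launch_price_cents = 15.0   # per gigabyte at launch, from the text
current_price_cents = 2.3   # assumption: a value a little above 2 cents

reduction_pct = (launch_price_cents - current_price_cents) / launch_price_cents * 100

launch_max_object_bytes = 5 * 10**9      # 5 GB
current_max_object_bytes = 50 * 10**12   # 50 TB
size_fold = current_max_object_bytes // launch_max_object_bytes

print(f"price reduction: {reduction_pct:.1f}%")
print(f"maximum object size increase: {size_fold:,}x")
```

With the assumed 2.3 cents, the reduction comes out just under 85%, in line with the approximate figure quoted above.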

Over the past two decades, the S3 API has been adopted and used as a reference point across the storage industry. Multiple vendors now offer S3-compatible storage tools and systems, implementing the same API patterns and conventions. This means skills and tools developed for S3 often transfer to other storage systems, making the broader storage landscape more accessible.

Despite all of this growth and industry adoption, perhaps the most remarkable achievement is this: the code you wrote for S3 in 2006 still works today, unchanged. Your data went through 20 years of innovation and technical advances. We migrated the infrastructure through multiple generations of disks and storage systems. All of the code to handle a request has been rewritten. But the data you stored 20 years ago is still available today, and we’ve maintained full API backward compatibility. That’s our commitment to delivering a service that continually “just works.”

The engineering behind the scale

What makes S3 possible at this scale? Continuous innovation in engineering.

Much of what follows is drawn from a conversation between Mai-Lan Tomsen Bukovec, VP of Data and Analytics at AWS, and Gergely Orosz of The Pragmatic Engineer. The in-depth interview goes further into the technical details for those who want to go deeper. In the following paragraphs, I share some examples:

At the heart of S3 durability is a system of microservices that continuously inspect every single byte across the entire fleet. These auditor services examine data and automatically trigger repair systems the moment they detect signs of degradation. S3 is designed to be lossless: the 11 nines design goal reflects how the replication factor and re-replication fleet are sized, but the system is built so that objects aren’t lost.
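The audit-and-repair idea can be sketched in a few lines. This is a toy model under stated assumptions, not how S3 is implemented: it keeps full copies of an object in a list, uses SHA-256 as the integrity check, and the `checksum` and `audit_and_repair` helpers are hypothetical names.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Return a hex digest used to detect silent corruption."""
    return hashlib.sha256(data).hexdigest()


def audit_and_repair(replicas: list[bytes], expected: str) -> int:
    """Overwrite replicas whose checksum differs from `expected`; return the repair count."""
    healthy = next((r for r in replicas if checksum(r) == expected), None)
    if healthy is None:
        raise RuntimeError("no healthy replica left to repair from")
    repaired = 0
    for i, replica in enumerate(replicas):
        if checksum(replica) != expected:
            replicas[i] = healthy  # restore from a copy that passed the audit
            repaired += 1
    return repaired


original = b"object bytes"
expected = checksum(original)
replicas = [original, b"object bytez", original]  # the second copy has silently degraded
repaired_count = audit_and_repair(replicas, expected)
```

Running the audit continuously means a degraded copy is replaced while healthy copies still exist, which is why the repair capacity matters as much as the number of copies.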

S3 engineers use formal methods and automated reasoning in production to mathematically prove correctness. When engineers check in code to the index subsystem, automated proofs verify that consistency hasn’t regressed. This same approach proves correctness in cross-Region replication or for access policies.

Over the past 8 years, AWS has been progressively rewriting performance-critical code in the S3 request path in Rust. Blob movement and disk storage have been rewritten, and work is actively ongoing across other components. Beyond raw performance, Rust’s type system and memory safety guarantees eliminate whole classes of bugs at compile time. This is an essential property when operating at S3 scale and correctness requirements.

S3 is built on a design philosophy: “Scale is to your advantage.” Engineers design systems so that increased scale improves attributes for all users. The larger S3 gets, the more de-correlated workloads become, which improves reliability for everyone.
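A small simulation shows why de-correlated workloads help. It is a sketch under simplifying assumptions, not a model of real S3 traffic: every workload is independent, idle most of the time, and bursts to 10 units of load with probability 0.1 on each tick. The `peak_to_mean` helper and all of its parameters are hypothetical.

```python
import random
import statistics


def peak_to_mean(num_workloads: int, ticks: int = 2000, seed: int = 0) -> float:
    """Peak-to-mean ratio of the aggregate load from independent bursty workloads."""
    rng = random.Random(seed)
    totals = []
    for _ in range(ticks):
        # Each workload independently bursts to 10 units with probability 0.1.
        total = sum(10 if rng.random() < 0.1 else 0 for _ in range(num_workloads))
        totals.append(total)
    return max(totals) / statistics.mean(totals)


small = peak_to_mean(num_workloads=5)
large = peak_to_mean(num_workloads=500)
print(f"5 workloads:   peak is {small:.1f}x the mean")
print(f"500 workloads: peak is {large:.1f}x the mean")
```

With only a few workloads the aggregate is spiky, so capacity has to be sized for peaks many times the average. With many independent workloads the bursts rarely line up, and the aggregate peak stays much closer to the mean.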

Looking forward

The vision for S3 extends beyond being a storage service to becoming the universal foundation for all data and AI workloads. Our vision is simple: you store any type of data one time in S3, and you work with it directly, without moving data between specialized systems. This approach reduces costs, eliminates complexity, and removes the need for multiple copies of the same data.

Here are a few standout launches from recent years:

  • S3 Tables – Fully managed Apache Iceberg tables with automated maintenance that optimize query efficiency and reduce storage cost over time.
  • S3 Vectors – Native vector storage for semantic search and RAG, supporting up to 2 billion vectors per index with sub-100ms query latency. In only 5 months (July–December 2025), you created more than 250,000 indices, ingested more than 40 billion vectors, and performed more than 1 billion queries.
  • S3 Metadata – Centralized metadata for fast data discovery, removing the need to recursively list large buckets for cataloging and significantly reducing time-to-insight for large data lakes.

Each of these capabilities operates at S3 cost structure. You can handle multiple data types that traditionally required expensive databases or specialized systems but are now economically feasible at scale.

From 1 petabyte to hundreds of exabytes. From 15 cents to 2 cents per gigabyte. From simple object storage to the foundation for AI and analytics. Through it all, our five fundamentals–security, durability, availability, performance, and elasticity–remain unchanged, and your code from 2006 still works today.

Here’s to the next 20 years of innovation on Amazon S3.

— seb
