CES 2026 showcases the arrival of the NVIDIA Rubin platform, along with Azure’s proven readiness for deployment.
Microsoft’s long-range datacenter strategy was engineered for moments exactly like this, where NVIDIA’s next-generation systems slot directly into infrastructure that has anticipated their power, thermal, memory, and networking requirements years ahead of the industry. Our long-term collaboration with NVIDIA ensures Rubin fits directly into Azure’s forward platform design.
Building with purpose for the future
Azure’s AI datacenters are engineered for the future of accelerated computing. That enables seamless integration of NVIDIA Vera Rubin NVL72 racks across Azure’s largest next-gen AI superfactories, from current Fairwater sites in Wisconsin and Atlanta to future locations.
The newest NVIDIA AI infrastructure requires significant upgrades in power, cooling, and performance optimization; however, Azure’s experience with our Fairwater sites and multiple upgrade cycles over the years demonstrates an ability to flexibly upgrade and expand AI infrastructure in line with advancements in technology.
Azure’s proven experience delivering scale and performance
Microsoft has years of market-proven experience in designing and deploying scalable AI infrastructure that evolves with every major advancement of AI technology. In lockstep with each successive generation of NVIDIA’s accelerated compute infrastructure, Microsoft rapidly integrates NVIDIA’s innovations and delivers them at scale. Our early, large-scale deployments of NVIDIA Ampere and Hopper GPUs, connected via NVIDIA Quantum-2 InfiniBand networking, were instrumental in bringing models like GPT-3.5 to life, while other clusters set supercomputing performance records, demonstrating we can bring next-generation systems online faster and with higher real-world performance than the rest of the industry.
We unveiled the first and largest implementations of both NVIDIA GB200 NVL72 and NVIDIA GB300 NVL72 platforms, with racks architected into single supercomputers that train AI models dramatically faster, helping Azure remain a top choice for customers seeking advanced AI capabilities.
Azure’s systems approach
Azure is engineered for compute, networking, storage, software, and infrastructure all working together as one integrated platform. This is how Microsoft builds a durable advantage into Azure and delivers cost and performance breakthroughs that compound over time.
Maximizing GPU utilization requires optimization across every layer. In addition to Azure being able to adopt NVIDIA’s new accelerated compute platforms early, Azure’s advantages come from the surrounding platform as well: high-throughput Blob storage, proximity placement and region-scale design shaped by real production patterns, and orchestration layers like CycleCloud and AKS tuned for low-overhead scheduling at massive cluster scale.
Azure Boost and other offload engines clear IO, network, and storage bottlenecks so models scale smoothly. Faster storage feeds larger clusters, stronger networking sustains them, and optimized orchestration keeps end-to-end performance steady. First-party innovations reinforce the loop: liquid cooling Heat Exchanger Units maintain tight thermals, Azure hardware security module (HSM) silicon offloads security work, and Azure Cobalt delivers exceptional performance and efficiency for general-purpose compute and AI-adjacent tasks. Together, these integrations ensure the entire system scales efficiently, so GPU investments deliver maximum value.
This systems approach is what makes Azure ready for the Rubin platform. We’re delivering new systems and establishing an end-to-end platform already shaped by the requirements Rubin brings.
Operating the NVIDIA Rubin platform
NVIDIA Vera Rubin Superchips will deliver 50 PF of NVFP4 inference performance per chip and 3.6 EF of NVFP4 per rack, a fivefold jump over NVIDIA GB200 NVL72 rack systems.
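The per-rack figure can be checked against the per-chip figure with simple arithmetic. The sketch below assumes 72 of the 50 PF units per rack, which is inferred from the "NVL72" name rather than stated in the text, and it derives the GB200 NVL72 baseline only from the quoted fivefold ratio, not from a published specification.

```python
# Figures quoted in the text
PF_PER_CHIP = 50.0        # NVFP4 inference petaFLOPS per chip
SPEEDUP_VS_GB200 = 5.0    # quoted rack-level jump over GB200 NVL72

# Assumption (not stated in the text): 72 such chips per NVL72 rack
CHIPS_PER_RACK = 72


def rack_exaflops(pf_per_chip: float, chips: int) -> float:
    """Aggregate per-chip petaFLOPS into rack-level exaFLOPS (1 EF = 1000 PF)."""
    return pf_per_chip * chips / 1000.0


rack_ef = rack_exaflops(PF_PER_CHIP, CHIPS_PER_RACK)

# Baseline implied by the quoted ratio alone; treat it as a derived estimate
implied_gb200_ef = rack_ef / SPEEDUP_VS_GB200

print(f"Rubin rack: {rack_ef:.1f} EF NVFP4")
print(f"Implied GB200 NVL72 baseline: {implied_gb200_ef:.2f} EF")
```

Under that assumption, 72 chips at 50 PF each reproduce the 3.6 EF per-rack number quoted above.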
Azure has already incorporated the core architectural assumptions Rubin requires:
- NVIDIA NVLink evolution: The sixth-generation NVIDIA NVLink fabric expected in Vera Rubin NVL72 systems reaches ~260 TB/s of scale-up bandwidth, and Azure’s rack architecture has already been redesigned to operate with these bandwidth and topology advantages.
- High-performance scale-out networking: The Rubin AI infrastructure relies on ultra-fast NVIDIA ConnectX-9 1,600 Gb/s networking, delivered by Azure’s network infrastructure, which has been purpose-built to support large-scale AI workloads.
- HBM4/HBM4e thermal and density planning: The Rubin memory stack demands tighter thermal windows and higher rack densities; Azure’s cooling, power envelopes, and rack geometries have already been upgraded to handle the same constraints.
- SOCAMM2-driven memory expansion: Rubin Superchips use a new memory expansion architecture; Azure’s platform has already integrated and validated similar memory extension behaviors to keep models fed at scale.
- Reticle-sized GPU scaling and multi-die packaging: Rubin moves to massively larger GPU footprints and multi-die layouts. Azure’s supply chain, mechanical design, and orchestration layers have been pre-tuned for these physical and logical scaling characteristics.
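To put the two bandwidth figures in the list above on a common footing, the sketch below converts them to comparable units. It assumes 72 GPUs per rack (inferred from the "NVL72" name), an even split of the ~260 TB/s across those GPUs, and decimal units with 8 bits per byte. None of these assumptions come from the text, so the results are rough orders of magnitude only.

```python
# Figures quoted in the text
NVLINK_RACK_TB_S = 260.0   # approximate scale-up bandwidth per rack, TB/s
NIC_GBIT_S = 1600.0        # ConnectX-9 scale-out link speed, Gb/s

# Assumption (not stated in the text): 72 GPUs share the rack evenly
GPUS_PER_RACK = 72


def per_gpu_scale_up(rack_tb_s: float, gpus: int) -> float:
    """Even-split estimate of scale-up bandwidth per GPU, in TB/s."""
    return rack_tb_s / gpus


def gbit_to_gbyte(gbit_s: float) -> float:
    """Convert gigabits per second to gigabytes per second (8 bits per byte)."""
    return gbit_s / 8.0


per_gpu_tb_s = per_gpu_scale_up(NVLINK_RACK_TB_S, GPUS_PER_RACK)
nic_gb_s = gbit_to_gbyte(NIC_GBIT_S)

# How many times wider the estimated per-GPU scale-up share is than one scale-out link
ratio = per_gpu_tb_s * 1000.0 / nic_gb_s

print(f"Scale-up per GPU (estimate): {per_gpu_tb_s:.2f} TB/s")
print(f"One scale-out link: {nic_gb_s:.0f} GB/s")
print(f"Scale-up to scale-out ratio: about {ratio:.0f}x")
```

The gap between the two tiers is one reason rack topology and scale-out network design are treated as separate planning problems in the list above.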
Azure’s approach to designing for next-generation accelerated compute platforms like Rubin has been proven over several years, with significant milestones:
- Operated the world’s largest commercial InfiniBand deployments across multiple GPU generations.
- Built reliability layers and congestion management techniques that unlock higher cluster utilization and larger job sizes than competitors, reflected in our ability to publish industry-leading large-scale benchmarks (for example, multi-rack MLPerf runs that competitors have never replicated).
- Co-designed AI datacenters with Grace Blackwell and Vera Rubin from the ground up to maximize performance and performance per dollar at the cluster level.
Design principles that differentiate Azure
- Pod exchange architecture: To enable fast servicing, Azure’s GPU server trays are designed to be quickly swappable without requiring extensive rewiring, improving uptime.
- Cooling abstraction layer: Rubin’s multi-die, high-bandwidth components require sophisticated thermal headroom that Fairwater already accommodates, avoiding expensive retrofit cycles.
- Next-gen power design: Vera Rubin NVL72 racks demand increasing watt density; Azure’s multi-year power redesign (liquid cooling loop revisions, CDU scaling, and high-amp busways) ensures immediate deployability.
- AI superfactory modularity: Microsoft, unlike other hyperscalers, builds regional supercomputers rather than singular megasites, enabling more predictable global rollout of new SKUs.
How co-design leads to user benefits
The NVIDIA Rubin platform marks a major step forward in accelerated computing, and Azure’s AI datacenters and superfactories are already engineered to take full advantage. Years of co-design with NVIDIA across interconnects, memory systems, thermals, packaging, and rack-scale architecture mean Rubin integrates directly into Azure’s platform without rework. Rubin’s core assumptions are already reflected in our networking, power, cooling, orchestration, and pod exchange design principles. This alignment gives customers immediate benefits: faster deployment, faster scaling, and faster impact as they build the next era of large-scale AI.
