Powering the Future of AI: Cisco’s Breakthroughs in Secure AI Networking with NVIDIA


AI calls for a fundamental rethinking of how we design, build, and secure data centers. Organizations are moving past the experimentation phase and are deploying massive AI clusters that require unprecedented network bandwidth, power, and security controls. Building these giga-scale environments isn’t just about adding more GPUs. It requires a holistic, deeply integrated architecture that ensures every component, from the silicon to systems to software and the operating model, works in perfect harmony.

Cisco is delivering the foundational infrastructure needed to make this a reality. By combining our networking expertise with advanced silicon, systems, optics, software, operating models, and security innovations, we’re providing enterprises, neoclouds, and sovereign cloud providers with the tools they need to deploy AI securely at scale. Cisco stands out as the only vendor capable of delivering a true turnkey solution, seamlessly tailored to meet the demands of customers at every scale.

Fully Integrated Stack: Nexus Dashboard for on premises and Hyperfabric for cloud management. Includes Silicon, Systems, Optics, Software, and Security and Observability.
Figure 1: Unified management plane with Cisco Nexus One

Through our partnership with NVIDIA, we’re pushing the boundaries of what AI networks can achieve, focusing on three critical pillars: infrastructure, networking, and security.

Built for AI factories

The transition to giga-scale AI requires hardware capable of handling massive data throughput with minimal latency. Building on the expansion of the recently announced N9300 Silicon One G300 scale-out and 51.2T P200 scale-across systems, Cisco is raising the bar to meet these intense demands.

We’re thrilled to introduce the new Cisco N9100 Series Switch, N9164F-NS6, powered by NVIDIA Spectrum-6 silicon. This 102.4T switch delivers a massive leap in data center capacity to support next-generation secure AI factories. We’re also making the N9100 switches available with NVIDIA Spectrum-4 silicon, giving enterprises, neoclouds, and sovereign cloud providers flexible, high-performance deployment options.

We believe in the power of silicon diversity. This strategy lets you choose the exact technology that matches your specific performance and operational needs. To keep implementation simple, we ensure full reference architecture compliance. This streamlines your deployment and ensures smooth integration into complex data center environments.

Cisco keeps raising the performance ceiling, making advanced AI infrastructure faster and easier to deploy. We introduced the N9100 switches, powered by NVIDIA Spectrum-6 Ethernet silicon, to handle the immense scale required by next-generation secure AI factories. This 100% liquid-cooled 102.4T switch represents a major leap forward in AI data center capacity.

In addition, the N9100 switch, powered by NVIDIA Spectrum-4 silicon, is also now available, further expanding deployment options for enterprises, neoclouds, and sovereign cloud providers.

102.4T Cisco N9100 Scale Out Fabric, powered by NVIDIA Spectrum-6 Ethernet. Features N9164F-NS6, 100% liquid cooling, and Cisco Secure AI Factory.
Figure 2: Cisco N9164F-NS6

To make deploying these complex systems easier, we integrated Nexus One support with cloud-managed Cisco Nexus Hyperfabric. This creates a turnkey, full-stack AI solution. AI builders no longer have to piece together disparate components and hope they function well. Nexus Hyperfabric also now manages the newly available Cisco N9164E-NS4-O, powered by NVIDIA Spectrum-4 Ethernet silicon. It offers pods of plug-and-play data center fabrics, managed entirely through the cloud, dramatically reducing the time it takes to go from procurement to full operation. Nexus One also includes a robust, on-premises managed Nexus Dashboard, a proven solution for data center operations, automation, and AI networking.

Flexibility remains crucial when designing these environments. Organizations have different needs based on their size, security requirements, and existing infrastructure. To accommodate this, we offer robust reference architectures tailored to specific deployment types:

  • Cisco Cloud Reference Architecture (CRA): For enterprises and neoclouds deploying specialized AI servers with more than 1,000 GPUs and up to 32,000 GPUs, the Cisco CRA provides a highly optimized, scalable path forward using industry-leading Cisco Silicon One and Cisco N9300 Series Switches. For deployments of fewer than 1,000 GPUs, we have the ready-to-consume Cisco Enterprise Reference Architecture (ERA).
  • Compliant with NVIDIA Cloud Partner Reference Design: For large-scale AI infrastructures ranging from 1,000 to 32,000 GPUs, the Cisco N9100 Series Switches fully comply with the NVIDIA Cloud Partner (NCP) Reference Architecture. This ensures ultimate performance for sovereign and neocloud deployments, utilizing scale-out fabrics powered by Spectrum-X Ethernet switch silicon.
Comparison of Cisco Enterprise Reference Architecture (ERA) for <1K GPU AI servers and Cloud Reference Architecture (CRA) for 1K-32K GPU AI servers. The chart details Cisco Silicon One, N9300/N9100 series switch usage, and NVIDIA Spectrum-X Ethernet silicon requirements for each architecture.
Figure 3: Cisco Reference Architectures

By offering a choice between NCP Reference Design and Cisco CRA architectures, we ensure that every customer has a proven, validated blueprint for success, whether they’re building a massive sovereign cloud or a highly targeted enterprise AI cluster.
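To make the sizing guidance above concrete, here is a minimal sketch of the decision rule in Python. It is an illustration only, not a Cisco tool: the function name and enum are hypothetical, and treating exactly 1,000 GPUs as part of the 1,000 to 32,000 range is an assumption based on the figure's "1K-32K" label.

```python
from enum import Enum


class ReferenceArchitecture(Enum):
    ERA = "Cisco Enterprise Reference Architecture"
    CRA = "Cisco Cloud Reference Architecture"
    NCP = "NVIDIA Cloud Partner Reference Architecture (Cisco N9100)"


def candidate_architectures(gpu_count: int) -> list[ReferenceArchitecture]:
    """Return the reference architectures the article names for a cluster size.

    Hypothetical helper: the thresholds come from the article, but the
    handling of exactly 1,000 GPUs is an assumption.
    """
    if gpu_count < 1:
        raise ValueError("gpu_count must be a positive integer")
    if gpu_count < 1000:
        # Fewer than 1,000 GPUs: the ready-to-consume ERA.
        return [ReferenceArchitecture.ERA]
    if gpu_count <= 32000:
        # 1,000 to 32,000 GPUs: either the Cisco CRA or the NCP-compliant design.
        return [ReferenceArchitecture.CRA, ReferenceArchitecture.NCP]
    # The article does not describe designs beyond 32,000 GPUs.
    return []
```

For example, `candidate_architectures(512)` returns only the ERA, while `candidate_architectures(8192)` returns both the CRA and the NCP-compliant option, leaving the choice of silicon to the operator.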

Securing the AI workload at the edge

As AI clusters grow in power and complexity, they become highly attractive targets for malicious actors. Traditional perimeter security models fail in these environments. Unprotected east-west traffic within the AI fabric allows lateral threats to spread rapidly. A compromised AI workload could lead to GPU resource hijacking or massive data exfiltration, resulting in a full lateral blast radius. However, running traditional security agents directly on the AI servers taxes the CPU and GPU, draining the very compute resources you’re trying to maximize.

To solve this, we’re fundamentally shifting where and how security is enforced. We extended Cisco Hybrid Mesh Firewall support directly to NVIDIA BlueField DPUs.

Network diagram: Nexus One and Hybrid Mesh Firewall secure AI workloads (VPC 1, VPC 2) on an AI server. NVIDIA BlueField security policies block a lateral attack.
Figure 4: AI workload security for front-end fabric servers

This innovation brings security as close to the workload as possible without compromising performance. By embedding the firewall on the DPU, we deliver integrated inline security with high-performance scalability. Administrators can define security policies once and enforce them everywhere, isolating virtual private clouds (VPCs) and blocking lateral attacks in real time. This provides choke-point-free enforcement, protecting front-end fabric servers while leaving the host CPU and GPU entirely dedicated to processing AI workloads.
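The "define once, enforce everywhere" idea can be sketched as a single policy object that every enforcement point consults. The Python below is a conceptual model only, not the Hybrid Mesh Firewall or BlueField API: all class names are hypothetical, and the default-deny rule between VPCs (with traffic inside one VPC allowed) is an assumption chosen to illustrate VPC isolation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    name: str
    vpc: str


@dataclass(frozen=True)
class AllowRule:
    # Explicit exception to the default-deny rule between VPCs.
    src_vpc: str
    dst_vpc: str
    dst_port: int


class IsolationPolicy:
    """One policy definition shared by every (hypothetical) enforcement point."""

    def __init__(self, rules: list[AllowRule]) -> None:
        self._rules = frozenset(rules)

    def permits(self, src: Workload, dst: Workload, dst_port: int) -> bool:
        # Assumption: traffic inside a single VPC is allowed.
        if src.vpc == dst.vpc:
            return True
        # Cross-VPC traffic is denied unless an explicit rule allows it.
        return AllowRule(src.vpc, dst.vpc, dst_port) in self._rules
```

With a policy such as `IsolationPolicy([AllowRule("vpc-1", "vpc-2", 443)])`, a workload in `vpc-1` can reach port 443 in `vpc-2`, but an attempt to reach port 22 there is rejected at the enforcement point rather than on the host CPU.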

To learn more about how we’re protecting AI environments from lateral threats without sacrificing performance, read our detailed technical breakdown: Cisco secures AI infrastructure with NVIDIA BlueField DPUs.

Advancing AI networking for peak performance

Even the most powerful GPUs will sit idle if the network can’t feed them data fast enough. Maximizing GPU utilization and optimizing key-value (KV) cache performance requires intelligent, high-speed connectivity across the entire fabric. This is where Cisco Nexus One fundamentally changes the game for AI networking.

Nexus One provides a unified management plane across both NX-OS and SONiC environments with an on-premises managed Nexus Dashboard and cloud-managed Nexus Hyperfabric as operational models.

Nexus Dashboard delivers unprecedented visibility into the full-stack AI environment with Cisco N9000 Series Switches. Administrators gain access to deep AI job-monitoring capabilities with Nexus Dashboard 4.2. Track exactly how data moves through the fabric, identifying bottlenecks before they impact training times. The system features intelligent, auto-adjusting load balancing and telemetry-based congestion control. Instead of relying on static routing protocols that break down under the unique, elephant-flow traffic patterns of AI workloads, the network dynamically adjusts to ensure optimal data delivery.
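Why static path selection struggles with elephant flows can be shown with a toy model. The Python sketch below is not Cisco's load-balancing algorithm: it compares a fixed hash (standing in for static, hash-based path selection) against a simple load-aware placement that always picks the least-loaded link. The flow names, sizes, and link counts are made up for illustration.

```python
import zlib

Flow = tuple[str, int]  # (flow identifier, size in arbitrary units)


def static_hash_placement(flows: list[Flow], num_links: int) -> list[int]:
    """Pin each flow to a link by hashing its identifier, ignoring link load."""
    load = [0] * num_links
    for flow_id, size in flows:
        link = zlib.crc32(flow_id.encode()) % num_links
        load[link] += size
    return load


def load_aware_placement(flows: list[Flow], num_links: int) -> list[int]:
    """Place each flow on whichever link currently carries the least traffic."""
    load = [0] * num_links
    for _, size in flows:
        link = load.index(min(load))
        load[link] += size
    return load


# Eight equally large flows across four links (hypothetical values).
flows = [(f"gpu{i}->gpu{i + 8}", 100) for i in range(8)]
static_load = static_hash_placement(flows, 4)
adaptive_load = load_aware_placement(flows, 4)
```

With a small number of large flows, the hash can easily pin several of them to the same link while others sit idle, whereas the load-aware placement spreads these eight equal flows evenly at two per link. Real fabrics make this decision from live telemetry rather than a running counter, but the imbalance being avoided is the same.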

Furthermore, Nexus Dashboard enables real-time GPU health monitoring. Network operators can see directly into the performance metrics of the compute layer, ensuring that every expensive GPU resource operates at maximum efficiency. Whether you’re building a scale-out network within a single data center or a scale-across network connecting multiple facilities with high-performance, low-latency links, Nexus Dashboard ensures the network acts as a powerful accelerator rather than a bottleneck.

Empowering the next generation of innovation

The AI revolution requires more than just raw processing power. It demands a fully integrated ecosystem where infrastructure, networking, and security are designed to operate as a single, cohesive unit.

Through our continued innovations with the Spectrum-X-powered Cisco N9100 Series Switches, the intelligent management of Nexus One, and the edge-enforced security of the Cisco Hybrid Mesh Firewall on BlueField DPUs, Cisco provides the foundation that enterprises, neoclouds, and sovereign clouds need to scale their AI ambitions. We’re building the networks that will power the next decade of discovery, ensuring they’re fast, reliable, and fundamentally secure.
