Disassembling NVIDIA's 1.6T Network Module
Inside NVIDIA’s 1.6Tbps Cedar Module: A deep dive into DGX H100’s custom 400Gbps networking & cooling design for AI-scale bandwidth.
NVIDIA's move from the A100 to the H100 generation brought a shift from PCIe Gen4 to PCIe Gen5. Gen5 doubles the per-lane transfer rate of Gen4, which gives a x16 link enough bandwidth to move from 200Gbps networking to 400Gbps networking.
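The bandwidth claim can be checked with some rough arithmetic. The sketch below is not from NVIDIA; it assumes a x16 link, uses the published per-lane transfer rates (16 GT/s for Gen4, 32 GT/s for Gen5), and accounts only for 128b/130b line encoding, ignoring packet and protocol overhead. The helper names `pcie_gbps` and `can_feed` are made up for illustration.

```python
# Per-lane signalling rates in GT/s from the PCIe specifications.
GT_PER_LANE = {"Gen4": 16.0, "Gen5": 32.0}

# Both generations use 128b/130b line encoding.
ENCODING_EFFICIENCY = 128 / 130


def pcie_gbps(gen: str, lanes: int = 16) -> float:
    """Approximate one-direction link bandwidth in Gbps.

    Assumption: only line-encoding overhead is counted, so real
    throughput will be somewhat lower than this figure.
    """
    return GT_PER_LANE[gen] * lanes * ENCODING_EFFICIENCY


def can_feed(gen: str, nic_gbps: float, lanes: int = 16) -> bool:
    """True if the link can, on paper, keep a NIC port of nic_gbps busy."""
    return pcie_gbps(gen, lanes) >= nic_gbps


if __name__ == "__main__":
    for gen in ("Gen4", "Gen5"):
        print(f"{gen} x16: ~{pcie_gbps(gen):.0f} Gbps, "
              f"feeds 200G: {can_feed(gen, 200)}, "
              f"feeds 400G: {can_feed(gen, 400)}")
```

Under these assumptions a Gen4 x16 link tops out at roughly 252Gbps, enough for a 200Gbps port but not a 400Gbps one, while a Gen5 x16 link reaches roughly 504Gbps.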
The NVIDIA DGX H100 adopts a different networking approach, specifically abandoning traditional PCIe cards in favor of a module called "Cedar."
Each Cedar module is equipped with four ConnectX-7 controllers, each providing 400Gbps of network bandwidth.
The DGX H100 carries two Cedar modules. Each has four ConnectX-7 controllers running at 400Gbps, so one module delivers 1.6Tbps and the pair gives the system a total fabric bandwidth of 3.2Tbps.
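The totals follow directly from the per-controller figure. As a minimal sketch of that arithmetic (the `CedarModule` class and `system_tbps` function are hypothetical names used only for illustration, not part of any NVIDIA tooling):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CedarModule:
    """One Cedar module as described in the text (illustrative model)."""
    controllers: int = 4             # ConnectX-7 controllers per module
    gbps_per_controller: int = 400   # network bandwidth per controller

    @property
    def gbps(self) -> int:
        """Aggregate bandwidth of the module in Gbps."""
        return self.controllers * self.gbps_per_controller


def system_tbps(modules: list[CedarModule]) -> float:
    """Total bandwidth across all modules, in Tbps."""
    return sum(m.gbps for m in modules) / 1000


if __name__ == "__main__":
    dgx_h100 = [CedarModule(), CedarModule()]  # two modules per system
    print(f"per module: {dgx_h100[0].gbps / 1000} Tbps")
    print(f"system:     {system_tbps(dgx_h100)} Tbps")
```

Four controllers at 400Gbps give the 1.6Tbps in the module's name, and two modules give the 3.2Tbps system figure.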