Inside the 100K GPU xAI Colossus Cluster that Supermicro Helped Build for Elon Musk
The basic building block for Colossus is the Supermicro liquid-cooled rack. Each rack comprises eight 4U servers, each with eight NVIDIA H100 GPUs, for a total of 64 GPUs per rack. These eight GPU servers, plus a Supermicro Coolant Distribution Unit (CDU) and associated hardware, make up one GPU compute rack.
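As a rough illustration of the rack arithmetic above, here is a minimal Python sketch. The per-server and per-rack counts come from the text, and the 100,000 GPU total comes from the headline. The function names are hypothetical, and the assumption that every GPU sits in an identical, fully populated rack is ours, not something the article states.

```python
import math

# Figures taken from the text: eight 4U servers per rack, eight H100 GPUs per server.
SERVERS_PER_RACK = 8
GPUS_PER_SERVER = 8


def gpus_per_rack(servers_per_rack=SERVERS_PER_RACK, gpus_per_server=GPUS_PER_SERVER):
    """Return the number of GPUs in one fully populated compute rack."""
    return servers_per_rack * gpus_per_server


def racks_needed(total_gpus, per_rack=None):
    """Return how many racks are needed to hold total_gpus.

    Assumption (not from the article): every rack is identical and full,
    so the count is rounded up to whole racks.
    """
    if per_rack is None:
        per_rack = gpus_per_rack()
    return math.ceil(total_gpus / per_rack)


if __name__ == "__main__":
    print("GPUs per rack:", gpus_per_rack())
    print("Racks for 100,000 GPUs:", racks_needed(100_000))
```

Under these assumptions, a 100,000 GPU cluster works out to a little over 1,500 GPU compute racks. The real count depends on how the site is actually laid out.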
Supermicro’s motherboard integrates the four Broadcom PCIe switches used in almost every HGX AI server today, rather than placing them on a separate board, and a custom liquid cooling block cools all four switches. Other AI servers in the industry start as air-cooled designs and have liquid cooling added afterward. Supermicro’s design is liquid-cooled from the ground up, and it all comes from one vendor.
At the bottom of the rack, we have the CDUs, or coolant distribution units. These CDUs are like giant heat exchangers. In each rack, a fluid loop feeds all of the GPU servers. We say fluid, not water, because these loops usually need fluid tuned to the materials found in the liquid cooling blocks, tubes, manifolds, and so forth.
Each CDU has redundant pumps and power supplies, so that if a pump or a power supply fails, it can be replaced in the field without shutting down the entire rack.
French energy equipment giant Schneider Electric has removed CEO Peter Herweck. The company, a significant supplier of the data center industry, said that it had appointed long-time employee Olivier Blum as the new chief executive, effective immediately.
In a statement, Schneider said that its Board of Directors "decided to remove from office Peter Herweck as chief executive officer due to divergences in the execution of the company roadmap at a time of significant opportunities."
Schneider's share price last month hit an all-time high, and last week it reiterated its full-year financial targets that benefit from both the AI data center boom and a wider shift to renewable energy.
The government of Singapore has committed S$270 million (US$204m) to develop the country’s supercomputing infrastructure and bolster its National Supercomputing Centre (NSCC).
Provided by the NRF, the grant will support the development of the successor to the Aspire 2A+, which is expected to be operational in the latter half of 2025, and will allow researchers to explore the integration capabilities of classical supercomputers and quantum computers.
In July 2024, the Singaporean government signed a Memorandum of Understanding with Quantinuum to provide Singaporean institutions with access to Quantinuum’s H-Series and Helios quantum computers and collaborate on quantum computing use cases.
xAI to double Colossus compute capacity, reveals cluster uses Nvidia Spectrum-X Ethernet
The xAI Colossus supercomputer in Memphis, Tennessee, is set to double in capacity. In separate statements, both Nvidia and xAI CEO Elon Musk announced that the facility is in the process of adding another 100,000 Nvidia Hopper GPUs to the cluster.
In addition to the expansion, the two companies also revealed that rather than using Nvidia InfiniBand for its network interconnect, the cluster relies on the Nvidia Spectrum-X Ethernet networking platform for its Remote Direct Memory Access (RDMA) network.
In a statement, Nvidia said the platform has been designed to deliver “superior performance to multi-tenant, hyperscale AI factories,” with the system maintaining 95 percent data throughput, as enabled by Spectrum-X, in addition to experiencing zero application latency degradation or packet loss due to flow collisions across all three tiers of the network fabric.
Equinix plans two data centers in Bangkok, Thailand
Equinix this week announced an intention to invest approximately $500 million in Thailand in phases over the next ten years, including the recent acquisition of land in Bangkok for approximately $34m.
The newly acquired land in the Bangna area of Bangkok, covering over 18,700 sqm (201,285 sq ft), will be used to establish two Equinix International Business Exchange (IBX) data centers, providing more than 3,375 cabinets at full build-out.
The move is reportedly in response to the “increasing needs of both enterprises and major cloud service providers,” driven by Thailand’s proximity to Cambodia, Laos, Myanmar, and Vietnam and the government’s “proactive” Cloud First Policy.