Nvidia's Blackwell AI Chips Face Overheating Issues in Servers
Nvidia's highly anticipated Blackwell AI chips are reportedly encountering overheating problems within server racks, potentially delaying the deployment of new data centers for some customers. Reuters, citing The Information, reported that this issue adds to previous delays experienced by the chipmaker.
The reports, citing sources familiar with the matter, state that the Blackwell GPUs overheat when multiple chips are connected within server racks designed to hold up to 72 chips. Nvidia has reportedly requested design changes from its suppliers multiple times to address the overheating problem.
"Nvidia is working with leading cloud service providers as an integral part of our engineering team and process. The engineering iterations are normal and expected," an Nvidia spokesperson told Reuters in a statement.
The Blackwell chips, unveiled in March 2024, were initially slated to ship in the second quarter of that year but were delayed. These delays could affect major customers such as Meta Platforms, Google, and Microsoft, all of which are investing heavily in AI infrastructure.
Nvidia's Blackwell chip joins two squares of silicon into a single component and is said to be 30 times faster than its predecessor at tasks like generating chatbot responses. However, the overheating issue could slow the rollout and adoption of the new chips.