Back in March, NVIDIA announced its new Blackwell platform, which the company says reduces cost and energy consumption for large language model inference by up to 25 times compared with its predecessor. Major cloud providers, server makers, and leading AI companies, including Amazon Web Services, Dell Technologies, Google, Meta, Microsoft, OpenAI, Oracle, Tesla, and xAI, are expected to adopt Blackwell when it becomes available.
Due to a design flaw, the Blackwell platform's release was delayed by up to three months. Recently, Microsoft shared on X that it has already received NVIDIA GB200 Blackwell chips and is optimizing its servers for them, taking advantage of NVIDIA's InfiniBand networking and closed-loop liquid cooling.
Microsoft Azure is the 1st cloud running @nvidia's Blackwell system with GB200-powered AI servers. We're optimizing at every layer to power the world's most advanced AI models, leveraging Infiniband networking and innovative closed loop liquid cooling. Learn more at MS Ignite. pic.twitter.com/K1dKbwS2Ew -- Microsoft Azure (@Azure)
October 8, 2024
Microsoft CEO Satya Nadella also posted about the GB200 deployment:
Our long-standing partnership with NVIDIA and deep innovation continues to lead the industry, powering the most sophisticated AI workloads. https://t.co/qaEoSv8dm5 -- Satya Nadella (@satyanadella)
October 8, 2024
NVIDIA recently sent one of the first engineering builds of the DGX B200 to the OpenAI team as well:
Look what showed up at our doorstep.
Thank you to @nvidia for delivering one of the first engineering builds of the DGX B200 to our office. pic.twitter.com/vy8bWUEwUi -- OpenAI (@OpenAI)
October 8, 2024
Given the long list of prospective customers for NVIDIA's Blackwell platform, it makes sense that Microsoft and OpenAI are among the first recipients: unlike other major cloud providers such as Google and AWS, they rely entirely on NVIDIA hardware for AI training. Google trains most of its models on its own Tensor Processing Units (TPUs) and even offers TPU resources to its cloud customers, while AWS has developed its own chips for training and inference. Microsoft and OpenAI's complete dependence on NVIDIA likely makes them among NVIDIA's largest customers.
Microsoft is expected to share more details about its NVIDIA GB200 deployment at its Ignite conference in November. With its impressive performance and efficiency gains, Blackwell could become the go-to solution for large language model training, further solidifying NVIDIA's dominance in the AI hardware market.