xAI CEO Elon Musk has spent months working to take the lead in the AI race. He recently took a significant step in that direction by launching a massive cluster of 100,000 Nvidia GPUs, which will be crucial for training his next AI model.
Musk's AI obsession. After parting ways with OpenAI, the tech billionaire founded his own AI company, xAI, in 2023 to enter the market. By the end of that year, he had launched Grok, a model with a sarcastic edge designed to compete with ChatGPT. He later attempted to win over developers by releasing most of the project as open source.
100,000 Nvidia H100 GPUs. On Monday, Musk announced the launch of the “Memphis Supercluster,” which began operations with 100,000 liquid-cooled Nvidia H100 graphics cards. In his words, it’s the “most powerful AI training cluster in the world.” The project was carried out in collaboration with Supermicro, whose CEO, Charles Liang, congratulated Musk in a reply to the X post announcing the launch.
Demand for graphics cards. The CEO of Tesla and SpaceX has been purchasing these cards for months, both for xAI and to train Tesla’s autonomous driving systems, although some of them were reportedly re-allocated to X.
New xAI model in December. Musk went on to write, “This is a significant advantage in training the world’s most powerful AI by every metric,” adding that it’ll be available “by December this year.” He’s probably referring to Grok 3, the third generation of a model that, for the moment, still lacks the popularity of its competitors.
Musk didn’t want to wait. In May, The Information reported that Musk was preparing a “Gigafactory of Compute,” a giant supercomputer that aims to use 100,000 Nvidia H100 GPUs. Musk reportedly considered using the new B200 cards in this endeavor, but seems to have decided against it despite the theoretical gain in power and efficiency.
This “supercomputer” could claim the top spot on the TOP500 list. The massive Memphis Supercluster could emerge as the undisputed leader of the TOP500, the project that benchmarks and ranks the world’s supercomputers, at least in terms of GPU count and raw compute.
The world’s most powerful supercomputers don’t have as many GPUs: Frontier has 37,888 AMD GPUs, Aurora has 60,000 Intel GPUs, and Microsoft’s Eagle has 14,400 Nvidia GPUs. All of them are overshadowed by the new “monster.” However, it’s uncertain whether the cluster’s specific focus on training AI models will secure it a place in upcoming editions of the renowned list.
This article was written by Javier Pastor and originally published in Spanish on Xataka.
Image | xAI