Microsoft built a supercomputer for OpenAI to train large AI models

To train huge AI models, Microsoft has built a supercomputer for the artificial intelligence (AI) research startup OpenAI. The company wired together thousands of Nvidia A100 graphics chips to power the ChatGPT and Bing AI chatbots.

The Windows maker promised to build a “massive, cutting-edge supercomputer” as part of its billion-dollar investment in OpenAI.

Why did Microsoft build a supercomputer?

The purpose of the supercomputer is to provide the computational power required to train and retrain a growing number of AI models on massive amounts of data over extended periods.

Nidhi Chappell, Microsoft’s product manager for Azure High Performance Computing and Artificial Intelligence, said: “One of the things we learned from the research is that the bigger the model, the more data you have, and the longer you can train, the better the accuracy of the model is.”

She went on to say that in addition to having the best infrastructure, you also need to be able to run it reliably over a long period of time. “There was also a tremendous push to train larger models for a longer period.”

At its Build developer conference in 2020, Microsoft said it was building a supercomputer for OpenAI that would be hosted on Azure and used exclusively for training the startup’s AI models.

Greg Brockman, President and Co-Founder of OpenAI, said: “Co-designing supercomputers with Azure has been crucial for scaling our demanding AI training needs, enabling our research and alignment work on systems like ChatGPT.”

What is the Microsoft supercomputer’s architecture?

To avoid power outages, Microsoft modified the way it racked its servers, and it connected tens of thousands of Nvidia A100 graphics chips together to train AI models.

This is how ChatGPT became possible: it is one model that resulted from this infrastructure, and according to Chappell, quoted by Bloomberg, there will be many more.

Scott Guthrie, Microsoft’s executive vice president responsible for cloud and AI, said the cost of the project is “definitely more” than several hundred million dollars.

Microsoft Azure is getting more power

In addition to a supercomputer, OpenAI requires a powerful cloud architecture to process data and train models. Microsoft is already working to enhance the AI capabilities of the Azure cloud with new virtual machines that use Quantum-2 InfiniBand networking and Nvidia H100 and A100 Tensor Core GPUs.

With this, Microsoft will enable OpenAI and other companies that rely on Azure to train larger and more complex AI models. Some companies are already deploying AI chatbots.

OpenAI was one of the first proof points demonstrating the need for special-purpose clusters built to support huge training workloads. Eric Boyd, corporate vice president of Azure AI at Microsoft, said: “We worked closely with them to identify the important things they were looking for as they built their training environments and the essential things they needed.”
