Microsoft has a new supercomputer — and it wants to use it to make AI that can talk like humans do
May 19, 2020, 22:47 IST
- Microsoft has a new supercomputer that is going to help with multitasking AI models.
- The company claims it’s the first supercomputer to use cloud technology to fine-tune artificial intelligence (AI) models.
- It has more than 285,000 CPU cores and 10,000 GPUs, with 400 gigabits per second of network connectivity for each GPU server.
The computer will be linked to Microsoft’s own Azure infrastructure to train large artificial intelligence (AI) models, the company announced during its annual Microsoft Build 2020 conference.
The supercomputer is a single system with more than 285,000 CPU cores and 10,000 GPUs. On top of that, each GPU server has 400 gigabits per second of network connectivity.
The supercomputer has been built in collaboration with, and exclusively for, the US-based OpenAI, whose machine learning (ML) algorithms have been doubling in efficiency every 16 months — outpacing even Moore’s law — according to a yet-to-be-peer-reviewed paper by Danny Hernandez and Tom Brown.
“We are seeing that larger-scale systems are an important component in training more powerful models,” OpenAI CEO Sam Altman said.
Single-task AI models vs multitasking AI models
Historically, data scientists have built smaller AI models that use labelled examples to learn a single task — like translating between languages, recognising objects, or delivering the day’s weather report.
In a multitasking AI model, one model can meet more than one end goal. For instance, if a model can understand the nuances of language — like the human intent behind a statement — the same algorithm can be used to analyse billions of pages of text, moderate chat content, or even generate code.
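To make that concrete, here is a minimal sketch of the idea in Python: one pretrained language model, prompted differently for each job. It uses Hugging Face’s transformers library and the small public gpt2 checkpoint purely as illustrative stand-ins; neither is the model Microsoft is describing.

```python
# One pretrained language model, prompted differently, serves several
# tasks. The library (Hugging Face transformers) and the `gpt2`
# checkpoint are illustrative assumptions, not Microsoft's models.
from transformers import pipeline

# Load a single general-purpose language model once.
generator = pipeline("text-generation", model="gpt2")

prompts = {
    "summarise": "Summarise: The meeting covered revenue and hiring. TL;DR:",
    "moderate":  "Is this message polite or rude? Message: 'Thanks so much!' Answer:",
    "continue":  "The weather tomorrow in Mumbai will be",
}

# The same weights handle every task; only the prompt changes.
for task, prompt in prompts.items():
    out = generator(prompt, max_new_tokens=20, do_sample=False)[0]["generated_text"]
    print(f"[{task}] {out}\n")
```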
“This is about being able to do a hundred exciting things in natural language processing at once and a hundred exciting things in computer vision, and when you start to see combinations of these perceptual domains, you’re going to have new applications that are hard to even imagine right now,” said Microsoft Chief Technology Officer Kevin Scott, explaining how applying the supercomputer to AI will usher in a new class of multitasking AI models.
The recipes for training AI models will soon be open-source
Microsoft’s Turing model for natural language generation (T-NLG) — the largest publicly available AI language model in the world, with over 17 billion parameters — was released earlier this year. Today, the company announced that it will begin open-sourcing its Turing models, along with how-to guides on training them using Azure Machine Learning.
“This has enabled things that were seemingly impossible with smaller models,” said Luis Vargas, a Microsoft partner technical advisor who is spearheading the company’s AI at Scale initiative.
This way, not just Microsoft but other developers as well will be able to use the models to improve language understanding across their products.
To help that along, DeepSpeed — an open-source deep-learning library for PyTorch, a scientific computing package — which helps reduce the computing power required for large distributed model training, has also gotten an upgrade. According to Microsoft, the library can now train models that are 15 times larger, and do so 10 times faster, than the version of DeepSpeed released three months ago could.
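The article itself stops short of code, but a minimal sketch of how DeepSpeed wraps an ordinary PyTorch model might look like the following; the tiny model, config values, and single training step are illustrative assumptions rather than Microsoft’s actual setup, and a real run would be launched on a GPU with the deepspeed CLI.

```python
# A minimal sketch of wrapping a PyTorch model with DeepSpeed for
# distributed training. The stand-in model and config values are
# illustrative assumptions; real runs use the `deepspeed` launcher.
import torch
import deepspeed

model = torch.nn.Linear(1024, 1024)  # stand-in for a much larger model

ds_config = {
    "train_batch_size": 8,
    "fp16": {"enabled": True},          # mixed precision to cut memory use
    "zero_optimization": {"stage": 1},  # ZeRO partitions optimizer state
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
}

# deepspeed.initialize returns an engine that handles distribution,
# mixed precision, and optimizer sharding behind the usual PyTorch API.
engine, optimizer, _, _ = deepspeed.initialize(
    model=model, model_parameters=model.parameters(), config=ds_config
)

# One illustrative training step.
x = torch.randn(8, 1024).to(engine.device).half()
loss = engine(x).pow(2).mean()
engine.backward(loss)   # replaces loss.backward()
engine.step()           # replaces optimizer.step()
```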
SEE ALSO:
The world's fastest supercomputer joins the battle against coronavirus — and it's speeding up the hunt for a cure