A chief scientist at Microsoft says we're less than five years away from computers understanding us perfectly

Dec 4, 2015, 06:41 IST

Microsoft Distinguished Engineer Xuedong Huang (Microsoft)

As frustrated as you might get that voice-controlled tools like Apple Siri and Microsoft Cortana don't always understand you, it used to be a lot worse.

Earlier this year, Google announced it had gotten its speech recognition error rate down to 8%. 

Microsoft Distinguished Engineer and Chief Scientist of Speech Xuedong Huang says that's a vast improvement.

When Microsoft made its first-ever speech recognition technology available alongside Windows 95, a project Huang headed up, the error rate was "almost 100%," he says.

If you chart it out, Huang says, that means that on average, speech recognition has gotten 20% better every single year for the last twenty years. Which means that the end is in sight.
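The article does not say how Huang arrives at the yearly figure, so the following is only a rough sketch of the arithmetic under one assumption: that "better" means the error rate shrinks by a fixed fraction each year. The function names are made up for illustration, and the starting point of 100% stands in for Huang's "almost 100%."

```python
def implied_annual_reduction(start_rate: float, end_rate: float, years: int) -> float:
    """Constant yearly fractional drop that takes start_rate to end_rate.

    Assumes compounding: each year's error rate is the previous year's
    rate multiplied by (1 - reduction).
    """
    return 1.0 - (end_rate / start_rate) ** (1.0 / years)


def rate_after(start_rate: float, annual_reduction: float, years: int) -> float:
    """Error rate after `years` of a constant yearly fractional drop."""
    return start_rate * (1.0 - annual_reduction) ** years


if __name__ == "__main__":
    # Endpoints taken from the article: "almost 100%" in 1995, 8% in 2015.
    yearly = implied_annual_reduction(1.0, 0.08, 20)
    print(f"Implied yearly reduction: {yearly:.1%}")

    # What a steady 20% yearly reduction would leave after 20 years.
    print(f"Error rate after 20 years at 20%/year: {rate_after(1.0, 0.20, 20):.1%}")
```

Read this way, the two endpoints work out to a yearly reduction of roughly 12%, while a steady 20% a year would leave an error rate of about 1%. The 20% figure presumably rests on different endpoints or a different definition of "better" than the one assumed here.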


"In the next four to five years, computers will be as good as humans" at understanding the words that come out of your mouth, Huang says.

But for Huang, Microsoft, and the tech world in general, the end of this road is the beginning of the next phase: building real artificial intelligence. 

"Invisible revolution"

With "total parity" between human and computer understanding on the visible horizon, Huang says, the world of speech science has a firmer foundation on which to work toward giving computers actual artificial intelligence.

"To understand a word is easier than understanding the context," Huang says. 

But with tools like Microsoft Cortana, Google Now, Apple Siri, and Amazon Alexa, we have consumer-facing apps that are slowly but surely getting better at figuring out not only what you said - but also what you meant. It means that you can start to have more complex conversations with your gadgets.


This means that we're on the cusp of an "invisible revolution," Huang says, as speech becomes an accepted and useful interface for computers, and artificial intelligence becomes a reality.

It's something that's been a long time coming for Microsoft. Witness Bill Gates demonstrate the Microsoft MiPad (seriously), a device running prototype voice recognition software created by Huang's team, at the 2001 Consumer Electronics Show:

The MiPad never came to market. But the world of speech technology marches on.

The world of tomorrow

Project Oxford is available to developers everywhere, meaning developers can put the technology in their own apps. 

And just as Microsoft Cortana can listen to your spoken questions and give smart answers, Project Oxford lets developers of consumer apps, business software, and everything in between build technology with which you can hold a decent conversation.


It means the rise of one interface - speech - that can control every kind of device, anywhere in a home. And with Microsoft Project Oxford, and the Microsoft Azure cloud that underpins it, Huang says Microsoft is in a great position to be at the center of the revolution.

"It took us 20 years to reach that goal," Huang says.


Even as Microsoft works on AI, the company has already started to look ahead to what's next, Huang says.

Indeed, he says that the Xbox Kinect sensor, which let people control Xbox video games with voice and motion, was actually born of Microsoft Research's first cracks at building systems that could understand both speech and gestures to discern meaning. 

Eventually, this will just be the new normal, Huang believes. Kids will grow up with these kinds of artificial intelligence systems and take them for granted as standard ways of interacting with technology.

"We are creating a new generation," Huang says.
