AI agents can already beat us at complex board games like Go, and they're becoming more competent in a range of other areas.
Now a London-based AI research lab owned by Google has carried out a study into how to make sure we can pull the plug on self-learning machines when we want to.
DeepMind, acquired by Google for a reported £400 million in 2014, teamed up with scientists at the University of Oxford to find a way to make sure AI agents don't learn to prevent, or seek to prevent, humans from taking control.
The paper, titled "Safely Interruptible Agents" and published on the website of the Machine Intelligence Research Institute (MIRI), was written by Laurent Orseau, a research scientist at Google DeepMind, and Stuart Armstrong at Oxford University's Future of Humanity Institute.
The researchers explain in the paper's abstract that AI agents are "unlikely to behave optimally all the time." They add: "If such an agent is operating in real-time under human supervision, now and then it may be necessary for a human operator to press the big red button to prevent the agent from continuing a harmful sequence of actions - harmful either for the agent or for the environment - and lead the agent into a safer situation."
The researchers, who weren't immediately available for interview, claim to have created a "framework" that allows a "human operator" to repeatedly and safely interrupt an AI, while making sure that the AI doesn't learn how to prevent or induce the interruptions.
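The intuition is easier to see in code than in prose. The snippet below is only a loose sketch of the "big red button" idea described in the abstract, not DeepMind's actual framework; the environment, the agent's policy, the safe fallback policy and the interrupt signal are all hypothetical placeholders.

```python
# Loose sketch of an operator interruption: when the button is pressed,
# the agent's chosen action is overridden by a safe fallback policy.
# env, agent_policy, safe_policy and interrupt are hypothetical placeholders,
# not part of DeepMind's published framework.
def run_episode(env, agent_policy, safe_policy, interrupt):
    state = env.reset()
    done = False
    while not done:
        if interrupt(state):
            # The human operator presses the big red button.
            action = safe_policy(state)
        else:
            # Otherwise the agent follows its own learned policy.
            action = agent_policy(state)
        state, reward, done = env.step(action)
```

Forcing the action is the easy part; the paper's contribution is making sure the learning algorithm behind that agent policy doesn't come to treat the forced detours as something to avoid, or to engineer.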
"Safe interruptibility can be useful to take control of a robot that is misbehaving and may lead to irreversible consequences, or to take it out of a delicate situation, or even to temporarily use it to achieve a task it did not learn to perform or would not normally receive rewards for this," the authors write.
The researchers found that some algorithms, such as "Q-learning" algorithms, are already safely interruptible, while others, like "Sarsa", aren't safely interruptible off the shelf but can be modified relatively easily so that they are (a rough sketch of the difference appears below). "It is unclear if all algorithms can be easily made safely interruptible," the authors admit.

DeepMind's work with the Future of Humanity Institute is interesting; DeepMind wants to "solve intelligence" and create general-purpose AIs, while the Future of Humanity Institute is researching potential threats to our existence. The institute is led by Nick Bostrom, who believes that machines will outsmart humans within the next 100 years and thinks they have the potential to turn against us.
Speaking at Oxford University last May at the annual Silicon Valley comes to Oxford event, Bostrom said: "I personally believe that once human equivalence is reached, it will not be long before machines become superintelligent after that. It might take a long time to get to human level but I think the step from there to superintelligence might be very quick. I think these machines with superintelligence might be extremely powerful, for the same basic reasons that we humans are very powerful relative to other animals on this planet. It's not because our muscles are stronger or our teeth are sharper, it's because our brains are better."
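To return to the technical finding for a moment: the reason Q-learning fares better than Sarsa comes down to the textbook distinction between off-policy and on-policy learning. The sketch below is only a schematic comparison of the two standard update rules, not the modified algorithm in the paper, and the table sizes and learning parameters are made-up placeholders.

```python
import numpy as np

# Illustrative placeholders, not values from the paper.
n_states, n_actions = 10, 4
alpha, gamma = 0.1, 0.99
Q = np.zeros((n_states, n_actions))

def q_learning_update(Q, s, a, r, s_next):
    # Off-policy: the target uses the best available next action,
    # regardless of what the (possibly interrupted) agent actually does.
    target = r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (target - Q[s, a])

def sarsa_update(Q, s, a, r, s_next, a_next):
    # On-policy: the target uses the action actually taken next. If that
    # action was forced by an operator's interruption, the interruptions
    # leak into the value estimates, which is why off-the-shelf Sarsa
    # needs modification before it is safely interruptible.
    target = r + gamma * Q[s_next, a_next]
    Q[s, a] += alpha * (target - Q[s, a])

# Example updates on a single hypothetical transition.
q_learning_update(Q, s=0, a=1, r=1.0, s_next=2)
sarsa_update(Q, s=0, a=1, r=1.0, s_next=2, a_next=3)
```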
DeepMind knows the technology it's creating has the potential to cause harm. The founders - Demis Hassabis, Mustafa Suleyman, and Shane Legg - allowed their company to be acquired by Google on the condition that the search giant created an AI ethics board to monitor advances that Google makes in the field. Who sits on this board, and what they do exactly, remains a mystery.
The founders have also attended and spoken at several conferences about ethics in AI, highlighting that they want to ensure the technology they and others are developing is used for good, not evil. It's likely that they will look to incorporate some of the findings from the "Safely Interruptible Agents" paper into their work going forward.