
Google engineers have built a photo recognition system that can outperform the most well-travelled humans

Feb 26, 2016, 16:18 IST


A pair of Google employees have built a system called PlaNet that attempts to pinpoint where a photograph was taken by analysing the pixels it contains.


Humans typically struggle to determine where generic photos were taken just by looking at them. If shown a picture of a white sandy beach, for example, they might assume it was taken in the Caribbean when in fact it was taken in the Maldives.

While many humans need a landmark to refer to - such as the Statue of Liberty or Machu Picchu - before they can pinpoint a location, Google's PlaNet system, which is still in its early stages, does not have this problem.

Tobias Weyand and James Philbin, a pair of software engineers at Google, teamed up with developer Ilya Kostrikov to build the PlaNet system. "We think PlaNet has an advantage over humans because it has seen many more places than any human can ever visit and has learnt subtle cues of different scenes that are even hard for a well-travelled human to distinguish," Weyand told MIT Technology Review.

Weyand's team divided the world into a grid made up of 26,000 squares of varying size, depending on the number of images taken in that location. Each square represented a specific geographical area.
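The adaptive sizing described above can be illustrated with a toy recursive subdivision: start from one large cell and split any cell containing more than a threshold number of photos, so photo-dense regions end up covered by many small squares and sparse regions by a few large ones. This is only a sketch of the idea, not Google's actual partitioning code.

```python
# Toy sketch of adaptive geographic partitioning: subdivide any cell that
# holds more than MAX_PHOTOS geotagged photos. Dense areas get small cells,
# sparse areas get large ones. Illustrative only - PlaNet's real
# partitioning is more sophisticated.

MAX_PHOTOS = 2  # tiny split threshold, just for demonstration

def partition(photos, lat_min=-90.0, lat_max=90.0,
              lon_min=-180.0, lon_max=180.0, depth=0, max_depth=10):
    """Return a list of (bounds, photos) leaf cells covering the globe."""
    inside = [(lat, lon) for lat, lon in photos
              if lat_min <= lat < lat_max and lon_min <= lon < lon_max]
    # Stop splitting when the cell is sparse enough (or we hit max depth).
    if len(inside) <= MAX_PHOTOS or depth == max_depth:
        return [((lat_min, lat_max, lon_min, lon_max), inside)]
    lat_mid = (lat_min + lat_max) / 2
    lon_mid = (lon_min + lon_max) / 2
    cells = []
    for lo_lat, hi_lat in ((lat_min, lat_mid), (lat_mid, lat_max)):
        for lo_lon, hi_lon in ((lon_min, lon_mid), (lon_mid, lon_max)):
            cells += partition(inside, lo_lat, hi_lat, lo_lon, hi_lon,
                               depth + 1, max_depth)
    return cells

# A dense cluster of photos in one city plus a few scattered ones:
photos = [(48.85, 2.35), (48.86, 2.34), (48.84, 2.36),  # dense cluster
          (35.68, 139.69), (-33.87, 151.21)]            # isolated photos
cells = partition(photos)
```

Running this, the three clustered photos end up in a far smaller cell than each isolated photo, which is the property the varying square sizes give PlaNet.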


The team then created a database of geolocated images from the internet to determine the grid square in which each image was taken. Overall, 126 million images were used.

Weyand and his team took 91 million of these images to teach a powerful neural network - a computer system modelled on the human brain - to work out the grid location using only the image itself. Ultimately, they want to be able to feed an image into the neural network and get out a particular grid location, or at least a set of likely candidates. The network was then validated with the remaining 34 million photos in the data set.
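In other words, PlaNet turns geolocation into a classification problem: each geotagged image's coordinates become a discrete class label - the index of the grid square containing them - and the labelled images are split into a training set and a validation set. A minimal sketch of that framing, using a fixed coarse grid and placeholder features standing in for real images (PlaNet's squares vary in size, as described above):

```python
# Sketch: geolocation as classification. Each (lat, lon) pair is mapped
# to an integer grid-cell class label, then the labelled data is split
# into training and validation sets, mirroring the 91M / 34M split in
# the article. Features here are random placeholders, not real images.

import random

CELL_DEG = 10  # toy fixed cell size; PlaNet's cells vary in size

def cell_id(lat, lon, cell_deg=CELL_DEG):
    """Map a coordinate to an integer grid-cell class label."""
    row = int((lat + 90) // cell_deg)    # 0..17 for a 10-degree grid
    col = int((lon + 180) // cell_deg)   # 0..35 for a 10-degree grid
    return row * (360 // cell_deg) + col

# Fake dataset of (image_features, lat, lon) records.
random.seed(0)
dataset = [([random.random() for _ in range(4)],
            random.uniform(-90, 89.9), random.uniform(-180, 179.9))
           for _ in range(1000)]

# Label each image with its cell, then split roughly 72% / 28%,
# the same proportion as 91 million to 34 million.
labelled = [(feats, cell_id(lat, lon)) for feats, lat, lon in dataset]
random.shuffle(labelled)
split = int(0.72 * len(labelled))
train, validation = labelled[:split], labelled[split:]
```

A classifier trained on `train` would then be scored on `validation` by how often it predicts the right cell, which is how the held-out 34 million photos were used.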

In order to test PlaNet, the Google team took 2.3 million geotagged images from online photo library Flickr and asked PlaNet to identify their location.

"PlaNet is able to localise 3.6% of the images at street-level accuracy and 10.1% at city-level accuracy," Weyand's team wrote in their academic paper.

The results weren't perfect, but PlaNet still outperformed some of the most well-travelled humans on a Google Street View test.


PlaNet typically guessed where a photo was taken to within 1,131.7km, while 10 well-travelled humans typically only managed to guess to within 2,320.75km.

"In total, PlaNet won 28 of the 50 rounds with a median localisation error of 1131.7 km, while the median human localisation error was 2320.75 km," Weyand's team wrote. "[This] small-scale experiment shows that PlaNet reaches superhuman performance at the task of geolocating Street View scenes."
