
How Facebook code infiltrated the Fortune 50

Nov 6, 2015, 01:16 IST

MOUNTAIN VIEW, CA - DECEMBER 12: Mark Zuckerberg is a presenter at the 2014 Breakthrough Prizes Awarded in Fundamental Physics and Life Sciences Ceremony at NASA Ames Research Center on December 12, 2013 in Mountain View, California. (Photo by Steve Jennings/Getty Images for MerchantCantos)

When you're running at Facebook's scale, you're going to run into problems that no other tech company has ever encountered before.

Which means that it falls on Facebook itself to build the tools it needs to handle the massive amounts of data it has to crunch every day.

Enter Facebook Presto, a data-crunching tool built in-house at the social network.

When Presto was first revealed in 2013, Facebook's analysts and engineers were using it to ask questions of the company's data warehouse, then 300 petabytes in size, and get answers fast.

Released by Facebook as open-source code, the technology has spread beyond the social network's confines and into major organizations such as Netflix and NASDAQ, which value the tool's flexibility when dealing with mountains of data. Its rapid adoption highlights Facebook's growing influence and ability to shape the cutting-edge technology that powers today's internet economy.

More than 90 outside developers have volunteered their time to improve Presto over the last two years, bolstering Facebook's in-house efforts, according to a blog post released today.

Presto, change-o

The magic of Presto is that it presents a massively more efficient way to deal with data at large scales, says Jay Tang, who leads Facebook's "interactive analytics infrastructure."

Hot open source technologies like Apache Hadoop and Apache Hive sparked the so-called "big data" revolution, giving companies a vastly more efficient way to process large quantities of information.

Facebook uses both of those technologies, Tang told Business Insider. But the problem is that Hadoop and Hive are optimized for reliability - not speed.

"Running a query," the technical term for asking a question of a database, isn't impossible on Hive, but it often requires copying the data elsewhere and processing it to make it more digestible by data experts.

Given how much emphasis Facebook puts on "moving fast," it really harshes the vibe when engineers can only run a few queries a day on their data.

"Presto is trying to solve a very specific problem," says Tang.

But Tang emphasizes Presto's "very unique architecture," which brings the mountain to Mohammed, so to speak.

Rather than shuffling the data around, Presto can read data stored in Hadoop, Hive, and other databases right where it sits. Nothing has to be copied first; Presto reads the data in place, letting researchers use the SQL query language they already know.

"Presto gives you the ability to query data wherever it lives," Tang says.
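To make the idea of querying data "wherever it lives" concrete, here is a minimal sketch. It does not use Presto. As a stand-in, it uses Python's built-in sqlite3 module and two separate in-memory databases to show one SQL statement joining data held in two stores without copying either one first. The table names, column names, and sample rows are all hypothetical.

```python
import sqlite3


def query_in_place():
    """Join two separate data stores with one SQL query (illustration only)."""
    # The default "main" database stands in for one data store.
    conn = sqlite3.connect(":memory:")
    # A second, independent in-memory database stands in for another store.
    conn.execute("ATTACH DATABASE ':memory:' AS events_store")

    # Hypothetical sample data in the first store.
    conn.execute("CREATE TABLE main.users (id INTEGER, name TEXT)")
    conn.executemany(
        "INSERT INTO main.users VALUES (?, ?)",
        [(1, "ana"), (2, "raj")],
    )

    # Hypothetical sample data in the second store.
    conn.execute("CREATE TABLE events_store.clicks (user_id INTEGER, n INTEGER)")
    conn.executemany(
        "INSERT INTO events_store.clicks VALUES (?, ?)",
        [(1, 3), (1, 4), (2, 5)],
    )

    # One query spans both stores; nothing is exported or reloaded beforehand.
    rows = conn.execute(
        """
        SELECT u.name, SUM(c.n) AS total_clicks
        FROM main.users AS u
        JOIN events_store.clicks AS c ON c.user_id = u.id
        GROUP BY u.name
        ORDER BY u.name
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(query_in_place())
```

The analyst writes ordinary SQL and names the store each table belongs to, while the engine works out how to reach the data. Presto applies the same principle across far larger, distributed systems.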

Beyond Silicon Valley

When Facebook first released Presto, Tang says, its main appeal was to those few developers on the bleeding edge.

But thanks in large part to the mobile revolution, companies of all sizes are dealing with ever-growing sets of data, and are starting to run into the same problems that Facebook solved years ago.

"A lot of companies are facing the same problem," Tang says.

For example, Airbnb has turned to Presto to build Airpal, a tool that gives employees quick access to data. Gree, a Japanese social gaming giant, uses Presto because it integrates more smoothly with the Hadoop and other data center infrastructure it already has in place.

And NASDAQ and Netflix have combined Presto with Amazon Web Services to use their cloud infrastructure more efficiently.

CEO of Netflix Reed Hastings attends the Allen & Co. annual conference at the Sun Valley Resort on July 11, 2013 in Sun Valley, Idaho. The resort is hosting corporate leaders for the 31st annual Allen & Co. media and technology conference where some of the wealthiest and most powerful executives in media, finance, politics and tech gather for weeklong meetings. Past attendees included Warren Buffett, Bill Gates and Mark Zuckerberg. (Photo by Kevork Djansezian/Getty Images)

Tang promises that really large "Fortune 50" companies are using Presto, too, but they're gun-shy about sharing the details.

But companies like Teradata and MicroStrategy recently announced support for Presto in their commercial data software offerings, building out the features that can make it more appealing to the largest enterprises.

Crucially, Tang says, they contribute back the data connectors that they develop for Presto under that open source model, improving the core project and furthering its overall usefulness. Thanks to the Presto community's efforts, it now has a "set of rich connectors," Tang says.

"You definitely need a vibrant, open community," Tang says.
