This former Netflix and Amazon engineer's startup is releasing a free tool to help developers purposely and automatically break their software

Rosalie Chan   


Gremlin

Gremlin cofounders Kolton Andrus, CEO, and Matthew Fornaciari, CTO

  • On Wednesday, the startup Gremlin launched a free tool that allows engineers to purposely and automatically break their software and cloud infrastructure - which it champions as the best way to test for vulnerabilities and weak points.
  • Gremlin is a major proponent of Chaos Engineering, a philosophy that was pioneered at Netflix to help it make sure that its systems could stand up to the rigors of millions of binge-watching viewers.
  • Gremlin CEO and cofounder Kolton Andrus explains how he was inspired by what he learned at Netflix to bring Chaos Engineering to more software teams, everywhere.

Netflix is famous in software circles for its philosophy of Chaos Engineering - a principle that holds that the best way to test the durability of cloud infrastructure is to purposely try to destroy it and see how it holds up.

To that end, Netflix created Chaos Monkey, a tool that unleashes an army of virtual monkeys into the cloud, shutting down bits and pieces of its cloud architecture at random. The idea was that if Netflix's cloud could withstand this simulated simian assault, it could hold up to even the most relentless crush of binge-watching users. Since then, Chaos Engineering has caught on, and Netflix's growing "Simian Army" has become popular with developers.

Read more: Netflix Unveils Its Monkey Army

Kolton Andrus, the CEO and co-founder of startup Gremlin, witnessed the power of Chaos Engineering for himself during his time at Netflix and Amazon. Now, he wants to bring it to more users: Gremlin on Tuesday announced a free version of its namesake service, giving users an easy way to try what it calls "Chaos Monkey-as-a-service."

Gremlin has raised $26.8 million since its founding in 2016. While this is Gremlin's first free offering, it counts Expedia, Twilio, and Walmart among customers of its paid, premium product.

Andrus says that for Gremlin's next act, he wants to remove the barriers that keep companies from trying Chaos Engineering for themselves. Customers benefit from having a good sense of their software's weaknesses, he says, but don't always want to make the investment. A free version of Gremlin is meant to remove those barriers.

"[Companies] know there are things they need to get better at," Andrus said. "This tool helps them understand these weaknesses, but because there's an issue of time and money, and customers say, 'I just don't have time for it.' If you unlock the ability for engineers to get their hands dirty, they can get right to it."

Chaos engineering

Chaos Engineering works similarly to vaccination. Just as a vaccine introduces a weakened virus into the body to strengthen the immune system, chaos engineers inject attacks into a computer system to improve its response to major incidents. Things always go wrong, Andrus says, and forewarned is forearmed.

"Our systems are complex and they're going to surprise us," Andrus said. "We'd rather be surprised during the day while the caffeine has kicked in than being surprised in the middle of the night."

Proactively finding out about vulnerabilities can prevent expensive system downtime, which Gartner estimates can cost $5,600 per minute. Andrus says that he hears from customers that Chaos Engineering isn't right for them, but that they want to find some way to improve their system's resilience.

"To me, that's like saying, I want to go to the gym, but I want to lose 10 pounds before going to the gym," Andrus said. "Their order of operation is backwards."

Learning from Netflix and Amazon

Andrus says he "selfishly" thinks that Gremlin's version of Chaos Monkey is better.

"The Gremlin product is a combination of what we learned at Amazon and what we learned at Netflix," Andrus said. "We know people are using Chaos Monkey and that's where they begin, but let's give them a better version for free."

That said, Andrus learned important concepts at both companies that have influenced Gremlin's product decisions. At Amazon, he learned to build a self-service tool with a user interface good enough that developers wanted to use it, even though it wasn't mandatory.

"We didn't tell the engineers at Amazon they had to use it," Andrus said. "We just built it and made it available."

At Netflix, Andrus learned the concept of the "blast radius," which is how far an experimental attack should go without impacting customers. This idea helps engineers contain these attacks, and make sure the monkeys don't go too far out of control.
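
To make the "blast radius" idea concrete, here is a minimal sketch of a simulated experiment that shuts down instances at random but caps how many it may touch. It is not Gremlin's or Netflix's implementation: the function names, the `blast_radius` fraction, and the `min_healthy` threshold are all hypothetical, and nothing here talks to a real cloud.

```python
import math
import random


def pick_targets(instances, blast_radius, rng):
    """Randomly choose instances to shut down, capped by blast_radius.

    blast_radius is a hypothetical fraction (0.0 to 1.0) of the fleet
    that the experiment is allowed to affect.
    """
    if not 0.0 <= blast_radius <= 1.0:
        raise ValueError("blast_radius must be between 0.0 and 1.0")
    # Round down so the experiment never exceeds the agreed limit.
    limit = math.floor(len(instances) * blast_radius)
    return rng.sample(instances, limit)


def run_experiment(instances, blast_radius, min_healthy, seed=None):
    """Simulate one chaos experiment on a list of instance names.

    Returns which instances were "terminated", which survived, and whether
    enough survived to keep serving customers (min_healthy is an assumed
    threshold, chosen by the team running the experiment).
    """
    rng = random.Random(seed)
    terminated = pick_targets(instances, blast_radius, rng)
    survivors = [name for name in instances if name not in terminated]
    return {
        "terminated": terminated,
        "survivors": survivors,
        "passed": len(survivors) >= min_healthy,
    }


if __name__ == "__main__":
    fleet = [f"web-{n}" for n in range(10)]
    result = run_experiment(fleet, blast_radius=0.2, min_healthy=8, seed=1)
    print("terminated:", result["terminated"])
    print("passed:", result["passed"])
```

With a fleet of 10 and a blast radius of 0.2, the sketch shuts down at most two instances, so the experiment stays contained even though the targets are picked at random.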

"Everything in Gremlin was taken from learnings of building the same systems for Netflix or Amazon before," Andrus said. "We're doing it in a way that will work for any company. My tagline there is, if one of the things we learned is we want engineers to do the right thing, we need to make it easy."
