
AI models from OpenAI and other tech giants are being bombarded by a new swarm of bots 'extracting intelligence'

Sep 21, 2023, 22:18 IST
Business Insider
Vercel CEO Guillermo Rauch. Vivan Cromwell/Zeit
  • Vercel CEO Guillermo Rauch spotted a new breed of bot recently.
  • These bots scrape information from AI models, including OpenAI's GPT-4.

Powerful AI models, such as OpenAI's GPT-4, are being bombarded by digital bots that are "extracting intelligence" in new and nefarious ways.

The phenomenon was spotted recently by Guillermo Rauch, CEO of Vercel, a startup that helps developers build websites that integrate with many of the biggest AI models.

He discussed this new breed of bot on the No Priors podcast with venture capitalists Elad Gil and Sarah Guo.

"It's almost like, extracting intelligence," Rauch said. "Let's call it web scraper 2.0. I run a bot that tries to get free GPT-4 basically."

It's a huge problem, he added, so I called him up to delve deeper.


'Threat of model distillation'

The generative AI boom has sparked unprecedented demand for quality data. AI models need this content for training. Without it, the technology just isn't as good. And there's not enough to go round.

Rauch says this is one driver of these new bots. If you can cleverly scrape the outputs of GPT-4, Llama 2 and other powerful AI models, then you could use that as fresh training data for your own model, he explained.

"There's a threat of model distillation," he said. "AI models can, in theory, share everything they know. It's plausible that you can train another model based on 100,000 high quality outputs from GPT-4, for example."

Indeed, several of the top AI companies, including OpenAI, Google and Anthropic, ban the use of their outputs for training other models.
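Distillation, in the sense Rauch describes, means harvesting a large "teacher" model's outputs as training data for a smaller "student" model. A minimal sketch of the data-collection step, using a placeholder function rather than any vendor's actual API (most providers' terms forbid training competing models on their outputs):

```python
import json

# Hypothetical stand-in for a large "teacher" model's completion call.
# A real scraper would hit a vendor's API; this placeholder just echoes.
def teacher_complete(prompt: str) -> str:
    return f"answer to: {prompt}"  # illustrative output only

prompts = ["What is model distillation?", "Explain API rate limiting."]

# Collect (prompt, completion) pairs -- the raw material for training
# a "student" model to imitate the teacher's behavior.
dataset = [{"prompt": p, "completion": teacher_complete(p)} for p in prompts]

# Serialize as JSONL, a common fine-tuning data format.
jsonl = "\n".join(json.dumps(row) for row in dataset)
print(jsonl)
```

At scale — the "100,000 high quality outputs" Rauch mentions — a file like this becomes a fine-tuning corpus, which is exactly why providers prohibit it.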

A surprise $35,000 OpenAI bill

Another reason: It's increasingly expensive to use the top-performing models. OpenAI and other tech companies impose rate limits, so even paying users can ask only a limited number of questions per minute or per day.


Instead of abiding by these rules, bad actors are creating bots that bombard models with questions and leave someone else paying the bill for all the answers. This is often done by infiltrating applications that have official accounts and API connections with the largest and most powerful AI models, Rauch explained.

"A lot of people are writing bots that try to go after web applications that rely on AI," he said. "These are essentially proxies to pull out this information, sometimes on behalf of users who are not paying to access the models."

One developer Rauch knows was a victim of this type of attack. She has an application for data scientists that queries a major large language model. Bots attacked it, essentially using her app as a proxy to access the AI model.

"She ran up a $35,000 OpenAI bill in a very short time," Rauch said. "She spent months trying to explain that this wasn't her usage. Eventually OpenAI refunded her." OpenAI didn't respond to a request for comment.

Evading China's AI model blockade

A third reason for this new phenomenon: China blocked access to ChatGPT, GPT-4 and many of the other top generative AI models. Creating a bot that secretly collects all the best outputs is one way to get around that country's censorship, Rauch explained.


Hundreds of thousands of AI applications are being deployed on Vercel's platform each month, so there are plenty of targets for these new bots.

Vercel offers technology to help developers protect against these attacks.

SaaS businesses at risk

Rauch also sees SaaS businesses being challenged by this phenomenon. These types of companies often sell per-seat subscriptions that cost maybe $5 or $10 a month for unlimited use.

New AI versions of SaaS services that query large AI models could be attacked by bots and end up paying for outputs that their real customers aren't getting, he explained.

"Your SaaS business might find itself upside down and losing money," Rauch said. "So there will be more usage-based charging. A platform fee, a seat and a per-token charge or a per-query charge."


Vercel already integrates rate limits for developers, he noted. So an app could offer a seat where the user can only query AI models a certain number of times per day.

"That stops attacks by outside bots that will do massive numbers of requests to steal intelligence," Rauch said.
