Reddit to AI companies: Pay up if you're using our content
- Reddit said it plans to start charging companies for access to its data that is used to train AI.
- Major players in the AI race like OpenAI, Microsoft, and Google train AI models using Reddit's data, NYT reported.
Reddit knows its data is valuable in the AI race — and now it plans to charge companies for access to it.
"We are introducing a new premium access point for third parties who require additional capabilities, higher usage limits, and broader usage rights," Reddit announced on its blog.
A Reddit spokesperson told Insider that as the company "expands globally, we are working to create a more sustainable, healthy ecosystem around data."
The spokesperson said Reddit is currently working on finalizing costs for access to its API, or application programming interface — the way two software programs communicate with each other.
"The Reddit corpus of data is really valuable," Steve Huffman, cofounder and CEO of Reddit, told The Times. "But we don't need to give all of that value to some of the largest companies in the world for free."
Companies such as OpenAI, Microsoft, and Google, which are all developing generative AI models, have used their access to Reddit's API to train their LLMs, or large language models, including ChatGPT, The New York Times reported.
OpenAI, Microsoft, and Google did not immediately respond to Insider's request for comment ahead of publication.
Huffman told The Times that Reddit's data is constantly updated, which makes it valuable for helping models give better and more relevant answers.
"More than any other place on the internet, Reddit is a home for authentic conversation," Huffman said. "There's a lot of stuff on the site that you'd only ever say in therapy, or AA, or never at all."
The company said its "data API will still be open for reasonable and appropriate use cases and accessible" on its developer platform. Huffman told The Times that Reddit's API will remain free for developers building applications that help people use Reddit. Researchers using Reddit's data for study or other noncommercial purposes will also have free access, The Times reported.
Most developers and third parties who use Reddit's API have been notified by email, the company said.
"Crawling Reddit, generating value and not returning any of that value to our users is something we have a problem with," Huffman told The Times. "It's a good time for us to tighten things up."