- A lawsuit claims Google took people's data without their knowledge or consent to train its AI products.
- The lawsuit accuses Google of "secretly stealing everything ever created and shared on the internet."
A new lawsuit claims that Google has been "secretly stealing everything ever created and shared on the internet by hundreds of millions of Americans" to train its generative AI products like its chatbot Bard.
The proposed class-action lawsuit, filed by Clarkson Law Firm in the US District Court for the Northern District of California on Tuesday, accused Google, AI sister company DeepMind, and parent company Alphabet of taking people's data without their knowledge or consent.
"Google has taken all our personal and professional information, our creative and copywritten works, our photographs, and even our emails — virtually the entirety of our digital footprint" to build its AI products, the lawsuit claims.
"For years, Google harvested this data in secret, without notice or consent from anyone."
This includes data taken from subscription-based websites and from websites known for pirated collections of books and creative works, the lawsuit alleges.
The complaint also refers to a July 1 update to Google's privacy policy, which says that the company may collect information that's "publicly available online" to train its AI models and build products like Google Translate, Bard, and Cloud AI capabilities.
"Google must understand, once and for all: it does not own the internet, it does not own our creative works, it does not own our expressions of our personhood, pictures of our families and children, or anything else simply because we share it online," the lawsuit says. "'Publicly available' has never meant free to use for any purpose."
Google did not immediately respond to Insider's request for comment on the suit, but in a statement given to Reuters, the company called the claims "baseless."
Google general counsel Halimah DeLaine Prado told Insider in a statement that the company had been "clear for years" that it used data from public sources, such as information published to the open web and public datasets, to train the AI models behind services like Google Translate, "responsibly and in line with our AI Principles."
"American law supports using public information to create new beneficial uses, and we look forward to refuting these baseless claims," DeLaine Prado continued.
The lawsuit was filed around two weeks after Clarkson Law Firm lodged a similar complaint against OpenAI, alleging that the company stole "massive amounts of personal data" and used it to train ChatGPT, including medical records and information about children.
In both lawsuits, the plaintiffs were identified only by their initials, occupations, states, and internet usage, a step their lawyers said was taken to "avoid intrusive scrutiny as well as any potentially dangerous backlash."
One of the plaintiffs in the Google lawsuit, identified with the initials "J.L." and described as a New York Times best-selling author and investigative journalist living in Texas, claimed that Google had used a stolen PDF of her book to train Bard.
The lawsuit claims that her work is now widely available for free on Bard, with the bot giving chapter summaries of the book and even sharing extracts verbatim.