The Lawsuits Begin...
A storm of legal battles is brewing as comedian and author Sarah Silverman, along with authors Christopher Golden and Richard Kadrey, sues tech giants OpenAI and Meta. The claim? Copyright infringement.
The triad of authors has filed separate suits in a US District Court against OpenAI and Meta, accusing both companies of unlawfully using their copyrighted works for training artificial intelligence models, specifically ChatGPT by OpenAI and LLaMA by Meta.
The legal complaints specify that the defendants leveraged datasets gathered from "shadow library" websites, including Bibliotik, Library Genesis, and Z-Library, among others, for training their respective AI models. According to the authors, their works were part of these datasets and were obtained without their consent.
Silverman, Golden, and Kadrey have so far offered no comment to the media. In their suit against OpenAI, however, the authors present evidence that, when prompted, ChatGPT can summarize their books. They argue this constitutes a clear violation of their copyrights. Works named in the suit include Silverman's "The Bedwetter," Golden's "Ararat," and Kadrey's "Sandman Slim."
The second lawsuit, aimed at Meta, claims that the authors' copyrighted works were included in datasets used to train Meta's LLaMA models, a set of four open-source AI models that the company unveiled in February. The plaintiffs allege that these datasets originated from illegal sources. One source, named The Pile, was created by EleutherAI and has been linked to the aforementioned "shadow libraries."
The authors maintain that they did not grant permission for their works to be used as training material for these AI models. Each lawsuit contains six counts, covering various types of copyright violation, negligence, unjust enrichment, and unfair competition. The authors are seeking statutory damages, restitution of profits, and more.
Joseph Saveri and Matthew Butterick, the lawyers representing the authors, say they have heard from many writers, authors, and publishers concerned about ChatGPT's ability to generate text eerily similar to copyrighted material.
In addition to representing these authors, Saveri has initiated legal proceedings against AI companies on behalf of programmers and artists. Furthermore, Getty Images has filed a similar lawsuit against Stability AI for allegedly training its AI image generation tool, Stable Diffusion, on copyrighted images. Saveri and Butterick are also working with authors Mona Awad and Paul Tremblay on a similar case.
These legal battles are more than a mere inconvenience for OpenAI, Meta, and other AI companies. They test the boundaries of copyright law. As we've previously discussed on The Vergecast, we can expect to see a growing number of lawsuits over these issues for years to come. AI technology is pushing us into uncharted legal waters, and the wave of litigation is only just beginning.