Anthropic, the Amazon-backed artificial intelligence startup was hit Monday with a class-action lawsuit in California federal court over alleged copyright infringement. Three authors said in the deposition that Anthropic “built a multi-billion dollar business by stealing hundreds of thousands of copyrighted books,” including their own.
Founded by former OpenAI research executives, Anthropic has backers including; Google and Salesforce.
Authors Andrea Bartz, Charles Graeber and Kirk Wallace Johnson alleged in the lawsuit that “a key component of Anthropic’s business model — and the flagship of the ‘Claude’ family of large language models (or ‘LLMs’) — is the grand theft of protected works from copyright. ” later alleged that “Anthropic downloaded known pirated versions of Plaintiffs’ works, made copies of them, and fed those pirated copies into its models.”
The lawsuit follows Anthropic’s debut in June of its most powerful AI model to date, the Claude 3.5 Sonnet. Claude is one of the chatbots that, like OpenAI’s ChatGPT and GoogleGemini, has exploded in popularity over the past year.
“Copyright law prohibits what Anthropic has done here: downloading and copying hundreds of thousands of copyrighted books obtained from pirated and illegal websites,” the lawsuit states.
Anthropic did not immediately respond to a request for comment.
This week’s case also follows another lawsuit against Anthropic last October, in which Universal Music sued the startup for “systematic and widespread infringement of their copyrighted song lyrics,” according to a federal court filing in Tennessee. Other music publishers, including Concord and ABKCO, were also named as plaintiffs.
An example from Universal Music’s lawsuit: When a user asked Anthropic’s AI chatbot, Claude, about the lyrics to Katy Perry’s “Roar,” it produced a “virtually identical copy of those lyrics,” infringing Concord’s rights , the copyright holder, according to the filing. The lawsuit also names Gloria Gaynor’s “I Will Survive” as an example of Anthropic’s alleged copyright infringement, as Universal owns the rights to its lyrics.
“In the process of building and operating artificial intelligence models, Anthropic illegally copies and distributes vast amounts of copyrighted works,” the lawsuit said, later adding, “Just like the developers of other technologies before it, from the printing press to the copy machine in the web crawler, AI companies must follow the law.”
As the news industry struggles to maintain sufficient advertising and subscription revenue to pay for its costly newsgathering operations, many news publications and media outlets are aggressively trying to protect their businesses as AI-generated content becomes more widespread.
The Center for Investigative Reporting, the nation’s oldest nonprofit newsroom, sued OpenAI and lead backer Microsoft sued in federal court in June for alleged copyright infringement, following similar lawsuits from publications including The New York Times, The Chicago Tribune and The New York Daily News.
In December, The New York Times filed a lawsuit against Microsoft and OpenAI, alleging copyright violations related to their journalistic content appearing in ChatGPT training data. The Times said it is seeking to hold Microsoft and OpenAI accountable for “billions of dollars in statutory and actual damages” related to the “unlawful copying and use of the Times’ unique and valuable works,” according to a filing in U.S. District Court for the South. New York Region. OpenAI disagreed with the Times’ characterization of the facts.
The Chicago Tribune, along with seven other papers, followed with a suit in April.
Outside the news, a group of prominent US authors, including Jonathan Franzen, John Grisham, George RR Martin and Jodi Picoult, sued OpenAI last year, alleging copyright infringement when using their work to train ChatGPT.
But not all news organizations are preparing for a battle, and some are joining forces with AI startups.
On Tuesday, OpenAI has announced a partnership with Condé Nast, in which ChatGPT and SearchGPT will display content from Vogue, The New Yorker, Condé Nast Traveler, GQ, Architectural Digest, Vanity Fair, Wired, Bon Appétit and other outlets.
In July, Perplexity AI unveiled a revenue-sharing model for publishers after more than a month of accusations of plagiarism. Media and content platforms including Fortune, Time, Entrepreneur, The Texas Tribune, Der Spiegel and WordPress.com were the first to join the company’s “Publisher Program.”
OpenAI and Time magazine announced a “multi-year content agreement” in June that will allow OpenAI to access current and archived articles from more than 100 years of Time’s history. OpenAI will be able to display Time content in ChatGPT’s chatbot in response to user questions, according to press releaseand to use Time’s content “to improve its products” or, more likely, to train its AI models.
OpenAI announced a similar partnership in May with News Corp., allowing OpenAI to access current and archived articles from the Wall Street Journal, MarketWatch, Barron’s, New York Post and other publications. Reddit also announced in May that it would partner with OpenAI, allowing the company to train its AI models on Reddit content.