In a significant development for AI developers, on June 23, 2025, U.S. District Judge William Alsup ruled in Bartz v. Anthropic PBC that training AI models on copyrighted books can qualify as fair use under U.S. copyright law when the use is sufficiently transformative. The ruling provides a measure of legal clarity for developers and data processors alike, though it draws a sharp boundary around the lawful acquisition and storage of training data.

Key Holding: AI Training is “Exceedingly Transformative”

Plaintiffs Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson alleged Anthropic unlawfully used pirated versions of their books to train its Claude AI model. Judge Alsup held that the act of training the model was not intended to replace the plaintiffs’ creative works, but rather to produce outputs that are “new, different, and non-infringing.” The court analogized AI training to a human writer studying prior works for inspiration—emphasizing that no substantial part of any plaintiff’s book was reproduced verbatim by Claude in downstream applications.

The court thus ruled that the AI training process itself qualified as fair use. For AI developers, this decision affirms that the transformation and purpose of the training process—rather than the commercial nature of the model—are critical in the fair use analysis.

Storage of Pirated Books Remains Copyright Infringement

While training use may be protected, the court also found that Anthropic’s centralized storage of more than 7 million allegedly pirated books was unlawful. According to Judge Alsup, storing entire copyrighted works in a way “unconnected to the direct purpose of training” constitutes infringement, regardless of whether those works were later used in a transformative manner. A trial is set for December 2025 to determine damages, which under the Copyright Act can reach $150,000 per infringed work where the infringement is found to be willful.

Implications for Developers, Distributors, and Telecom-Adjacent AI Platforms

This ruling has substantial implications for platform operators and service providers in the telecommunications, cloud AI, and analytics sectors. AI developers should take care to:

  • Ensure that training data is sourced lawfully, preferably under license or from public domain sources;
  • Keep thorough documentation distinguishing datasets used strictly for training from those retained for other business purposes;
  • Establish rigorous data lifecycle protocols to avoid passive storage of infringing content;
  • Consider the inclusion of indemnity clauses and warranties in vendor or licensing agreements involving third-party data.
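For engineering teams, the documentation and lifecycle points above could be captured in a simple dataset inventory. The following is a minimal sketch only, not a compliance tool: the `DatasetRecord` fields, the `LAWFUL_SOURCES` labels, and the `flag_for_review` rules are hypothetical names and assumptions chosen for illustration, and any real policy would need to be defined with counsel.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple

# Hypothetical source labels for illustration; these are not legal terms of art.
LAWFUL_SOURCES = {"licensed", "public_domain", "purchased"}


@dataclass
class DatasetRecord:
    name: str
    source: str                          # e.g. "licensed", "public_domain", "unknown"
    purpose: str                         # e.g. "training", "analytics", "archive"
    acquired: date
    retain_until: Optional[date] = None  # None means no retention limit was recorded


def flag_for_review(records: List[DatasetRecord], today: date) -> List[Tuple[str, str]]:
    """Return (dataset name, reason) pairs for records that may need legal review."""
    flags = []
    for r in records:
        # Assumption: anything without a documented lawful source is escalated.
        if r.source not in LAWFUL_SOURCES:
            flags.append((r.name, "source not documented as lawful"))
        # Assumption: copies kept for non-training purposes need a retention limit.
        if r.purpose != "training" and r.retain_until is None:
            flags.append((r.name, "non-training copy with no retention limit"))
        # Copies held past their recorded retention date are flagged for deletion review.
        if r.retain_until is not None and r.retain_until < today:
            flags.append((r.name, "retention period expired"))
    return flags


# Example inventory with made-up dataset names.
records = [
    DatasetRecord("licensed-news", "licensed", "training", date(2024, 1, 10), date(2026, 1, 10)),
    DatasetRecord("scraped-books", "unknown", "archive", date(2023, 5, 2)),
]

for name, reason in flag_for_review(records, date(2025, 7, 1)):
    print(f"{name}: {reason}")
```

Keeping purpose and retention date on the same record as the source is one way to make the distinction between training copies and retained copies auditable; the specific rules shown are placeholders.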

For telecom providers who offer integrated AI services—such as customer analytics or intelligent routing—this decision underscores the need to vet third-party model suppliers and avoid exposure through indirect access to infringing data repositories.

Next Steps for Businesses

While the court’s fair use determination offers a favorable signal to the AI industry, the pending damages trial on the storage claim and the potential for appellate review leave considerable uncertainty. All AI stakeholders should:

  • Review internal data acquisition and storage practices;
  • Audit compliance with copyright licensing obligations;
  • Prepare for heightened scrutiny over how training datasets are acquired, stored, and shared.

 

The CommLaw Group Can Help

If your business is developing or deploying AI models, especially those trained on large language or media datasets, or if your services involve indirect exposure to third-party AI models, our firm can help you navigate the copyright, privacy, and telecom-specific implications.

Contact:

Susan Duarte – sfd@commlawgroup.com

Diana James – daj@commlawgroup.com

Brian Alexander – bal@commlawgroup.com
