Anthropic to pay $3,000 per book in $1.5 billion AI copyright settlement - Axios

Anthropic's Approach to Training Large Language Models Ruled Fair Use

In a recent court decision, a judge ruled that Anthropic's approach of buying physical books and making digital copies to train its large language models was fair use. However, the same judge also found that the company had acquired other copyrighted books illegally, and that this conduct was not protected by fair use.

What is Fair Use?

Fair use is a legal doctrine in the United States that allows limited use of copyrighted material without obtaining permission from the copyright holder. The purpose of fair use is to promote creativity and innovation by enabling individuals and organizations to use copyrighted works for specific purposes, such as criticism, commentary, news reporting, teaching, scholarship, or research.

Anthropic's Approach

Anthropic, a company specializing in artificial intelligence, uses large language models to generate human-like text responses. These models require vast amounts of training data, which is typically obtained through the internet and other digital sources. However, using copyrighted materials without permission could be considered copyright infringement.

In this case, Anthropic was accused of making digital copies of physical books to train its language models. The company's defense was that the copying qualified as fair use because training an AI model is a transformative use of the text, not a substitute for the books themselves.

The Ruling

The judge in the case ruled that Anthropic's approach of buying physical books and making digital copies was fair use. In summary, the court's reasoning was that:

  • The purpose and character of the use favored Anthropic: training a large language model on the books was treated as transformative, even though Anthropic is a commercial company.
  • The physical books had been lawfully purchased, so converting those copies to digital form for training did not by itself infringe.

However, the judge also found that part of Anthropic's conduct was not fair use:

  • Pirated copies: Anthropic obtained a large number of books from unauthorized online sources instead of buying them.
  • Keeping those copies: Storing the pirated books in the company's library was not excused by the fact that training on lawfully acquired books is fair use.

Consequences

While Anthropic's approach was deemed fair use in certain aspects, the company still faced consequences for its actions. These include:

  • Paying a settlement: Anthropic agreed to pay $1.5 billion, or about $3,000 per book, to settle the copyright claims.
  • Changing its practices: The company will need to revise how it obtains and uses copyrighted materials so that the works it relies on are lawfully acquired.

Implications

The ruling has significant implications for companies that build large language models and for the authors whose works are used to train them. It shows that how training material is acquired matters as much as how it is used, and that companies must respect the rights of creators when they source it.

In conclusion, while the court treated training on lawfully purchased books as fair use, Anthropic still faces a $1.5 billion settlement. The case serves as a reminder for companies to be cautious when using copyrighted materials and to confirm that those materials were obtained lawfully.

Fair Use Guidelines

To determine whether a particular use qualifies as fair use, U.S. courts weigh four factors set out in 17 U.S.C. § 107:

  1. The purpose and character of the use
  2. The nature of the copyrighted work
  3. The amount and substantiality of the portion used
  4. The effect of the use on the potential market for the original work

Companies using large language models must evaluate their approaches against all four factors. No single factor decides the outcome, and courts weigh them case by case.

Best Practices

To avoid similar issues in the future, companies can follow best practices:

  • Obtain permission: Seek permission from copyright holders before using their materials.
  • Use public domain works: Use public domain works, or Creative Commons licensed materials whose license terms permit the intended use.
  • Limit scope of use: Copy and retain only what is necessary for training.

By being mindful of fair use guidelines and following best practices, companies can ensure they are using copyrighted materials responsibly and respecting the rights of creators.
