The maker of the ChatGPT artificial intelligence tool, OpenAI, is being sued by two writers who say that by “training its model on books without their permission”, the company has violated copyright rules.
The complaint for a class action lawsuit was filed last week in a federal court in San Francisco by authors Mona Awad and Paul Tremblay. Some of Mona Awad’s works include Bunny and 13 Ways to Look at a Fat Girl. Paul Tremblay is the author of The Cabin at the End of the World.
ChatGPT allows users to send questions and commands to a chatbot, which responds with conversational, human-like text. The model powering ChatGPT is trained on data freely available on the internet.
According to the lawsuit, Awad and Tremblay believe their copyrighted books were illegally “taken” and “used to train ChatGPT”, pointing to the fact that the chatbot can generate “highly accurate summaries” of the novels. The complaint includes a number of exhibits, among them sample summaries.
According to Andres Guadamuz, a reader in intellectual property law at the University of Sussex, this is the first copyright lawsuit concerning ChatGPT. In his view, the case will explore the murky “boundaries of legality” of various operations in the field of generative AI.
According to an email sent to the Guardian by the writers’ lawyers, Joseph Saveri and Matthew Butterick, books tend to contain “high-quality, well-edited, extensive prose,” making them ideal training material for large language models.
The complaint alleges that OpenAI is “unfairly” profiting from “stolen writings and ideas” and seeks monetary damages on behalf of all US-based authors whose works were allegedly used to train ChatGPT. Authors of copyrighted works enjoy “extensive legal protection”, according to Saveri and Butterick, yet they must contend with companies like OpenAI that act as if these laws do not apply to them.
Even if the claim that ChatGPT was trained on copyrighted material turns out to be true, it may be difficult to prove that the writers suffered specific financial losses as a direct result. According to Guadamuz, ChatGPT might perform “exactly the same” even if it had never ingested the books, because it is trained on a wide variety of material taken from the internet, including, for example, discussions of the novels by internet users.
According to Lilian Edwards, professor of law, innovation and society at Newcastle University, “the likely outcome of this case will depend on whether courts consider the use of copyrighted material in this way as ‘fair use’ or simply unauthorized copying”. Both Edwards and Guadamuz emphasize that a similar action in the UK would be judged differently, as the UK does not have a “fair use” defense comparable to that in the US.
The UK government’s efforts to introduce a copyright exception allowing the free use of copyrighted material for text and data mining have run into an obstacle. According to Edwards, the proposed reform met strong opposition from authors, publishers and the music industry, who expressed deep displeasure with the changes.
ChatGPT made its debut in November 2022. Since its launch, the publishing industry has been buzzing with debate about how to protect authors from the potential dangers of the technology. In a recent development, the Society of Authors (SoA) released a list of “practical steps for members” to safeguard themselves and their work.
In a recent interview with the trade journal The Bookseller, Nicola Solomon, chief executive of the Society of Authors, said she was pleased to see authors taking legal action against OpenAI. Solomon emphasized that the SoA had deep concerns about the extensive copying of authors’ work for the purpose of training large language models.
The lawyers also called it “ironic” that tools for “so-called ‘artificial intelligence’” rely on human-created data: “The systems they use depend entirely on people’s inventiveness. If they bankrupt the human creators, they will soon go bankrupt themselves.”