GPT-4 Performs Poorly in Copyright Infringement Test Done by Patronus AI

A study released on Wednesday by Patronus AI, a company founded by former Meta researchers, reports that leading artificial intelligence (AI) models frequently reproduce copyrighted text. The company specializes in evaluating and testing large language models (LLMs), and it used its new tool, CopyrightCatcher, to test four prominent AI models: OpenAI's GPT-4, Anthropic's Claude 2, Meta's Llama 2, and Mistral AI's Mixtral.

Patronus AI's founders, machine learning experts Anand Kannappan and Rebecca Qian, describe the company as an automated evaluation and security platform that helps companies use LLMs confidently.

CopyrightCatcher detects instances where LLMs reproduce text verbatim from sources such as books. The tool scores LLM outputs based on the presence of copyrighted content and identifies the specific sections that contain such material.
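The article does not describe how CopyrightCatcher works internally. As a rough illustration of the general idea only (score an output for verbatim overlap and point to the matching sections), the sketch below uses word n-gram matching against a reference text. The function names, the five-word threshold, and the scoring formula are all assumptions made for this example and are not Patronus AI's method.

```python
def find_verbatim_spans(output, source, n=5):
    """Return (start, end, text) word spans of `output` copied from `source`.

    Illustrative assumption: a span counts as verbatim if every word in it
    belongs to some run of `n` consecutive words that also appears in
    `source`. Matching is case-insensitive and whitespace-tokenized.
    """
    out_words = output.lower().split()
    src_words = source.lower().split()

    # Every n-word sequence that occurs in the reference text.
    src_ngrams = {
        tuple(src_words[i:i + n]) for i in range(len(src_words) - n + 1)
    }

    # Mark each output word that is part of a matching n-gram.
    covered = [False] * len(out_words)
    for i in range(len(out_words) - n + 1):
        if tuple(out_words[i:i + n]) in src_ngrams:
            for j in range(i, i + n):
                covered[j] = True

    # Merge consecutive marked words into spans (end index is exclusive).
    spans = []
    start = None
    for idx, flag in enumerate(covered + [False]):
        if flag and start is None:
            start = idx
        elif not flag and start is not None:
            spans.append((start, idx, " ".join(out_words[start:idx])))
            start = None
    return spans


def overlap_score(output, source, n=5):
    """Fraction of the output's words that fall inside a verbatim span."""
    total = len(output.split())
    if total == 0:
        return 0.0
    copied = sum(end - start for start, end, _ in find_verbatim_spans(output, source, n))
    return copied / total


if __name__ == "__main__":
    # Made-up reference text and model reply, used only for demonstration.
    reference = "the quick brown fox jumps over the lazy dog near the river bank"
    reply = "sure here it is quick brown fox jumps over the lazy dog enjoy"
    print(find_verbatim_spans(reply, reference))
    print(round(overlap_score(reply, reference), 2))
```

A real evaluation tool would need more than this, for example punctuation-aware tokenization and an index over many source texts, but the sketch shows how a single score and a list of flagged sections can come from the same pass over the output.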

Testing AI Models

Patronus AI conducted an adversarial test to show how often these AI models respond to user queries with copyrighted text. The results were surprising: all four models produced copyrighted content. Notably, GPT-4, considered one of the most powerful models, produced copyrighted content on 44% of the constructed prompts.

OpenAI's GPT-4 fared worst. When asked to complete book text, it did so 60% of the time, and it returned a book's first passage 25% of the time. Claude 2 by Anthropic was more cautious, responding with copyrighted content on only 16% of completion prompts. Mistral's Mixtral model completed a book's first passage 38% of the time, and Meta's Llama 2 responded with copyrighted content on 10% of prompts.

The study underscored how readily AI models produce verbatim copyrighted content. Patronus AI co-founder and CEO Kannappan expressed surprise at the ease with which the models generated such content, saying the team had not initially expected it to be so straightforward to elicit.

Patronus AI tested the models using copyrighted books such as "Gone Girl" by Gillian Flynn and "A Game of Thrones" by George R.R. Martin. Researchers prompted each model either to provide the first passage of a book or to complete a passage of its text. Some of the generated outputs may be covered by fair use under U.S. law.

Broader Industry Implications

The research comes amid a growing conflict between OpenAI and publishers, authors, and artists over the use of copyrighted material as AI training data. The lawsuit between The New York Times and OpenAI is pivotal in the industry's ongoing debate over using copyrighted works to train top AI models.

The lawsuit claims that, when questioned about recent events, ChatGPT may produce "verbatim excerpts" from New York Times articles, which are normally behind a paywall and require a subscription to access. A class-action lawsuit filed by nonfiction authors Nicholas Basbanes and Nicholas Gage similarly seeks damages for copyright infringement on behalf of all individuals in the U.S. who hold the copyright or legally beneficial ownership.

The rise of AI-generated books on Amazon is raising worries about their impact on authentic reading experiences. The concerns stem from the potential of generative AI to replace human-created content, especially in creative fields. Patronus AI's findings add to the case for addressing copyright infringement by AI models.