Concerns are growing that ChatGPT and other chatbots are using copyrighted or scraped data as source material for artificial intelligence (AI) generated responses.
Key Details
- On June 28, Massachusetts-based authors Paul Tremblay and Mona Awad filed a lawsuit in a San Francisco federal court against ChatGPT creator OpenAI over claims of copyright infringement.
- Both authors claim that ChatGPT shows knowledge of their books that it could not have unless it were violating copyright law, saying it generates “very accurate summaries” of their works.
- “When ChatGPT is prompted, ChatGPT generates summaries of Plaintiffs’ copyrighted works—something only possible if ChatGPT was trained on Plaintiffs’ copyrighted works. Defendants, by and through the use of ChatGPT, benefit commercially and profit richly from the use of Plaintiffs’ and Class members’ copyrighted materials,” says the lawsuit.
Why It’s Important
The launch of ChatGPT on November 30, 2022, set off a race across the tech industry, with chatbots and large language models becoming the next big thing as dozens of companies and startups (including Microsoft, Google, Amazon, Apple, Meta, and Baidu) scramble to be first to market with the best AI products.
That mad dash has raised many concerns in the process, notably from watchdogs and critics who argue that the technology is too powerful and could create chaos in the wrong hands. Among the more notable critics, authors have repeatedly argued that AI tools violate copyright laws.
Generative AI works by predicting the most likely output for a given input, and it learns to do so by being trained on millions of existing examples, which are either supplied to it directly or drawn from the internet. In both cases, artists and authors have claimed that ChatGPT has shown evidence that it is sampling their work and repurposing it without credit or acknowledgment for AI-generated responses.
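To make the idea of "predicting the most likely output" concrete, here is a minimal sketch in Python. It is a toy word-counting model, not how ChatGPT is actually built, and the training sentences and function names are invented for illustration. It does show why a model's answers can only reflect the text it was trained on.

```python
from collections import Counter, defaultdict


def train_bigram_model(corpus):
    """Count, for every word, which words follow it in the training text."""
    follow_counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current_word, next_word in zip(words, words[1:]):
            follow_counts[current_word][next_word] += 1
    return follow_counts


def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical training sentences, invented for this sketch.
TOY_CORPUS = [
    "the author wrote the book",
    "the author signed the book",
    "the author wrote a sequel",
]

model = train_bigram_model(TOY_CORPUS)
print(predict_next(model, "author"))   # wrote
print(predict_next(model, "lawsuit"))  # None
```

The toy model has nothing to say about a word that never appeared in its training text, which is the same logic the plaintiffs rely on when they argue that accurate summaries imply their books were in the training data.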
As we previously reported, ChatGPT has severe limitations in its ability to imitate creative writing, only being able to mimic the loose outline of a story without consistent details or meaningful character development.
Key Takeaways
OpenAI is not the only company raising concerns about how big tech gathers data for AI. Gizmodo discovered last week that Google updated its privacy policy on July 1 to allow the tech giant to harvest and scrape almost any public data it wants and incorporate it into its Bard AI, meaning a large share of what a user has written online could be used for AI training without permission.
“Google uses information to improve our services and to develop new products, features, and technologies that benefit our users and the public. For example, we use publicly available information to help train Google’s AI models and build products and features like Google Translate, Bard, and Cloud AI capabilities,” says Google.