OpenAI's Training Data for LLMs Allegedly Included Copyrighted Books
Two authors alleged in a class action lawsuit that OpenAI infringed authors' copyrights by training its generative LLMs, such as ChatGPT, on data drawn from illegal "shadow libraries" that offer copyrighted books without permission.