CASE BACKGROUND
Apple has created a set of generative AI models, collectively called Apple Intelligence, that it provides to consumers on its phones, tablets, and personal computers. It has allegedly used authors’ copyrighted works without their authorization or compensation to train and test these models. Such use is impermissible and goes far beyond any applicable license Apple has to sell these books to the users of its products.
Apple reproduced and used datasets such as Books3, which consist of pirated, copyrighted books that include the published works of plaintiffs and the class, to train language models that power Apple Intelligence. Books3 is a notorious “shadow library” that can be found in various places on the internet or shared and downloaded through pirate websites and file-sharing protocols.
Apple also uses “Applebot,” a web-crawling software program that copies mass quantities of internet data (also known as “scraping”) to use as training data. Apple scraped data with Applebot for nearly nine years before disclosing that it intended to use the scraped data to train its AI systems. Web crawlers like Applebot scrape shadow libraries, including but not limited to Books3, that host millions of other unlicensed copyrighted books, including plaintiffs’ and class members’ copyrighted works.
Generative AI models like those used in Apple Intelligence are only as good as the data on which they are trained. Thus, Apple and other AI companies prioritize and use high-quality writing, like copyrighted works, to train and fine-tune their models.
The market for licensing AI training data is growing rapidly. Licensing deals between AI developers and publishers to use copyrighted works as training data are regularly in the news. Nevertheless, Apple did not compensate creators for the use of their copyrighted works and concealed the sources of its training datasets to evade legal scrutiny. Apple allegedly continues to retain private AI training data, including pirated books, in various datasets to train its future models without seeking plaintiffs’ or class members’ consent or providing compensation.
This alleged conduct has damaged plaintiffs’ and class members’ intellectual property. It has deprived them of control over their work, undermined the economic value of their copyrighted works, and positioned Apple to achieve massive commercial success through unlawful means.