The New York Times suing Microsoft and OpenAI

Sunday 14th January, 2024 - Bruce Sterling

https://nytco-assets.nytimes.com/2023/12/NYT_Complaint_Dec2023.pdf

(…)

Defendants’ unlawful use of The Times’s work to create artificial intelligence products that compete with it threatens The Times’s ability to provide that service. Defendants’ generative artificial intelligence (“GenAI”) tools rely on large-language models (“LLMs”) that were built by copying and using millions of The Times’s copyrighted news articles, in-depth investigations, opinion pieces, reviews, how-to guides, and more.

While Defendants engaged in widescale copying from many sources, they gave Times content particular emphasis when building their LLMs—revealing a preference that recognizes the value of those works. Through Microsoft’s Bing Chat (recently rebranded as “Copilot”) and OpenAI’s ChatGPT, Defendants seek to free-ride on The Times’s massive investment in its journalism by using it to build substitutive products without permission or payment.

3. The Constitution and the Copyright Act recognize the critical importance of giving creators exclusive rights over their works. Since our nation’s founding, strong copyright protection has empowered those who gather and report news to secure the fruits of their labor and investment. Copyright law protects The Times’s expressive, original journalism, including, but not limited to, its millions of articles that have registered copyrights.

4. Defendants have refused to recognize this protection. Powered by LLMs containing copies of Times content, Defendants’ GenAI tools can generate output that recites Times content verbatim, closely summarizes it, and mimics its expressive style, as demonstrated by scores of examples. See Exhibit J. These tools also wrongly attribute false information to The Times.

Case 1:23-cv-11195 Document 1 Filed 12/27/23 Page 3 of 69

5. Defendants also use Microsoft’s Bing search index, which copies and categorizes The Times’s online content, to generate responses that contain verbatim excerpts and detailed summaries of Times articles that are significantly longer and more detailed than those returned by traditional search engines. By providing Times content without The Times’s permission or authorization, Defendants’ tools undermine and damage The Times’s relationship with its readers and deprive The Times of subscription, licensing, advertising, and affiliate revenue.

6. Using the valuable intellectual property of others in these ways without paying for it has been extremely lucrative for Defendants. Microsoft’s deployment of Times-trained LLMs throughout its product line helped boost its market capitalization by a trillion dollars in the past year alone. And OpenAI’s release of ChatGPT has driven its valuation to as high as $90 billion. Defendants’ GenAI business interests are deeply intertwined, with Microsoft recently highlighting that its use of OpenAI’s “best-in-class frontier models” has generated customers—including “leading AI startups”—for Microsoft’s Azure AI product. [1]

7. The Times objected after it discovered that Defendants were using Times content without permission to develop their models and tools. For months, The Times has attempted to reach a negotiated agreement with Defendants, in accordance with its history of working productively with large technology platforms to permit the use of its content in new digital products (including the news products developed by Google, Meta, and Apple). The Times’s goal during these negotiations was to ensure it received fair value for the use of its content, facilitate the continuation of a healthy news ecosystem, and help develop GenAI technology in a responsible way that benefits society and supports a well-informed public.

[1] Microsoft Fiscal Year 2024 First Quarter Earnings Conference Call, MICROSOFT INVESTOR RELATIONS (Oct. 24, 2023), https://www.microsoft.com/en-us/Investor/events/FY-2024/earnings-fy-2024-q1.aspx.

8. These negotiations have not led to a resolution. Publicly, Defendants insist that their conduct is protected as “fair use” because their unlicensed use of copyrighted content to train GenAI models serves a new “transformative” purpose. But there is nothing “transformative” about using The Times’s content without payment to create products that substitute for The Times and steal audiences away from it. Because the outputs of Defendants’ GenAI models compete with and closely mimic the inputs used to train them, copying Times works for that purpose is not fair use.

9. The law does not permit the kind of systematic and competitive infringement that Defendants have committed. This action seeks to hold them responsible for the billions of dollars in statutory and actual damages that they owe for the unlawful copying and use of The Times’s uniquely valuable works….