And in a lawsuit, there's very much the question of intent as well.
If OpenAI never meant to allow copyrighted material to be reproduced, shut the behavior down as soon as it was discovered, and the NYT can't show any measurable harm (e.g. nobody was cancelling their NYT subscription because of ChatGPT), then the NYT may have a very hard time winning this suit on the copyright argument specifically.
Intent isn't some magic shield of innocence, though. Negligence is very much at play here: was OpenAI negligent in making the NYT's articles retrievable like this?
Sure, but even negligence may be hard to show here.
It's pretty clear that OpenAI couldn't have predicted every way users would interact with its model; we quickly saw things like prompt extraction and prompt injection emerge.
So not only is it plausible that OpenAI didn't know users could retrieve snippets of training material verbatim, it's also reasonable to say they weren't negligent in not knowing. It's a new technology that was never meant to operate that way. It's not that different from a security vulnerability that was patched quickly once discovered.
Negligence is about not showing reasonable care. That's going to be very hard to prove.
And it's not as though people started using ChatGPT as a replacement for the NYT. Even in a negligence claim, you have to show harm, and I think the NYT will be hard-pressed to show they lost a single subscriber.