I don’t doubt it does. It’s easy to get it to spit out long answers from Stack Overflow verbatim; I’ve done it. Maybe some of the “transformative” nature of the LLM output is the removal of any authorship, copyright, license, and edit history information. ;) The point here is to supplant Google as the portal of information, right? It doesn’t have new information, but it’s pretty good at remixing the words from multiple sources, when it has multiple sources. One possible reason for their legal woes wrt copyright is that it’s also great at memorizing things that only have one source. My college Markov-chain text predictor would do the same thing: it would easily get stuck in a local region of the training text and replay that one source verbatim whenever the current context didn’t match anything else.
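
For anyone curious, here’s a rough sketch of what I mean (word-level, order-2, in Python; the corpus and names are made up for illustration, not my actual college code). When a context only ever appeared in one place, there’s exactly one continuation to choose from at every step, so the chain just copies that source word for word.

    import random
    from collections import defaultdict

    def build_chain(text, order=2):
        """Map each tuple of `order` words to the words seen following it."""
        words = text.split()
        chain = defaultdict(list)
        for i in range(len(words) - order):
            context = tuple(words[i:i + order])
            chain[context].append(words[i + order])
        return chain

    def generate(chain, seed, length=30):
        """Walk the chain from `seed`, sampling among observed continuations."""
        context = tuple(seed)
        out = list(context)
        for _ in range(length):
            followers = chain.get(context)
            if not followers:
                break  # dead end: nothing ever followed this context
            out.append(random.choice(followers))  # one follower -> deterministic copy
            context = tuple(out[-len(context):])
        return " ".join(out)

    if __name__ == "__main__":
        # Two overlapping sources get remixed; the unique one comes back verbatim.
        corpus = (
            "the cat sat on the mat and the cat ate the fish "
            "the dog sat on the mat and the dog chased the cat "
            "to reverse a list in place swap the first and last elements repeatedly"
        )
        chain = build_chain(corpus)
        print(generate(chain, ["the", "cat"]))     # remixed from both cat/dog sources
        print(generate(chain, ["to", "reverse"]))  # only one source, so verbatim replay

The scale and architecture are obviously different with an LLM, but the single-source memorization failure mode looks awfully familiar.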