Any publisher can opt out of Google. Publishers also have substantial control over the titles and snippets shown in Google, whether an article appears in Google News, etc.
Paraphrasing is also known as cloning and is often a copyright violation
Copyright law doesn't mention opt outs or search engine snippet controls. It's not clear to me that robots.txt is the singular thing that makes Google legal.
In US copyright law, facts cannot be copyrighted, so copyright on factual content like newspaper articles is limited. Simply replacing a few words wouldn't work, but I am certain that GPT-4 is capable of paraphrasing factual content at a level that would not be considered infringement if a human did it.
If I make a website that scrapes NYT and passes it back and forth through a machine translator, say, English -> Spanish -> English, then the content will be slightly modified. Is this legal to make money off of?
Seems like the legal answer is unclear but, like Napster, such a system seems like it would lose in court.
It would be unlikely to be something you'd find paying customers for, though? I suppose if you charged a small percentage of what NYT charges people might be willing to consider it, but you'd have some costs for hosting etc., so I am skeptical about its viability as a business model...
I'd serve fake news en masse to low-IQ people who click things to feel good about their own views. I'd also build a handful of websites (ideally as many as I can personally manage) to flood the Internet with fake news clickbait.
One site clones Fox News. One clones Newsmax. And so on, cloning many news sites, sports sites, any news site. Automated, massive-scale content farming. Think of the websites recommended by Taboola but, realistically, a whole lot worse.
That’s not the only reason. Google Search is also transformative and non-competitive with the underlying publications. And that is why the opt-out is important. If you feel Google competes with your site, you don’t have to sue Google: just tell them to go away.
Transformative, yes, but so is ChatGPT. Much more so, actually. Non-competitive is debatable, especially with the instant answers Google shows in addition to regular snippets, which can also obviate the need to visit a site. I have a hard time seeing ChatGPT as competing with newspapers more than Google Search does.
Nobody is seriously going to ChatGPT and trying to trick it into regurgitating old NYT articles as an alternative to paying for access to NYT's archives. Meanwhile, newspapers went as far as getting the laws changed in several countries because they felt Google was competing with them too much and didn't like the fact that it was legal.
>Copyright law doesn't mention opt outs or search engine snippet controls. It's not clear to me that robots.txt is the singular thing that makes Google legal.
Genuinely - what are you talking about besides your own assumptions? You just assume everything Google does is legal, and therefore anyone else doing anything arguably similar must also be legal? Without regard for factual details that do matter to copyright law? Such as a license?? Your own description of copyright law here is very stunted - you can't paraphrase NYTimes articles and call it fair use. You can report on what the NYTimes reports on... because that's what news is.
Can they? Here's a reference to a legal fight where Google scraped song lyrics from a lyrics website and presented the lyrics verbatim directly to users (bypassing the original site and the ads that allowed that site to operate).