> Is it legal or not to scrape the web?
> If I scrape the web, is it legal to train a transformer on it? Why or why not?
At no point did I say anything about hosting a mirror of the NYT website with free articles. Obviously, because OpenAI didn't do that. Some NYT lawyer tried to get ChatGPT to write a NYT article. Maybe they should first have done a Google search and shut down some of the actual content farms that simply copy NYT content, such as [0]. But instead, we get this.
[0]: https://salaminv.com/news_file/