I don’t really buy this argument. It assumes companies like OpenAI are incentivized to be unselective about training material. Instead, what we see is things like making deals with Reddit for known valuable data. I don’t think any AI operator is training on brand-new, unvetted SEO spam websites by default now.
Oh god. Reddit has a select amount of good data compared to other markets. But if you actually read through a thread, you will find absolutely random things upvoted that stray into the zeitgeist.
If you give a valid opinion in the wrong subreddit, you get muted. The inverse is also true. You are using a filter these AIs don’t.