
Google built its business and its current $500+ billion status on stealing other websites' content and storing it on its servers.

Can't anyone exclude their pages from being scraped by Google if they so desire?

If you post something on a site that allows itself to be scraped, only then will it go on Google.



Practically, you can't, because Google has a quasi-monopoly on attention.

See also https://en.wikipedia.org/wiki/Network_effect


My point was that it is misleading to say that Google is "stealing" content, because it only indexes pages that don't opt out of being indexed.


Or if anybody else posts anything about you on such a site.


Trivially, with robots.txt, which is standardized and so well known among anyone running a web server that not having such a rule ought to be considered consent.

About the same as the rule here in Sweden on filming - you're free to film in any private venue until told not to.
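
For anyone who hasn't seen the robots.txt opt-out in practice, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt contents, the `example.com` URLs, and the `is_allowed` helper are made up for illustration. It only shows how a well-behaved crawler would read the rules, not how Google's crawler actually works.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shuts out Googlebot entirely, allows everyone else.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Disallow:
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("Googlebot", "SomeOtherBot"):
        allowed = is_allowed(ROBOTS_TXT, agent, "https://example.com/article")
        print(agent, "allowed" if allowed else "blocked")
```

Note that this is purely advisory: the file states the site's wishes, and it is up to the crawler to honor them.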



