Hacker News

I am unfamiliar with the details of a robots.txt file. Is it possible to specify "I only want bots x, y and z to crawl my site"?
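
For reference, an allowlist like this can be written in robots.txt by giving the named bots an empty Disallow and everyone else a blanket one. Here is a minimal sketch that checks such a file with Python's standard urllib.robotparser. The bot names xbot, ybot and zbot and the example.com URL are made-up stand-ins for "x, y and z", not real crawlers.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical allowlist: three named bots may crawl everything,
# every other user agent is asked to stay out entirely.
ROBOTS_TXT = """\
User-agent: xbot
User-agent: ybot
User-agent: zbot
Disallow:

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent, url="https://example.com/page"):
    """Return True if the parsed rules permit `agent` to fetch `url`."""
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for agent in ("xbot", "ybot", "zbot", "randomcrawler"):
        print(agent, allowed(agent))
```

Note that this check runs on the crawler's side: the file only states what the site owner wants, and it is up to each bot to read it and comply.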



How difficult is it for a bot to lie?

Though if you've made it clear that only x, y and z can crawl your site, and someone spoofs, say, y, then it would be easy to demonstrate that someone has done something they know they shouldn't.


Incredibly easy. A bot identifies itself through a User-Agent string that it reports itself, so it can claim to be anyone.

And not only can the bot lie, it can disregard the robots.txt file altogether. As with a terms of service document for humans, you can choose to ignore it and deal with the consequences (blocked IPs, a lawsuit, etc.).

robots.txt is just a version of the TOS that computers can read.
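
To make the "incredibly easy" part concrete, here is a small sketch using Python's standard urllib.request. It only builds a request object and sends nothing over the network. The name ybot and the example.com URL are hypothetical placeholders for bot "y" from the question above.

```python
from urllib.request import Request


def build_request(url, claimed_agent):
    """Build (but do not send) a request carrying a self-chosen User-Agent."""
    # The header is plain text set by the client; nothing verifies it.
    return Request(url, headers={"User-Agent": claimed_agent})


req = build_request("https://example.com/page", "ybot")

if __name__ == "__main__":
    # urllib stores header names capitalized as "User-agent".
    print(req.get_header("User-agent"))
```

Because the server sees only this string, a site that wants to confirm a crawler's identity has to rely on something else, such as the source IP address.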



