
Your initial comment made sense after I read through the OpenAI docs page. So I opened up my site to add those to robots.txt, and it turns out I had already added all 3 of those user agents to my robots file [0]. Out of curiosity I asked ChatGPT about my site and it did scrape it. It even mentioned articles that were published after I added the robots file.

[0]: https://yusuf.fyi/robots.txt



Can you share that ChatGPT transcript?

One guess: ChatGPT uses Bing for search queries, and your robots.txt doesn't block Bing. If that's what is happening here, I agree that this is really confusing and should be clarified on the OpenAI bots page.



Yeah, wow that's a lot of information for a site that's supposedly blocked using robots.txt!

My best guess is that this is a Bing thing - ChatGPT uses Bing as its search partner (though they don't make that very obvious at all), and BingBot isn't blocked by your site.

I think OpenAI need to be a whole lot more transparent about this. It's very misleading to block their crawlers and have it not make any difference at all to the search results returned within ChatGPT.
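
For anyone who wants to check their own rules, here's a rough sketch using Python's standard urllib.robotparser. The robots.txt text below is a made-up example (not the actual file linked above), the example.com URL is a placeholder, and blocked_agents is just a helper name I picked. It assumes the three OpenAI user agents are GPTBot, ChatGPT-User and OAI-SearchBot. It only tells you what the file says about each user agent; it can't confirm where ChatGPT actually gets its results from.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration only; swap in your own file's text.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: OAI-SearchBot
Disallow: /

User-agent: *
Allow: /
"""


def blocked_agents(robots_txt, agents, url="https://example.com/"):
    """Return {agent: True if robots_txt disallows that agent from fetching url}."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return {agent: not parser.can_fetch(agent, url) for agent in agents}


result = blocked_agents(
    ROBOTS_TXT, ["GPTBot", "ChatGPT-User", "OAI-SearchBot", "bingbot"]
)
for agent, blocked in result.items():
    print(f"{agent}: {'blocked' if blocked else 'allowed'}")
```

With a file like this one, the three OpenAI agents come out blocked while bingbot falls through to the catch-all rule and stays allowed, which would fit the Bing guess.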



