I wasn't aware. Can you please update Wikipedia then: https://en.wikipedia.org/w...

phit_ · 2025-02-07T13:08:27 1738933707

Their own docs also specify that robots.txt does not stop a page from being indexed or showing up in search; they even bolded it: "it is not a mechanism for keeping a web page out of Google"

https://developers.google.com/search/docs/crawling-indexing/...

threeseed · 2025-02-07T20:55:10 1738961710

From the docs:

“While Google won't crawl or index the content blocked by a robots.txt file”

They will show the URL if someone else has linked to it. But the content itself is not indexed.

alphan0n · 2025-02-07T13:48:03 1738936083

The only way for links to appear in a Google search would be to host a public resource that is linked from another public resource.

If you have specified in your robots.txt that you do not want the page(s) or directories ingested, then only the URL is indexed (if it is linked from another page). It does prevent the public display of the page's content and the creation of a description/summary.

https://support.google.com/webmasters/answer/7489871?hl=en
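
To make the crawl-vs-listing distinction concrete, here is a rough sketch using Python's standard `urllib.robotparser`. The robots.txt contents, the `example.com` URLs, and the `may_crawl` helper are all made up for illustration. It only answers "may this crawler fetch the page's content?"; it says nothing about whether the bare URL can still be listed when another page links to it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents; the path is an assumption for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical URLs on a placeholder host.
    for url in (
        "https://example.com/private/report.html",
        "https://example.com/public/index.html",
    ):
        print(url, may_crawl(ROBOTS_TXT, "Googlebot", url))
```

A well-behaved crawler would skip fetching the `/private/` page, so its content and summary stay out of the index, but the rule itself does nothing to stop the URL from being discovered through links elsewhere.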

nottorp · 2025-02-07T12:50:35 1738932635

It must be nice to believe everything people say by default... ;)