
Did you actually read your link? That's not at all what it says.


To be clear, Google stopped supporting robots.txt noindex a few years ago.

Combined with the fact that Google might list your site [based only on third-party links][1], robots.txt isn't an effective way to remove your site from Google's results.

Sorry, could have been clearer.

[1]: https://developers.google.com/search/docs/advanced/robots/in...
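For anyone skimming: a plain Disallow rule is the part of robots.txt that still works, but it only controls crawling. A minimal sketch (the /private/ path is just a placeholder):

  # robots.txt: blocks crawling of matching URLs, but does not remove them from the index
  User-agent: *
  Disallow: /private/

If the URL is linked from somewhere else, Google can still index it without ever fetching it.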


This page has a little more detail: https://developers.google.com/search/docs/advanced/crawling/...

"If other pages point to your page with descriptive text, Google could still index the URL without visiting the page. If you want to block your page from search results, use another method such as password protection or noindex. "


>noindex in robots meta tags: Supported both in the HTTP response headers and in HTML, the noindex directive is the most effective way to remove URLs from the index when crawling is allowed.

Seems clear enough to me
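For reference, the two supported forms look roughly like this; the header variant is useful for non-HTML files like PDFs. In the page's HTML:

  <meta name="robots" content="noindex">

Or as an HTTP response header:

  X-Robots-Tag: noindex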


Quote from the linked article:

“For those of you who relied on the noindex indexing directive in the robots.txt file, which controls crawling, there are a number of alternative options:”

The first option is the meta tag. It does mention an alternative directive for robots.txt, however.


What about blocking Googlebot by its IPs, combined with the user agent? Wouldn't that stop the crawlers?

Google crawler IPs: https://www.lifewire.com/what-is-the-ip-address-of-google-81...
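Blocking by user agent is straightforward; here's a minimal sketch assuming an nginx server (note this only stops crawling, it doesn't remove already-linked URLs from the index):

  # inside a server/location block: refuse requests whose user agent claims to be Googlebot
  if ($http_user_agent ~* "googlebot") {
      return 403;
  }

Blocking by IP is trickier, since the crawler's address ranges change over time and the list would need to be kept up to date.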


That will stop the crawlers, but you could still show up in the search results because of links from other pages. From GP:

> If other pages point to your page with descriptive text, Google could still index the URL without visiting the page



