
Thanks!

I wonder where this discrepancy comes from




Probably under-indexing of non-English sources by these crawlers.

Would be interesting if Yandex opened some data sets!


And lots of people write on the web using English as a second language, which both reduces the presence of their native language and increases the presence of English.


Yep, not a native English speaker here, and yet my online footprint is mostly English due to software pushing me to learn it.


My guess is that reference counting at depth=1 only captures non-$LANG content whose text parts don't matter much, e.g. photo galleries.



