
This is so spot on.

For some years now I've been thinking that, since it no longer takes much effort to scan the entire internet (masscan, etc.), it might not be hard to find all the trackers and then crawl the whole DHT.

I'm not sure if this is feasible, but it might be an interesting start.
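To illustrate the "find trackers by scanning" half of that idea: a port scan only gives you hosts with an open port, so you'd still want to check whether each hit actually speaks the UDP tracker protocol (BEP 15). A minimal sketch of that check, assuming a plain connect handshake is enough evidence; the host, port, and timeout below are placeholders:

    import os
    import socket
    import struct

    # BEP 15 "connect" request: magic protocol id, action = 0, random transaction id.
    PROTOCOL_ID = 0x41727101980
    ACTION_CONNECT = 0

    def is_udp_tracker(host, port, timeout=3.0):
        transaction_id = int.from_bytes(os.urandom(4), "big")
        request = struct.pack("!QII", PROTOCOL_ID, ACTION_CONNECT, transaction_id)

        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.settimeout(timeout)
            sock.sendto(request, (host, port))
            try:
                response, _ = sock.recvfrom(1024)
            except socket.timeout:
                return False

        if len(response) < 16:
            return False
        # Reply should echo our transaction id with action = connect,
        # followed by a 64-bit connection id.
        action, tid, _connection_id = struct.unpack_from("!IIQ", response)
        return action == ACTION_CONNECT and tid == transaction_id

    if __name__ == "__main__":
        # Hypothetical host; in practice you'd feed in addresses from the scan.
        print(is_udp_tracker("tracker.example.org", 6969))

Anything that answers the handshake correctly can then go into your tracker list for scraping or for seeding the DHT crawl.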



I've played around a bit with DHT indexing recently, and a very simple Python program using libtorrent to send sample_infohashes queries (BEP 51) and download metadata (to get names/files) was enough to get me 1-2 .torrent files per second without any special effort or aggressive settings. The bottleneck (by about 10x) has been the embarrassingly parallel info-hash-to-.torrent step, so speeding things up shouldn't be very hard.
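For the info-hash-to-.torrent step, a rough sketch of what that can look like with libtorrent's Python bindings (assuming libtorrent 1.2 or newer; error handling, DHT bootstrap waiting, and parallelism are all omitted):

    import time
    import libtorrent as lt

    def fetch_torrent(info_hash_hex, save_path="."):
        # A session with the DHT enabled; default bootstrap nodes are used.
        ses = lt.session({
            "listen_interfaces": "0.0.0.0:6881",
            "enable_dht": True,
        })

        # Turn the bare info hash into a magnet link and add it.
        params = lt.parse_magnet_uri("magnet:?xt=urn:btih:" + info_hash_hex)
        params.save_path = save_path
        # We only want the metadata, not the payload.
        params.flags |= lt.torrent_flags.upload_mode
        handle = ses.add_torrent(params)

        # Wait until some peer has sent us the metadata (BEP 9).
        while not handle.status().has_metadata:
            time.sleep(1)

        # Serialize the metadata back out as a .torrent file.
        ti = handle.torrent_file()
        ct = lt.create_torrent(ti)
        with open(info_hash_hex + ".torrent", "wb") as f:
            f.write(lt.bencode(ct.generate()))
        return ti.name()

Since each info hash is independent, the easy speedup is just adding many such magnets to one session (or sharding the hashes across processes) and letting them resolve concurrently.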

After running it sporadically for a few months I ended up with 1.4M torrent names and 30M info hashes, but I never put any work into estimating the size of the DHT, so I don't know what sort of coverage that represents.



