Hacker News

If you're interested in a publicly queryable index of the web, you could try running a search server such as Elasticsearch over the Common Crawl[1] corpus. Elasticsearch powers the search backend of WordPress, 600 million+ documents in total[2], so extending it to a Common Crawl archive seems feasible.

n.b. I'm a data scientist at Common Crawl, so have a vested interest!

Also, whatever experiment you end up pursuing, remember to use spot instances if your setup can tolerate transient nodes - it'll substantially decrease your burn rate (usually 1/10th the on-demand price), allowing for even larger and more ambitious experiments :)
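To make the savings concrete, here's a back-of-the-envelope sketch. The hourly price and cluster size below are made-up placeholders, not real AWS quotes; only the "1/10th the price" ratio comes from the comment above:

```python
# Hypothetical numbers for illustration only.
on_demand_price = 0.40   # $/hour per node (placeholder, not a real quote)
spot_fraction = 0.10     # spot is "usually 1/10th the price"

nodes = 20
hours = 48

on_demand_cost = on_demand_price * nodes * hours
spot_cost = on_demand_price * spot_fraction * nodes * hours

print(f"on-demand: ${on_demand_cost:.2f}, spot: ${spot_cost:.2f}")
# Put differently: the same budget buys ~10x the node-hours on spot.
```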

[1]: http://commoncrawl.org/

[2]: http://gibrown.com/2014/01/09/scaling-elasticsearch-part-1-o...
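For a sense of what feeding Common Crawl records into Elasticsearch could look like, here's a minimal sketch that builds the NDJSON body expected by Elasticsearch's bulk API. The index name and record fields are hypothetical stand-ins; a real pipeline would parse documents out of the WARC files in the Common Crawl buckets rather than use inline sample data:

```python
import json

def to_bulk_ndjson(records, index_name="commoncrawl"):
    """Convert crawl records into an Elasticsearch bulk-API body.

    Each record becomes an action line followed by a document line;
    the whole body must be newline-delimited JSON ending in a newline.
    """
    lines = []
    for rec in records:
        lines.append(json.dumps({"index": {"_index": index_name, "_id": rec["url"]}}))
        lines.append(json.dumps({"url": rec["url"], "content": rec["content"]}))
    return "\n".join(lines) + "\n"

# Hypothetical records standing in for parsed WARC entries.
records = [
    {"url": "http://example.com/", "content": "Example Domain"},
    {"url": "http://commoncrawl.org/", "content": "Common Crawl"},
]
body = to_bulk_ndjson(records)
# `body` could then be POSTed to an Elasticsearch cluster's /_bulk endpoint.
```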




I had a crawling project where I wanted to get a sense of a few ad-related things on the internet. I came upon Common Crawl and was initially excited, since I thought it would have incidentally captured the data I wanted, but I was disappointed to find that they don't do any kind of JS execution, which drastically limited its usefulness for me.


I'd never heard of Common Crawl before, but it looks like an awesome project! Keep up the good work!


How up-to-date is Common Crawl's data?



