
I've done scraping distributed over many IPs. I used Luminati, bought X IPs, ran a bash script to download all the IPs to a file, then read the file in from Python and spun up a new thread for each IP. The IPs were used only as proxies; all activity was controlled by a single server.
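Roughly what the thread-per-IP part looks like, as a minimal stdlib-only sketch. The file format (one `host:port` per line), the function names, and the `urls_for` callback are assumptions for illustration, not the exact script I ran or anything provider-specific:

```python
import threading
import urllib.request


def load_proxies(path):
    """Read one proxy address per line, skipping blank lines."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]


def fetch_via_proxy(proxy, url, timeout=30):
    """Fetch url through one HTTP proxy; proxy is assumed to be 'host:port'."""
    handler = urllib.request.ProxyHandler(
        {"http": f"http://{proxy}", "https": f"http://{proxy}"}
    )
    opener = urllib.request.build_opener(handler)
    with opener.open(url, timeout=timeout) as resp:
        return resp.read()


def run_all(proxies, urls_for, fetch=fetch_via_proxy):
    """Start one thread per proxy; each thread works through its own URL list.

    urls_for(proxy) is a hypothetical callback that returns the URLs
    assigned to that proxy. Results and errors are collected per proxy.
    """
    results = {}
    lock = threading.Lock()

    def worker(proxy):
        out = []
        for url in urls_for(proxy):
            try:
                out.append((url, fetch(proxy, url)))
            except Exception as exc:  # keep going if one request fails
                out.append((url, exc))
        with lock:
            results[proxy] = out

    threads = [threading.Thread(target=worker, args=(p,)) for p in proxies]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Everything runs on the one controlling server; the proxies only relay the requests. With a large number of IPs a bounded thread pool would probably be kinder to the machine than one thread each.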

Re Amazon: I've done that too. You start up an EC2 instance that does the work and sends back the results. In my case each instance wrote its results to an S3 bucket, and I pulled everything from that bucket elsewhere once all the machines had finished. Ultimately you pay for what you use, so if you only need an hour across a ton of machines and pick the cheapest instance type, it's pretty cheap.
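A sketch of the S3 hand-off, assuming boto3 and one JSON object per worker. The bucket name, the `results/` prefix, the worker ID, and the helper names are all made up for illustration; launching the EC2 instances themselves is left out:

```python
import json


def make_s3_client():
    """Create a real S3 client (needs boto3 installed and AWS credentials)."""
    import boto3  # third-party; imported lazily so the helpers work with any client
    return boto3.client("s3")


def upload_result(s3, bucket, worker_id, records):
    """Run on each worker when it finishes: write one JSON object per worker."""
    key = f"results/{worker_id}.json"
    s3.put_object(Bucket=bucket, Key=key, Body=json.dumps(records).encode("utf-8"))
    return key


def collect_results(s3, bucket, prefix="results/"):
    """Run elsewhere once all workers are done: pull and merge every result."""
    merged = []
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            body = s3.get_object(Bucket=bucket, Key=obj["Key"])["Body"].read()
            merged.extend(json.loads(body))
    return merged
```

On a worker this would be something like `upload_result(make_s3_client(), "my-scrape-bucket", "worker-17", rows)`, and afterwards `collect_results(make_s3_client(), "my-scrape-bucket")` from wherever you want the data.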

FYI: Luminati will give you unlimited bandwidth for like $1/IP/month.



Worth noting that price is for datacenter IPs - residential ones are much more expensive and I’ve heard that IG is rejecting more datacenter traffic as of late.


If you get exclusive datacenter IPs, they're much less likely to be rejected, and they're not significantly more expensive.


Is this the company that offers a free VPN extension and pays for it by selling the bandwidth of those people's machines? I always thought their business model was way too shady to be legal.


They sold a majority stake at a valuation of $200MM and it's now a separate company from the one that runs the VPN.


Yeah. There have been numerous cases of people just grabbing a stolen credit card and getting a botnet to attack sites with.



