I've done scraping distributed over many IPs. I used Luminati: bought X IPs, ran a bash script to download the IP list to a file, read that file in from Python, and spun off a new thread for each IP. The IPs were used only as proxies; all activity was controlled from a single server.
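For anyone wanting to try the same thing, here is a rough sketch of the "read the file, one thread per IP" part. It is not my original script. The file format (one `host:port` per line), the function names, and the use of plain `urllib` are all assumptions for illustration, and a real proxy provider will likely need credentials on top of this.

```python
import threading
import urllib.request


def load_proxies(path):
    """Read one proxy per line (assumed 'host:port'), skipping blank lines."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]


def fetch_via_proxy(proxy, url, timeout=10):
    """Fetch url through a single HTTP proxy and return the response body."""
    handler = urllib.request.ProxyHandler(
        {"http": f"http://{proxy}", "https": f"http://{proxy}"}
    )
    opener = urllib.request.build_opener(handler)
    with opener.open(url, timeout=timeout) as resp:
        return resp.read()


def run_all(proxies, url, fetch=fetch_via_proxy):
    """Start one thread per proxy; collect each body (or exception) by proxy."""
    results = {}
    lock = threading.Lock()

    def worker(proxy):
        try:
            outcome = fetch(proxy, url)
        except Exception as exc:  # keep going if one proxy is dead
            outcome = exc
        with lock:
            results[proxy] = outcome

    threads = [threading.Thread(target=worker, args=(p,)) for p in proxies]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Usage would be something like `run_all(load_proxies("proxies.txt"), "http://example.com")`, where `proxies.txt` is a hypothetical filename. One thread per IP is fine for a few hundred proxies; beyond that a thread pool is probably the saner choice.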
Re Amazon, I've done that too. The way it works is you start up EC2 instances that do the work and send back the results. In my case each machine sent its results to an S3 bucket, and I pulled everything from that bucket elsewhere once all the machines had finished. Ultimately you pay for what you use, so if you only run for an hour across a ton of machines and pick the cheapest instance type, it's pretty cheap.
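A minimal sketch of that "workers write to S3, collect afterwards" pattern, assuming `boto3` is installed and credentials are already configured. The `results/` key prefix, the JSON payload, and the function names are made up for illustration, not what I actually ran.

```python
import json


def make_s3_client():
    """Create an S3 client; boto3 is third-party and imported lazily here."""
    import boto3
    return boto3.client("s3")


def upload_result(client, bucket, worker_id, result):
    """Called on each EC2 worker: store its result as one JSON object."""
    key = f"results/{worker_id}.json"  # hypothetical key layout
    client.put_object(
        Bucket=bucket, Key=key, Body=json.dumps(result).encode("utf-8")
    )
    return key


def collect_results(client, bucket, prefix="results/"):
    """Called elsewhere once all workers finish: download and merge results."""
    merged = {}
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            body = client.get_object(Bucket=bucket, Key=obj["Key"])["Body"].read()
            merged[obj["Key"]] = json.loads(body)
    return merged
```

Giving each worker its own key means the machines never have to coordinate with each other, and the collector only needs to list a prefix.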
FYI: Luminati will give you unlimited bandwidth for something like $1/IP/month.
Worth noting that price is for datacenter IPs - residential ones are much more expensive and I’ve heard that IG is rejecting more datacenter traffic as of late.
Is this the company that offers a free VPN extension and pays for it by selling those users' bandwidth? I always thought their business model was way too shady to be legal.