Yeah, as far as I can tell they removed the 300 MB limit from the limits page less than an hour ago; I'm sure they're still cleaning up the old limits as we speak.
We have a pretty big e-commerce scraping project, and we don't really run into many problems with blacklisted IPs. There are a few sites that we get consistent bans from, but with Elastic IPs it's pretty much a non-issue. I have yet to see a site that bans all Amazon IPs.
So maintain a blacklist of elastic IPs. If it's too big for you, make it a community effort.
Those are bad reasons to close your site to all of AWS.
As nupark2 mentioned, there are legitimate users routing traffic through EC2, even some bots that you'd want to visit your site. Archive.org comes to mind (many of their scrapers are or were behind AWS). Closing your site or app to a large swath of the web is the wrong solution. It's like killing a spider with a bazooka.
Unlike the assumptions you're limited to making, I know how much of my AWS traffic is human, and it's really very very very small. The sad reality is I'm sick and tired of rogue bots, and the tiny sliver of collateral damage can fill out the CAPTCHA validation every so often.
(I also blacklist GWS, Rackspace, Linode, SoftLayer, reliablehosting, ovh.net, node4, netdirect, layer42, all Tor exits... it's actually a pretty huge list.)
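For anyone wondering what a range-based check like this could look like, here is a minimal sketch, not anyone's actual setup. It assumes the blocked providers are kept as a list of CIDR ranges and that a matching visitor gets a CAPTCHA instead of a hard block. The ranges below are hypothetical placeholders (reserved documentation blocks, not real provider ranges), and `needs_captcha` is a made-up name.

```python
import ipaddress

# Hypothetical placeholder ranges (reserved documentation blocks),
# standing in for whatever hosting-provider ranges you actually collect.
BLOCKED_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in ("192.0.2.0/24", "198.51.100.0/24")
]


def needs_captcha(client_ip: str) -> bool:
    """Return True if the client IP falls inside any blocked range."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in network for network in BLOCKED_NETWORKS)


if __name__ == "__main__":
    print(needs_captcha("192.0.2.17"))   # inside the first placeholder range
    print(needs_captcha("203.0.113.5"))  # outside both placeholder ranges
```

A linear scan is fine for a few hundred ranges; a really huge list would probably want a prefix tree or a sorted interval lookup instead.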
Completely unrelated to dotcloud, but HN has no "forgot my password" function that I could ever find when I needed it. On that I'll agree that having one is super useful.
We are looking for another full-time engineer to join our data platform team. This team manages everything from extracting data to delivering insights. We also help other teams answer their questions with data. If you are a data engineer with an interest in data science, this would be a great fit for you, as we do a little bit of everything.
Stack: Python, Spark, AWS (EMR, S3, EC2, etc.), Git, Jupyter, Parquet, MySQL
You can apply here: https://jobs.lever.co/coupa/5cdd0957-37cd-4419-8537-de60b3bd...