Tell HN: I DDoSed myself using CloudFront and Lambda Edge and got a $4.5k bill
274 points by huksley on June 28, 2022 | 333 comments
I am using awesome NextJS and serverless-nextjs and deploy my app to CloudFront and Lambda@Edge.

I made a mistake and accidentally created a serverless function that called itself. In a recursive loop, with a 30s timeout. I thought I fixed it and deployed the code to the dev environment.
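For anyone wondering what that kind of bug looks like, here is a made-up sketch (not the actual code) of a Lambda@Edge handler that ends up calling itself through its own distribution:

    // Hypothetical Lambda@Edge origin-request handler (illustration only).
    // The bug: it fetches a URL that is served by the same CloudFront
    // distribution, so every invocation triggers another invocation.
    import type { CloudFrontRequestEvent, CloudFrontRequestResult } from "aws-lambda";

    const SITE_URL = "https://dev.example.com"; // same distribution = the loop

    export const handler = async (
      event: CloudFrontRequestEvent
    ): Promise<CloudFrontRequestResult> => {
      const request = event.Records[0].cf.request;
      if (request.uri.startsWith("/api/render")) {
        // This request goes back through CloudFront, which invokes this
        // handler again; with a 30s timeout, every hop keeps billing.
        await fetch(`${SITE_URL}${request.uri}`);
      }
      return request;
    };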

I have had an AWS Billing alert (Budgets) set up to notify me when my monthly spend goes over $300 (my usual bill is $200/month).

Imagine the terror when I woke up the next day to see the AWS Billing alert email saying I already owed $1,484! I removed the function and redeployed within 30 minutes, but it was too late. It had already been running for 24 hours, using over 70 million GB-seconds!

Only after that did I learn that AWS Billing alerts do not work this way for CloudFront. Charge information is delayed because it is collected from all regions.

The following day, the bill settled at a shocking $4,600. That is more than we have ever spent on AWS in total.

CloudFront includes the AWS Shield Standard feature, but somehow, it was not activated for this case (Lambda@Edge calling itself via CloudFront).

Now I understand that I should have created CloudWatch alarms to alert me when the number of requests exceeds a limit. The problem is that they need to be set up per region, and I got CloudFront charges from all points of presence.
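That said, the per-distribution Requests metric is reported in CloudWatch in us-east-1 under the AWS/CloudFront namespace with a Global dimension, so a single alarm there can at least catch a request spike. A rough sketch (distribution ID, threshold, and SNS topic are placeholders):

    // Sketch: one alarm on the global CloudFront Requests metric (lives in us-east-1).
    import {
      CloudWatchClient,
      PutMetricAlarmCommand,
    } from "@aws-sdk/client-cloudwatch";

    const cw = new CloudWatchClient({ region: "us-east-1" });

    await cw.send(
      new PutMetricAlarmCommand({
        AlarmName: "cloudfront-request-spike",
        Namespace: "AWS/CloudFront",
        MetricName: "Requests",
        Dimensions: [
          { Name: "DistributionId", Value: "E1234567890ABC" }, // placeholder
          { Name: "Region", Value: "Global" },
        ],
        Statistic: "Sum",
        Period: 300,               // 5-minute buckets
        EvaluationPeriods: 1,
        Threshold: 100000,         // requests per 5 minutes -- tune to your traffic
        ComparisonOperator: "GreaterThanThreshold",
        AlarmActions: ["arn:aws:sns:us-east-1:123456789012:billing-alerts"], // placeholder
      })
    );

The Lambda@Edge invocation metrics themselves still show up in whichever regions the function ran, so those alarms do remain a per-region affair.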

I am a big proponent of the serverless approach. It makes it easy to scale and develop things (e.g., you get PR review version branches for free, both frontend and backend code like Vercel does). But now, I am unsure because such unexpected charges can ruin a side-project or emerging startup.

Now I am waiting on a response from AWS Support on these charges; maybe they can help me waive part of that.

What is your experience with this? Would you recommend using it to build a new product if you are a bootstrapped, 3-person startup?



I'm very much on the boring technology side of things with respect to hosting.

40€/month gets you a very powerful dedicated server that can easily handle millions of requests per day, performs incredibly well, and is easy to manage.

If you also use containers you even get quite a bit of flexibility and agility.

To be honest I don't really understand the sentiment that developers can get away with not knowing basic sysadmin stuff and at the same time have to spend significant amounts of time, energy and money to get up to speed with cloud solutions, k8s and so on.

But then again, I'm not one of the cool kids...


> To be honest I don't really understand the sentiment that developers can get away with not knowing basic sysadmin stuff and at the same time have to spend significant amounts of time, energy and money to get up to speed with cloud solutions, k8s and so on.

Agreed. I hear some saying it's nice to deploy on a lambda because you don't need to know anything about the runtime environment. But it's never quite true. As you say, you do have to become an expert in all the intricacies of these proprietary deployment environments to get the best out of it and avoid getting burned. So an amount of effort has to be spent, anyway.

But the drawback is that now you spend this effort on learning what is after all a proprietary product of AWS. While AWS is massively popular, the knowledge doesn't translate to anywhere else so it locks you in. If you spent similar time learning the basics of Linux deployment and administration, your knowledge is lower level and more general.

But where it really gets you is when things go wrong and you need to dig deep to diagnose. Can I attach a debugger or watch socket traffic or run bpftrace or do any kind of diagnostics at all on that lambda instance? Oh sorry no, good luck.

Unless your budget is so low you can't afford a $5 VPS and your traffic is so low that your lambda bill will never reach $5, you're really better off deploying on a VPS that you control and can debug. Keep it simple.


> But the drawback is that now you spend this effort on learning what is after all a proprietary product of AWS. While AWS is massively popular, the knowledge doesn't translate to anywhere else so it locks you in. If you spent similar time learning the basics of Linux deployment and administration, your knowledge is lower level and more general.

Exactly. I learned Linux in the late 90s. There is some newer software, like nginx instead of Apache, but I can bring up a new VPS and set it up using essentially the same knowledge from the late 90s. With a CDN and caching, you really do have a massively scalable service.

Don't get me wrong, I love the cloud, but I will never use it for personal projects. It just feels like renting vs owning. My personal projects will have bugs, and I don't want to be on the hook for thousands of dollars.

Learn Linux/Sysadmin once, and you can likely use that knowledge forever.


Could not agree more. I have multiple VPS servers running for personal email, file storage, CI, and some side projects. Maintenance and management effort is really low - everything is backed up to S3, and if things go down it will take an hour or so to restore, but I can tolerate that kind of downtime as long as data is not lost. I think a cloud platform is easier to sell once you need robust HA.

As for performance, one of my projects can sustain 300 rps, which is more than enough. Sure, it cannot scale automatically to thousands or millions, but I don't think many projects have that requirement.


Visa does about 2,000 rps for all its payment stuff. So it is what it is :)


I couldn't agree more.

On top of that, cloud tech can't be "set and forget" because of deprecation of apis and services.


I'm sure we weren't the only ones bitten by that in the last few years, when AWS deprecated node 12 and then .NET core 2.1. Yes there were warning emails etc. and there's good reason to keep up with using the latest tools etc. where possible, but at some point you'd like to think "this component is stable, no need to touch it again".


Servers can also be a liability. You need to document, implement and maintain hardening, have a process for regularly patching the OS and apps, monitor logs, have backup and disaster recovery procedures, regularly test those procedures, figure out how to implement data encryption at rest, implement high availability and so on.

A good platform-as-a-service can solve many things for you and let you focus on the core thing you are providing.

Obviously not everybody needs to worry so much about the stuff mentioned above. If you are providing a SaaS solution, there’s a good chance some customer will start asking these questions as part of their procurement process.


I agree with you in general that it's best not to solve problems you don't have, but it's worth considering what exactly the alternatives offer, as you might end up trading one kind of complexity or maintenance burden for another.

Fixed workload, shared dev boxes, all kinds of small-scale operations fit very well onto a bare metal / VPS box in someone else's rack. The maintenance work often boils down to setting up cron with unattended upgrades and borgbackup, Docker with Traefik+ACME as the web frontend, etc.

PaaS solutions will often pull you in the direction of getting "addons" or using extra paid services for every component you might need: a relational database, a cache, a queue, a load balancer... Which are usually a very good call if you need to scale up, but which could've all been entries in your docker-compose.yml if that's all you needed.

Most importantly, and regardless of your hosting strategy, you can't just dismiss security, disaster recovery, or capacity planning. Clouds have outages and bottlenecks too, and no product can save you from human error.


> You need to document, implement and maintain hardening, have a process for regularly patching the OS and apps, monitor logs, have backup and disaster recovery procedures, regularly test those procedures, figure out how to implement data encryption at rest, implement high availability and so on.

Hey, that's not exactly a fair comparison. You don't simply get half of those in the cloud either. Log monitoring and disaster recovery are something you have to figure out yourself; the best the clouds offer are some foundations to build upon, and possibly some cookiecutter template that might fit your use case (if you're really lucky it'll even be decent). And you can get the same stuff on traditional servers, just with different pre-baked solutions (which also may or may not fit a particular use case and may vary from perfectly good to quite crappy).

People love to brag about all the features (most not needed for your casual website), but somehow no one mentions that those features just won't be there when you start to use the cloud - because you have to be actively aware that you need them, explicitly enable some, and explicitly spend time learning, setting up and testing others. Unless we're talking about PaaS (and not a "classic" cloud like AWS, GCP or Azure), you still need someone with some sysadmin experience - except that this person must wear a different kind of sweater (with a $Cloud logo rather than Tux or Beastie).

All you really get is some OS hardening plus patching of the managed software (like LB servers and databases). Which is something that's not that hard to do on a self-managed server (well, the software updates part; hardening is a rabbit hole). But not application patching, mind you - maintaining your app is your responsibility; the very best the cloud can do is run a security audit (which you can get as a service separately). And even though managed databases are tuned (there's still a lot of manual tuning to do if you want the engine to truly purr) and maintained, they aren't all the fun and peaches the marketing materials claim - sometimes you just have to, e.g., spin up your own self-hosted PostgreSQL to perform a tricky migration, then replicate it back to the managed solution.


> You need to document, implement and maintain hardening, have a process for regularly patching the OS and apps, monitor logs, have backup and disaster recovery procedures, regularly test those procedures...

There are many hosts out there who will take care of _all_ of that for you. You just need to install your stuff on the box, and they take care of the rest. Logs and monitoring... you'll prolly want something custom anyway rather than whatever anyone else offers.

> figure out how to implement data encryption at rest,

you have to do that anyway

> implement high availability and so on

again, many hosts who provide easy ways to manage that for you.


You'd be surprised. I've worked at many startups and mid-sized companies. You don't actually "need" to do most of those things. I did a consulting job for a company with 100's of millions in revenue that didn't patch its servers for years. One had a 1400 day uptime!


> didn't patch its servers for years. One had a 1400 day uptime

Well, this isn't exactly the best practice to follow - but, yes, this is very common, and it works (until someone finds that there's a vulnerability to exploit). So, from a purely practical perspective - software updates are certainly not a hard requirement; one can do quite well without them (just be aware of the risks).

Anyway, there are many fairly age-tested solutions for updating software on the fly, live kernel patching and live network sockets included.

And a lot of cloud software updates (e.g. managed databases) incur noticeable downtime so you need a failover (and appropriate software design that can survive it). But then you can do the very same thing with two servers, no cloud magic necessary.


Hardening servers isn't as hard as it used to be. Ansible does a pretty good job and there are several examples of using it to harden your server. If you are using a VPS .. you could use packer and start with a fleet of servers based on your custom image.


Oh, I don't disagree, I just meant it can be as complex as you want it to be. Hardening is just a rabbit hole that can go on almost forever - it's never really final until you declare that enough is enough, typically there's always something more one can do, if they have time and capacity.

Also, out-of-the-box distributions these days can be considered already "hardened" compared to what we had 15 years ago.


> Servers can also be a liability. You need to document, implement and maintain hardening, have a process for regularly patching the OS and apps, […], have backup and disaster recovery procedures, regularly test those procedures, […] and so on.

Much of this complexity I eliminate by using the Immutable Server pattern:

Specifically I deploy my app as a Docker container hosted on a virtual machine cluster managed by AWS ElasticBeanstalk. OS upgrades and patches (outside the container) are done by AWS.

If anything goes wrong with a VM I just terminate it and a different fresh VM spins up to take its place.


This is what I was thinking of doing also - switching to Elastic Beanstalk.

What do you do if you need to launch another instance of the app? For staging or testing.


I'd rather worry about (and fix) those technical problems than have to deal with a possible billing-pocalypse.


Also the cost of those services vs. a few well-configured basic VPS hosts is nuts. I was shocked to learn that getting MySQL/Postgres in GCP often costs >$40/month - even for the insanely tiny instance sizes. We're talking like 20G of storage...

In contrast I've run some pretty hot-and-heavy MySQL/Postgres databases on basic DO/Linode VPS's for half of that cost with much success. I get that these cloud tools give a ton of "out of the box" features - but you are paying for it at the end of the day.

Anecdotally, I've noticed a huge shift away from devs/engineers having general CLI/Linux/sysadmin knowledge.


If you had no knowledge of standing up a DB server (installing, configuring, monitoring, logging, access control, backups, and restricting network access), how much time would that take you to learn? On top of that, your VPS requires setup for access control, patches, updates, monitoring, and logging.


Point taken, upvote given, but for a lot of people standing up a standard redundant DB config for MySQL and Postgres isn't that big of a deal.

I say this as an engineer who has a ton of relational DB experience. Before cloud providers with DB options were so plentiful, devs/engineers/sysadmins had to set this up by hand.

---

> If you had no knowledge of standing up a DB server

My point is, "but yea what if you do?" There's actual value in that, I've found, especially when I'm looking to save pennies at the early stage of a startup while in the customer acquisition phase. Many folks have expertise in those areas without needing a cloud provider.

I do get the "but all of these things!" argument (access control, patches, updates, monitoring, logging) but often those are easily solved/solvable problems even without leveraging cloud offerings. I can get very far along with an ELK/TIG stack + uptime robot and basic VPS networking features... all for a very affordable monthly bill on the infra.


Starting a db on a web-server and running apt update/upgrade are fundamental parts of working on the web.

If you or any of your developers don't know how to do that, you/they should seriously seriously spend the 40 minutes it takes to learn.


I feel like that’s where you separate the pre-cloud and post-cloud devs.

It’s not hard to imagine a dev being brought up 5 years ago into a React world, with JS in the backend and something like Vercel or Netlify to deploy their app.

Someone like that might be entirely uncomfortable when dumped into a plain old Linux server, and would definitely take longer than 40 minutes to even get their head wrapped around what’s going on.

That’s not a knock on them - they get their job done, and to the business that’s all that matters. But yeah, just trying to, I don’t even know, empathise? Try to understand? Who knows. It’s just a different world now.


Sympathy is more like it. They've been convinced by the cloud rental services that it's different now and too hard, when it's really not, because all that SaaS is still running on a webserver. There's just a new gatekeeper with a huge marketing budget that charges by the minute now.


> and would definitely take longer than 40 minutes to even get their head wrapped around what’s going on.

If a grown-up person (who is presumably making good money in their React/JS thing) can't figure out how to install a web server in 40 minutes (with all of SO, Google and a bazillion articles on how to do so) then...

Like come on, twenty years ago you still needed to get your ass off the chair and go to the bookstore, find and buy a proper book and actually read it. Now you just need to type "Ubuntu 20 install webserver" into the search and voila.


The same amount of time it would take me to "stand up" a rented DB, including figuring out a bazillion providers and options, configuring proper instance sizes, and turning the needed knobs for backups to work... Or do you suppose a rented service doesn't need anything at all?


Serverless is almost always a trivial cost compared to VPS. With lambda, you get like 1M free invocations per month and the price per ms is `$0.0000000017` for 128MB on ARM.

If your endpoint takes 10ms to run and you're running it constantly for a month, it will cost you less than $0.05 (it will actually be about half of that because you get the first 1M invocations free each month, but ignoring that...).
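For anyone who wants to plug in their own numbers, here is a rough estimator, assuming $0.20 per 1M requests plus $0.0000133334 per GB-second on ARM (the public on-demand rates as far as I know), with the free tier ignored:

    // Rough Lambda cost estimator (ARM, on-demand list prices; free tier ignored).
    const REQUEST_PRICE_PER_MILLION = 0.2;      // USD
    const GB_SECOND_PRICE_ARM = 0.0000133334;   // USD

    function lambdaCostUsd(invocations: number, durationMs: number, memoryMb: number): number {
      const requestCost = (invocations / 1_000_000) * REQUEST_PRICE_PER_MILLION;
      const gbSeconds = invocations * (durationMs / 1000) * (memoryMb / 1024);
      return requestCost + gbSeconds * GB_SECOND_PRICE_ARM;
    }

    // Example: 1M invocations, 10ms each, 128MB -> about $0.22 (mostly the request charge).
    console.log(lambdaCostUsd(1_000_000, 10, 128).toFixed(2));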

As you point out, running RDS is more expensive, but RDS isn't serverless. You would probably want to look at DynamoDB for a serverless database, which is considerably more affordable than RDS.

Moreover, the serverless stuff is considerably easier to configure and operate than a VM.


A 10ms invocation sounds like a myth. I've certainly never seen one that low for a lambda that actually does anything. All the serverless stuff I've worked with (mostly in Python) operates slower than a CGI script. 100's of milliseconds minimum. When your serverless endpoint needs to integrate with other AWS services, the configuration can get complicated fast. VPC endpoints, IAM roles, security groups... it goes on and on.


If your lambda runs very infrequently, you'll see more cold starts which will take hundreds of ms, but warm starts are pretty easily in the single-digit milliseconds (at least for a Go lambda). If you're doing a bunch of compute or sync I/O in a loop then your functions will take longer.

But less frequent invocations are actually an even better case for Lambda versus a VPS because it suggests less wasted time (yeah, you have to pay for cold starts, but with a VPS you're paying for all of that time your service is up irrespective of whether or not it's being used).

> 100's of milliseconds minimum. When your serverless endpoint needs to integrate with other AWS services, the configuration can get complicated fast. VPC endpoints, IAM roles, security groups... it goes on and on.

This is all true for a VPS as well. Assuming you care about reliability, you need to run multiple instances of your VPS behind a load balancer which implies a fair bit of networking. Moreover, you need to manage the hosts themselves, so configuring log aggregation, metrics collection, process management, SSH, deployment, hardening, etc, etc, etc.

Further still, your compute will need to communicate with the comparable services as in your lambda hypothetical, so you still need to deal with IAM and some more networking stuff. Maybe you'd say "gotcha! I would just run my databases on my VPS instances!" which is cool, but now you need to configure monitoring, backups, replication, failover, and so on for your databases versus using DynamoDB (and I would bet a lambda/dynamodb workload costs less than the same workload on a reliable VPS/whatever-database stack).

Of course, if you're running a hobby blog or something that doesn't need reliability or scalability, then a $5 VPS is probably fine (maybe even an S3 bucket behind a CloudFront distribution, which would likely be free).


Yes, cold starts are definitely a problem. One of the advantages of VPSes is that it's all hot, ready to go, obviously. You pay for that performance, for that lower latency, and I'm fine with that.

My big problem with lambda / serverless is the developer experience is pretty awful. The time between making a change, deploying, and seeing the result of that change is slow. You can work around this (with tools like localstack), but it's often not close enough to the real environment. You'll still waste tons of time debugging permissions issues when you do a real deploy.


Yeah, it depends on the application. If you're very latency sensitive, then you'll probably want to keep your lambda warm or just pay more for VPS. But the original context was cost, not latency.

> My big problem with lambda / serverless is the developer experience is pretty awful. The time between making a change, deploying, and seeing the result of that change is slow. You can work around this (with tools like localstack), but it's often not close enough to the real environment. You'll still waste tons of time debugging permissions issues when you do a real deploy.

I haven't had much of a problem. Once I figure out the shape of the inbound payload, it's pretty easy to test locally. I can't think of any reason the runtime environment would be an issue. Debugging IAM is tedious, but I just created one Terraform module that I use for all of my functions so I don't have to slog through the IAM stuff every time I make a function.


I personally have not had a lot of problems with local lambda development. However, when assisting coworkers, I often see problems where dependencies aren't packaged properly (lambda layers), permissions are missing, code is not properly modularized (everything in one big handler file), people editing code live on AWS to debug problems they can't replicate locally, etc.


> Moreover, the serverless stuff is considerably easier to configure and operate than a VM.

Yep - and to be 100% candid, on my current project I'm running 100% with GCP Functions, AppEngine Task Queues, and Firestore for persistence. I'm also AWS certified and have worked as a DevOps engineer in a PCI-regulated environment leveraging things like DynamoDB, Lambda, etc.

Cool part about running that way is my current billing is near zilch. Woo! Totally on-board with what you're saying... literally living it now =)

I'm just saying that there's a ton of ways to skin a cat. And, personally I wouldn't paint with as large of a brush to say "serverless stuff is considerably easier to configure and operate than a VM." A lot of times this rings true, but I anecdotally wouldn't say it is always the case.

I'm being pedantic, but I'm also trying to make a case that not always do cloud design patterns/"best-practices" make sense.


> Serverless is almost always a trivial cost compared to VPS.

Not my experience. Our AWS lambda cost is close to $1K/month and the usage isn't that much so we could easily host this on a few (for redundancy) of the smallish VPS instances for far less money.


For batch workloads (not latency sensitive, can run at or near full capacity, etc.) that can be the case, but most of the time Lambda is more affordable. You'll need to say more about your workload, though.


Take for example OS and app upgrades. Quite often you have a few requirements, for example: 1) all updates should first be tested in a test env, 2) updates need to be installed in a timely manner, 3) critical updates need to be installed quickly (30 days is too slow).

When you start thinking these through, they are not so easy. (1) means you can’t just run “apt upgrade” on every server - you need to manage the updates to make sure they get tested first. (2) is kind of OK, but requires some work on a regular basis (at least checking things). (3) means you need to monitor the updates for your stack and classify them. You can get feeds, for example for Ubuntu, but does that cover the whole stack? And checking these weekly (or daily) actually gets boring.

All this stuff can be done, but IMHO it is time consuming and takes time from more important things. The weekly/monthly JIRA tickets for checking x, y and z get quite annoying when you also have n other things to finish. Then you start slacking on them and feel the pain when trying to collect evidence for the next procurement process check.

If you have tens or hundreds of servers and can have a separate infra team with professional sysadmins then this is all fine. My rant is mainly for small teams, where the same people are supposed to develop and run things.


Small teams should just configure apt unattended upgrades and call it a day. I do this with my personal server and haven't had any issues for years.


You should probably reboot it then.


I'd rather have to plan and budget than have to deal with patching (after testing the patch) every piece of software on a system the moment a patch comes out, to avoid an undetectable rootkit being installed during that gap.


An occasional $4K charge is negligible for a business (almost everyone is better off dealing with this sort of overrun than troubleshooting servers, all else equal), and anyway this is a particularity of AWS CloudFront's billing protections rather than a fundamental flaw with serverless.


In most countries in the world, $4K charge is absolutely not negligible.


I doubt this is true. If you're operating a tech company anywhere in the world, I'm guessing $4K is pretty close to negligible. In any case, indexing by country is misleading; "tech companies" in general is more interesting. If there are a handful of tech companies in the third world for which this doesn't hold, that doesn't invalidate the generalization.


In SMB companies around the world, "an occasional $4K charge" is definitely not negligible.


Again, we’re not talking about SMBs, we’re talking about tech companies generally. (What share of SMBs are there doing serverless or with any cloud footprint at all?)


I ran a website with 37 million users and 3.6Gbps peak bandwidth (JavaScript+thumbnails, no video) from my own two racks of Linux servers that I had not systematically updated for years. The OSes were beyond their LTS support windows. I manually compiled my own updates, but very rarely and only those that I deemed critical. Granted, the site stack was completely custom, so the standard automated hacks didn't work. In 15 years there were no incidents.


I don't think this should be the standard. It sounds like you're saying you ran a service with 37 million people's information on a software stack so old that not even the vendor supports it anymore, one that could be riddled with security issues that you wouldn't even know about, much less be able to detect? It may work, but you're certainly not going to get any security certifications this way...


I hardly stored any PII. Also, this was a high-profile site that you would know: if it was hacked, they would probably try to deny access and extort us; it would make a lot of monetary sense. We would also have lost our merchant accounts very quickly (although CC numbers were not stored, only MD5s, but I suppose they could have been captured from the application's memory after TLS decryption but before MD5 hashing, although this would have been a difficult task even on a rooted server). I am 99.999% confident, from observing and recording loads, traffic, logs and other parameters of all servers, that there were no intrusions. I understand that there's market size and appeal for bureaucracy and certifications, but I have my own data.


MD5(CC#) is trivially reversed, so I hope that’s not what you were doing.


Omg. cft should not be allowed anywhere near credit card numbers.


Do you save logs? If yes, there's your PII.


Thankfully, the servers were not in Europe! I said "hardly". It's genuinely amusing how people downvote facts, simply because they don't fit the zeitgeist.


Nobody mentioned Europe except you. If you are referring to GDPR, it still applies even if you host outside of Europe.

We disagree that webserver logs contain 'hardly any PII'. You may find that PII insignificant, but it does not constitute 'hardly any'. IMO it's a circumstantial goldmine.


In the US at least one court found that IPs were not PII

https://www.huntonprivacyblog.com/2009/07/10/washington-cour...

I am curious who "we" are- a royal plural?


And in EU, it is.

We referred to you and me.


Not to mention that a successful compromise could be used to serve malware to the users.


With unattended-upgrades you might get away with such behavior. Also, some webservers have a great track record. Then it's a question of whether your other services are secure, and your JS/CSS.


Nginx and apache (upgraded from 2.2 to 2.4 once)


> Servers can also be a liability. You need to document, implement and maintain hardening, have a process for regularly patching the OS and apps, monitor logs, have backup and disaster recovery procedures, regularly test those procedures, figure out how to implement data encryption at rest, implement high availability and so on.

That is true.

> A good platform-as-a-service can solve many things for you and let you focus on the core thing you are providing.

I don't see how with platform-as-a-service solutions you can skip documentation, maintenance, log monitoring, disaster recovery, backups, test procedures, or encryption at rest.


you can get servers with managed hosting, still at a fixed rate each month


Reading some replies here, it’s no wonder some startups go bankrupt while investing so much in infrastructure — all while not having enough users to justify more than a single dedicated server.

Scalability should be the last thing on your mind if your product sucks.


My frustration with replies like this one is that they assume everyone here is working on personal projects or at tiny startups trying to get users.

A lot of us work at big companies with lots of users and demand and aren't just making POCs, but every time we bring up scaling or maintenance or monitoring we are told not to worry about that yet.


Why should you be worried about large bills if you are working at a big company?

This whole Tell HN is about personal projects where costs matter, the startup tangent here is just a jab at people underestimating the raw processing power of modern hardware.

Big corps can do what big corps do with their big corp money. 200-user startups don't necessarily have that kind of money, but pretend they will be a big corp very soon and thus need to spend big corp money.


Fair point. If your usecase actually needs to be able to handle massive loads, then things might be different.

But then there will probably need to be a devops team and careful planning of how to architect the infrastructure anyway.

Also in such cases cloud pricing becomes prohibitively expensive unless you get a special deal.


I'm a fan of boring technology too, but I would like to suggest to you that Serverless _is_ kind of boring.

Essentially you just upload a ZIP of your application, and register a handler function that takes a JSON payload.

Obviously this is quite a bit more boring than a K8s cluster, with a bunch of nodes, networking, Helm charts, etc.

I would posit that even compared to something like a DO Droplet, Serverless is still kind of boring. Everything is going to fit into the model of registering a handler function to accept a JSON payload. There's no debate about whether we're going to have Nginx, or what WSGI runtime we're using. It's just a function.
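To illustrate how small the surface area is, here is a complete function; the event shape assumes an API Gateway / function URL style trigger:

    // A complete Lambda "application": one exported function taking a JSON event.
    import type { APIGatewayProxyEventV2, APIGatewayProxyResultV2 } from "aws-lambda";

    export const handler = async (
      event: APIGatewayProxyEventV2
    ): Promise<APIGatewayProxyResultV2> => {
      const name = event.queryStringParameters?.name ?? "world";
      return {
        statusCode: 200,
        headers: { "content-type": "application/json" },
        body: JSON.stringify({ message: `hello ${name}` }),
      };
    };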

And with Serverless, your cost for doing a couple million, 2-second-long requests is about four cents.


> I would like to suggest to you that Serverless _is_ kind of boring

It's not my kind of boring, although I sort of see what you're saying.

It presents a simple facade, but it's built on complex infrastructure you have no way to have any visibility into. So when things go wrong it's a nightmare.

To me boring tech means super simple, KISS to the extreme. Something I can diagnose fully when necessary without any layers of complexity (let alone proprietary complexity I can't access) standing in the way.


The challenge with serverless is building systems that rely on more complex backend processes and existing code, and doing things like testing using most of an existing codebase. Serverless is great for nodejs/javascript stacks that are database and front-end heavy and don't need more complexity like queueing, event streaming, or more complex architectures. Beyond that, serverless normally becomes a huge mess and the developer experience becomes a giant catastrophe as well.

Here the OP kind of got caught by the terrible DX that is almost natural to serverless IMO.


I'd argue serverless handles those cases even better. SQS for queueing and Kinesis for streaming have you pretty much covered. Much easier to manage those than setting up a fleet of workers that are all long polling for messages and managing heartbeat signals or routing to a DLQ manually.
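For comparison, the consumer side of an SQS-triggered function looks roughly like the sketch below; the long polling, retries and DLQ routing live in the event source mapping configuration rather than in code (processJob is a stand-in for whatever the worker does):

    // Sketch of an SQS-triggered Lambda. Failed message IDs are reported back
    // so only those get retried and, eventually, routed to the DLQ
    // (requires ReportBatchItemFailures on the event source mapping).
    import type { SQSEvent, SQSBatchResponse } from "aws-lambda";

    async function processJob(job: unknown): Promise<void> {
      // stand-in for the actual work
    }

    export const handler = async (event: SQSEvent): Promise<SQSBatchResponse> => {
      const failures: { itemIdentifier: string }[] = [];
      for (const record of event.Records) {
        try {
          await processJob(JSON.parse(record.body));
        } catch {
          failures.push({ itemIdentifier: record.messageId });
        }
      }
      return { batchItemFailures: failures };
    };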


DX? Does that mean Developer Experience?


> your cost for doing a couple million, 2-second-long requests is about four cents.

This seems wrong to me? Can you explain a bit more? Are these just API requests, or?


It's admittedly a simplification and a best case. For AWS Lambda, the price is in GB-seconds, and the amount of CPU available to your function is itself a function of the memory allocated.

The price is $0.0000166667 for every GB-second on X86. So it looks like I've also misplaced a decimal. It's $0.20 per 1M requests with 1GB of memory.

Lambdas can be sized from 128MB to 10GB, and pricing depends on the resources allotted.

Ultimately, this pricing just represents the compute time for a function to complete. That function can do whatever it wants.

There are additional costs to put an API Gateway in front of it, for instance, to make it a publicly accessible API.


Lambda function endpoints[1] are a replacement for some use cases.

[1] https://aws.amazon.com/blogs/aws/announcing-aws-lambda-funct...
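If I remember the API correctly, wiring one up is a couple of SDK calls; the function name, region and CORS settings below are placeholders:

    // Sketch: give an existing function a public HTTPS endpoint (no API Gateway).
    import {
      LambdaClient,
      CreateFunctionUrlConfigCommand,
      AddPermissionCommand,
    } from "@aws-sdk/client-lambda";

    const lambda = new LambdaClient({ region: "us-east-1" });

    const { FunctionUrl } = await lambda.send(
      new CreateFunctionUrlConfigCommand({
        FunctionName: "my-function",      // placeholder
        AuthType: "NONE",                 // public; use AWS_IAM to require signed requests
        Cors: { AllowOrigins: ["*"], AllowMethods: ["GET", "POST"] },
      })
    );

    // AuthType NONE also needs a resource policy that allows public invocation.
    await lambda.send(
      new AddPermissionCommand({
        FunctionName: "my-function",
        StatementId: "public-url",
        Action: "lambda:InvokeFunctionUrl",
        Principal: "*",
        FunctionUrlAuthType: "NONE",
      })
    );

    console.log(FunctionUrl);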


I'm not day to day on cloud stuff, haven't been in a while, that's why I'm asking this, not intended to be passive-aggressive:

So if I had a "hello world" function, that basically just returned a constant JSON payload...I'm seriously looking at a pittance, pennies for millions of requests? I.e. $0.20?


No. You would also have to pay for memory usage and bandwidth.

Millions of requests for a static file is easily done on a dedicated server in a few seconds if the file is small. $0.2 per request is very expensive.

To put things into perspective: If you buy a very cheap new server for $1K, then you would be able to handle small static requests worth thousands of dollars each day according to AWS pricing. That's a nice return on investment!


yes.


It's not $0.20 per 1M 1-second requests with 1GB of memory. It's $16.87, and $33.53 for the 2 seconds you mentioned in your parent post.

The $0.20 per 1M is just the request charge, without the memory cost.


Also S3 costs for storing the Lambda function code, dependencies and that function's past versions.


Exactly! If you make your app compatible with serverless by following some restrictions, deployment is pretty boring.

In my opinion, it is easier to switch from serverless pattern to VM instances, and much more difficult to switch from VM to serverless without a major rewrite.


I literally benchmarked an application I've written at 480 requests per second on a €34 dedicated server. That comes out to 1,244,160,000 requests per month. If I ever reach a level anywhere near that, renting a few more servers to provide reliability and failover would again be reasonably cheap.

This all comes while keeping the data on servers belonging to EU companies. Using cloud servers is rather dodgy, as shown by the EU rulings that Google Analytics, Google Fonts, etc. are illegal.

Everyone keeps talking about the ease of just getting a new server and not having to spend time repairing servers. This is an oversold problem. How often did we deal with hardware problems 8-10 years ago? I only remember one server going down because of hardware issues that actually caused problems; all the other times there were failover servers, and the hardware issues were fixed by datacentre staff before anyone even noticed there was an issue in the first place. With tools like Ansible and Chef, you can provision a new server super quickly, so even if a server gets a corrupt config or whatever, you can just wipe the server and rebuild.

Most of us are not at the scale where we would realistically see the benefits of cloud computing for hosting.


> To be honest I don't really understand the sentiment that developers can get away with not knowing basic sysadmin stuff and at the same time have to spend significant amounts of time, energy and money to get up to speed with cloud solutions, k8s and so on.

There was a thread here the other day on how DevOps has failed. And this should hopefully show everyone why DevOps is needed. Cloud infrastructure is complex and needs specialists, otherwise simple mistakes can be very costly. If I make a new lambda, you better believe I'm watching invocations.


Or does it? The whole point of GP is that you don't need the complexity that would require devops staff to manage if you are not serving > 10s of millions of reqs in huge bursts with long stretches of dead time between.

I don't know, I'm curious about the actual requirements vs the marketing buy-in and resume building that may happen when designing the system.


Amen.

I've consulted for companies that went all in on AWS Lambda + AWS SQS, only to transition them to an EC2 instance that performs the same computations at a fraction of the cost.

No — you do not need K8 on day one.


> EC2 instance that performs the same computations at a fraction of the cost

And typically with much lower latency/request times.

GCP functions, AWS Lambdas, etc... my anecdotal experience is that they are way slower for request times vs a $20 VPS running the exact same workload.


But managed K8s is so easy to set up and makes life and the development cycle so much easier that you're basically robbing yourself if you don't.

I assume you just dropped a shit ton of features on your move to that EC2 instance, unless of course you wasted more time than it would take to get everything working better on Lambda/SQS.


The actual compute is decoupled from the CI/CD pipeline. Moving to an EC2 instance does not preclude us from leveraging Lambda and other technologies.


> easily handle millions of requests per day

Yes. People don't realize how little that is. A million requests per day is about 12 requests per second, on average. Even if we factor in peak load at 20x that, it is still well within reach of exceptionally modest hardware, assuming an at least vaguely sane software stack.


Yeah, but how will you scale up your multi-dozen-users-at-the-same-time app easily then?


Scaling is a hard problem. I've worked at places where we used a whole network of auto-scalable services and guess what - you will still have problems. Each managed service has tradeoffs, often ones that you only encounter once you've made a substantial commitment to that service. There is no free lunch and you're fooling yourself if you think there is one.

Often you get lucky and the cluster of managed services you select happen to scale along the metrics of your resource use. Many people's resource use patterns are similar and the managed service people take advantage of that. This is nice! But it's a trade for downside risk: you may find that your resource use patterns differ from the 90% case and your spend goes up very fast or your scaling hits a wall.

In my experience a lot of designing a service architecture is picking where you want your complexity. Services (in or out of containers) running on VMs have a simple billing and architectural model. In my experience they form a good basis to organize your other resources around and are a good foundation to grow from.


I think this is a joke people missed. Surely the multi-dozen-users-at-the-same-time bit was obvious enough that people wouldn't start talking about scaling seriously - but that's what happened.

If your app has a few dozen users and needs to "scale", something is wrong.


Yeah, my joke didn't scale well. We did well with servers (sans cloud) and complex apps before... something changed in the collective mindset in the meantime.


How do you scale your cloud app?

Usually, the hard part of scaling isn't raw compute power, it's scaling your datastore after you've already exhausted the option of throwing more compute power at it, and this problem remains whether you're on the cloud or not.

Until you hit that problem however, throwing more hardware at it is the right solution (you may be surprised just how much load a single Postgres server on bare-metal can handle).


I agree with this in general, but would caveat that AWS etc. have made throwing more hardware at the datastore solution a lot easier. You're right that a bare-metal Postgres monster can serve a lot--but Aurora Serverless V2, if you can live with its (pretty mild IMO) quirks and if you can pay for it, is a profoundly hard-to-argue-with offering.


Buy more servers, loadbalance, and don't take the short painful route architecture-wise?

It isn't really that hard.


Can't wait to end my next system design with "just buy more servers. scaling isn't really that hard."


I've done exactly this in many meetings. It's a balance between infra costs and dev time optimizing spend. Typically ping pongs back and forth.


There is a point where that becomes prohibitively expensive.

At a former startup where I worked as a senior engineer, that was our original approach. Then one weekend we tripled our userbase, and scaling horizontally required massive changes in database architecture, sharding solutions, etc.

"Just buy more servers and load balance" is the short painful route. Carefully planning out and taking advantage of scalable architecture that can be provided to you less expensively because it too, runs at scale is the hard method. The fact that its easy to shoot yourself in the foot with it doesn't make it the easy route.


>Then one weekend we tripled our userbase and horizontally scaling required massive changes in database architecture, sharding solutions, etc.

You took the short easy route out of the gate instead of building for what you were planning to handle. This is the opposite of what I espouse. If you were going about it my way, you would build that in from the ground up. It's harder and a bit bumpier to roll out from the get-go, but it has worked well for me in the past.


If you want to spend capital on an unpredictable eventuality that may never come to pass, over taking advantage of another company's investment to scale cost... sure, that's definitely the hard route.

I wouldn't call it "good" or "intelligent", but it's definitely hard.


This. The company I work for uses AWS, and that makes sense for the level they scale at, and I'm sure they worked out a deal with Amazon.

I have a few dedicated servers with a monthly cost of ~$100 and have never run into any problems. For a brief period I needed to pay Cloudflare for some DNS management that was getting a bit heavy, but even that was because I didn't know enough about how to optimize.

For any personal projects / POCs, and even startups launching to less than 100k daily users (traffic and load type dependent ofc), hold off on AWS.

I also don't like the Lambda architecture in AWS, as it seems to lock you into it; in a sense, you're almost tied to that infrastructure for good. Yes, you can rework it, but that's tech debt that it may never be possible to pay off.

Love AWS but only as an enterprise solution.


You're going to understand where the cool kids come from once your single dedicated server goes down or can't handle the load any more. As soon as you try to scale horizontally or become highly available and start to think about how to do it you end up falling into the same rabbit hole.

> If you also use containers you even get quite a bit of flexibility and agility.

Yeah... and then the only difference is between a single host and multiple hosts. Guess which one Kubernetes is for?

With all due respect and no offense intended, your perspective sounds a lot like "I've never attempted to scale so I can't understand the problems"

> can get away with not knowing basic sysadmin stuff

I promise you that the knowledge you need for your single dedicated server is also needed for k8s clusters, and I definitely can't imagine anyone who can maintain a k8s cluster but can't maintain a single linux host. It's more that scaling horizontally comes with exponentially more difficult problems than what you're used to or have heard of.

Obviously the billing is an issue but that doesn't negate the whole concept. They should definitely implement better expense controls. A simple hard cap to activate if you want would fix this for good.


For 99% of projects you're never going to hit the point a single server (or group of servers if you truly need redundancy for some level of uptime) can't handle the load. For the other 1% that end up needing that scale I have a hard time accepting it's actually better to start building for massive scale day 1 instead of day 1000.


This is such a narrow perspective. It’s not always about load; most applications can benefit from some level of redundancy, and using AWS etc. doesn’t equal a massive scaling operation forced upon you. In fact, the other options would. I'll give you some common examples.

You have a script that needs to run, without fail, at a certain time every day.

You have an endpoint that is mission-critical and even a second of downtime would cause insane manual workload.

You have an endpoint that needs to have as little end-user latency as physically possible.

These are all real examples from my work in fintech. The solution to all of them is a few dollars in cloud functions in Lambda and Lambda@Edge. Now imagine having to provision, orchestrate and maintain dozens of dedicated hosts all by yourself, simply for this handful of scripts. The expense of employee time alone would make it absolutely idiotic to go that route, and if anything this would equal massive scale from day one - but even worse, because you didn't even plan for it beforehand.
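The daily-script case, for instance, comes down to roughly two API calls and zero hosts to maintain; rule name, schedule and ARNs below are placeholders:

    // Sketch: run a Lambda every day at 06:00 UTC via an EventBridge rule.
    import {
      EventBridgeClient,
      PutRuleCommand,
      PutTargetsCommand,
    } from "@aws-sdk/client-eventbridge";

    const events = new EventBridgeClient({ region: "us-east-1" });

    await events.send(
      new PutRuleCommand({
        Name: "daily-settlement-job",                  // placeholder
        ScheduleExpression: "cron(0 6 * * ? *)",       // 06:00 UTC every day
      })
    );

    await events.send(
      new PutTargetsCommand({
        Rule: "daily-settlement-job",
        Targets: [
          {
            Id: "settlement-lambda",
            Arn: "arn:aws:lambda:us-east-1:123456789012:function:settlement", // placeholder
          },
        ],
      })
    );

    // The function also needs a lambda:AddPermission grant for events.amazonaws.com.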


I work in fintech (banking and payments) and most of our clients aren’t even allowed to host on aws; they have to go for a local provider because aws doesn’t have a hosting hub in their country. Not sure what part you work in but this has never been a problem in the past 20 years with just servers, switches, load balancers etc.

I prefer aws over metal for these kinds of setups, but for many other cases I definitely do not; just a dedi or a vps with docker or k8s and/or something like openfaas is enough for almost all startups and beyond, making it literally impossible to make mistakes like OP's. And when needed, maybe failover or load balancing.


I don’t know about that, the hedge fund I work for is US-based and AWS can be fully SEC and FINRA compliant.

We also have a few dedicated servers, but mostly only for infrequently accessed data and logging that doesn’t need to be highly available.

I really can’t understand why this argument keeps coming up. Different solutions for different usecases. Yet anytime Kubernetes or cloud functions are discussed people come in and go like “hurr durr my single Hetzner dedicated server can do all of that and doesn’t have these problems”


Because some people here do exactly the same thing with aws/cloud. Not you, but many here treat aws like it’s the thing you should use ‘because scaling and failover and omg 1 sec downtime’. And then there are stories like OP's showing that it is dangerous, and my experience is that most who do this are overpaying and could do with a hetzner server or, better even, a $5/mo vps, even if they do it right (which they generally are not, judging from all the setups I have seen).

I agree with you though on the case by case.


Obviously the billing mechanism (or, lack of control) is an issue but that doesn't negate the whole concept. They could (and should) easily fix this by implementing hard caps to activate if so required.


Nobody says there isn't any use case for AWS. The point is that "the cool kids" like to start their side projects on AWS. Nothing about it is mission critical. I guarantee you op is not working on some Fintech stuff.


How's that bad? Isn't this how we all learned? By playing with cool modern technology?


I feel like learning how to configure bird, bind, haproxy, rdbms clusters, redundant MXs, etc. has taught me far more than spinning up equivalent virtual infrastructure on a public cloud would have. Getting to play with the underlying FOSS technologies is far more rewarding to me than using the commercialized versions on someone else's stack. Plus it helps me evaluate and architect my products for competing clouds, rather than my "knowledge and understanding" being tied to one particular vendor's offerings & lexicon. (Another advantage of knowing the underpinnings of a cloud is that it makes reading post-mortems[1] with a morning coffee so much more enjoyable.)

Assuming we're talking about side-projects/hobbyist development: when I'm doing that I want as few variable expenses as possible, and they usually don't require the purported benefits of the cloud. If such a project needs to scale: I'll bolt it onto a cloud at that point, or sprinkle in specific services to address the pain points.

[1]: https://tritondatacenter.com/blog/manta-postmortem-7-27-2015


> I feel like learning how to configure bird, bind, haproxy, rdbms clusters, redundant MXs, etc. has taught me far more than spinning up equivalent virtual infrastructure on a public cloud would have.

I agree with this, but the choice of what other individuals play with is not ours to make. There are still countless youngsters working on bare metal; for whatever it's worth, they just grow up in a world with a lot of existing abstractions now.

> Assuming we're talking about side-projects/hobbyist development: when I'm doing that I want as few variable expenses as possible, and they usually don't require the purported benefits of the cloud. If such a project needs to scale: I'll bolt it onto a cloud at that point, or sprinkle in specific services to address the pain points.

100%, but the choice is free for anyone themselves to make. OP is furthering his knowledge and experience in a very specialised and highly sought after sector. Maybe he'll come to the same conclusion at some point, or maybe he comes from bare metal and has different perspectives now and thinks that redundancy is always awesome.


Pretty easy, OP sounds exactly like a guy who would have been perfectly fine with a single (Hetzner) server. Then these arguments pop up, and they are right most of the time. I agree with you that it always depends on the use case. However, the hurr durr Hetzner server seems to be the more reasonable choice here (once again).


OP here.

Hetzner provides incredibly beefy machines for a good price.

I've managed a 2-machine cluster setup in the past using Hetzner and it worked fine.

Unfortunately, I had an incident with one machine going bad unexpectedly (RAM issue) and I was forced to spend a whole day setting up a new one.


Why? If you had the choice between complete redundancy and infinite scaling by default while having almost zero work, or using a dedicated server that you need to configure and constantly maintain, what would you choose?


If with one, a single mistake can result in a $5k bill, and with the other I have a guaranteed fixed bill of 50 bucks a month, then unless I have enough money to burn, the choice is crystal clear. Maybe you wipe your butt with $100 bills. Others aren't that lucky.


Exactly; if it is ‘the same’ but with a guarantee of max 50$/mo then sure, I would pick that, but that is not the case, also not with caps. So it is an entirely different case and usually the 50$ option actually brings you very far without any financial risk outside that $50/mo.


The choice is only crystal clear until AWS etc. implement hard caps to activate if needed. Then having some Lambda functions is not only exponentially cheaper than a whole dedicated server, and by default highly available and redundant, but also just as safe financially.


> You have an endpoint that needs to have as little end-user latency as physically possible.

If you need the lowest possible latency then you should definitely do it yourself. Lambda, even with provisioned concurrency, has very high latency...


If the solution you're building is business-critical then you already have a problem on day one: you need high availability. That means you need at least two servers. As soon as you add that additional server, your troubles have begun. It doesn't really matter much whether you add 1 server to your setup or 100.

Your problems compound if you have to consider disaster recovery (DR) and need an offsite location you can failover to. Now you have to contend with data replication and all its concerns.

So even if your application only has a few dozen users things can get quite complex quite fast. AWS makes it simple, practically push-button. Deploying in multiple Availability Zones (AZ) is a no-brainer, and it isn't horribly difficult to deploy in multiple regions, complete with data replication. All while having no servers to configure and maintain. No containers to contend with, no Kubernetes to fight with. To me it's the most stupid-simple solution available today and it's ridiculously cheap! At least so long as you test before you deploy for world exposure!


I completely agree with you. I really suspect some people have never seen for themselves how insanely, mind-bogglingly powerful AWS is.

Try doing this [0] with your dedicated servers in under 10 minutes.

And then try attaching highly-available scripts/cloud functions and dozens of different integrated functionality in seconds.

And then try setting up good Network ACL, firewalls, route tables, NAT gateways, load balancers with a few clicks.

... and people here are seriously suggesting that instead using AWS would equal a massive forced scaling operation. Lol.

[0]: https://i.ibb.co/TKmB9HX/image.png


Sure, but not all of us are working at tiny startups or on personal projects. Many of us here work for large companies that have real scale already and are trying to solve those problems.

Why does everyone assume everyone else works at tiny startups?


At that point you're already at day 1000 and already know what scale you need for the project, so none of this conversation is relevant. You've already hit the point where you need to build it a certain way.

I've never worked at a startup, until last year I never worked at a company with less than 100k employees.


> With all due respect and no offense intended, your perspective sounds a lot like "I've never attempted to scale so I can't understand the problems"

With all due respect, I don't think you've ever actually put together a local cluster.

A simple 4 machine k8s cluster sitting literally on dirt in my basement can scale out to the equivalent of thousands of dollars of AWS spend a month. I broke even on the initial purchase outlay for my workloads in less than a year.

The problems are almost never scaling the web servers. The problem is scaling the infrastructure that those servers need to be useful.

Generally - your DB is the first pain point, your network is the next.

If you can run it in a container, that service probably isn't the bottleneck for scaling, it's going to be whatever is providing the persistent disk for that service, and the network between the two.

Both of those things happen to also be fairly expensive to scale in the cloud as well.


I have a k8s cluster in my basement (concrete floor, not dirt.) It's fine for dev work, but unsuitable for production. The real problem is redundant, reliable internet: cable internet upstream bandwidth sucks after 20 - 40 megabits/sec, and can't get fiber here yet.


This is fair.

I've used both comcast and google fiber while running my basement cluster. I think on google fiber I would have said production usage would be fine - I would pretty reliably get 1gbs down and ~800mbps up.

I wouldn't have redundant net, but the upload speeds are fine.

On Comcast... Comcast is really kind of a joke. Even on expensive business plans they cap upstream at 35mbps in my area (down is 1gbps), and the fastest way to get throttled is to actually try using that upstream capacity.

Comcast does give me redundant failover to 5g by default now, which I guess is nice, but it's not worth much when I can't send a ton of data over the line anyways.

So point taken - if you're going to have to rent a room in a major metro for fiber access anyways, you probably want to be paying for cloud services (sadly - you'll pay a lot, since outbound traffic costs are high, and if 30mbps upstream isn't doing it for you, you're gonna blow through the free 100gb fast)


> With all due respect, I don't think you've ever actually put together a local cluster.

I have, and still do. It's neither redundant nor highly available. The power source isn't, the internet connection isn't and it's also located in my basement and not multiple regions.

> The problems are almost never scaling the web servers. The problem is scaling the infrastructure that those servers need to be useful.

There you go..

> Both of those things happen to also be fairly expensive to scale in the cloud as well.

No. [0]

[0]: https://i.ibb.co/TKmB9HX/image.png


With regards to your image... so what? Virtual networking equipment is virtual - I too can spin up hundreds of subnets for nothing.

My internal transfer costs are also.... drumroll... zero. (ok - technically, at some point I bought a 10gbs switch and 500ft of ethernet cable)

Lets talk about what it costs you put data back out onto the net, or into a different region. Then lets compare long term storage costs.

Because that shit is only cheap in the "hobbyist" range. Outside of that, outbound data and storage is fucking expensive in the cloud.


> I too can spin up hundreds of subnets for nothing.

Really? In completely different physical locations with inter-region routing and NAT gateways for each region that route into a redundant load balancer? Then you're right and don't need AWS any more. However, I'd suspect your costs would not be very affordable either.

> Lets talk about what it costs you put data back out onto the net, or into a different region.

If you're seriously trying to compare your residential internet connection with something like this I'm not going to go there..

> Then lets compare long term storage costs.

Same logic applies.


You do realize we're agreeing?

I'm saying the things that are hard to do yourself aren't fucking cheap in the cloud either.

My counter point (and the one you keep dodging) is that the things that are EASY to do yourself aren't fucking cheap in the cloud, and they scale out just fine for the vast majority of non-Saas businesses.


I realize, but I still disagree with one of your points:

> My counter point (and the one you keep dodging) is that the things that are EASY to do yourself aren't fucking cheap in the cloud, and they scale out just fine for the vast majority of non-Saas businesses.

And I don't even disagree with it on its inherent correctness; I disagree because nobody ever said you absolutely have to use AWS for something that doesn't need it. Most users here tend to discard cloud platforms for even the most obvious and suitable use-cases, and keep on iterating how their single dedicated server can do all of it. I'm also not a big fan of telling someone they don't need to use AWS if it's clearly a side hobby project because that is how we all learned.. by playing with cool modern technology for fun.

The real issue is super simple.. AWS just needs a hard cap functionality so stuff like this can't happen.


> I'm also not a big fan of telling someone they don't need to use AWS if it's clearly a side hobby project because that is how we all learned.. by playing with cool modern technology for fun.

I guess I don't really see the appeal of putting a hobby project into a space where a simple mistake can literally cost you thousands of dollars, with very little recourse.

I get to play with the vast majority of the fun tech - I just choose to do it on a k8s cluster I own, with extremely predictable costs, and frankly - more scaling capacity than I think most businesses need.

> Most users here tend to discard cloud platforms for even the most obvious and suitable use-cases, and keep on iterating how their single dedicated server can do all of it.

And this isn't really my stance - I'm absolutely not advocating for a single dedicated server (fuck - even my hobby instance is HA, 3 machines with a network lb, on a 5 hour UPS, with a redundant network. I personally don't care as much about the redundant network, but comcast provides it free to business customers in my area anyways through 5g equipment).

But I do think most (and I mean 75%+) businesses can spend 10k on a single server rack and go for literal years without problems.

If your business is not selling software as a service - don't buy into the bullshit marketing.

Even if your business is selling software as a service, think about it first. Small businesses tend to get sweetheart deals on costs in AWS, GCP, and Azure because their usage costs for those companies amortizes out to roughly zero, and because they know damn well that they'll end up paying through the nose if they make it to mid-size (or even just screw up once or twice - two fuck ups like the one here and you'd have bought yourself a decent server rack!)


> It's neither redundant nor highly available. The power source isn't, the internet connection isn't and it's also located in my basement and not multiple regions.

For a lot of projects this doesn’t matter. The cost of downtime might be lower than the cost of high availability and occasional downtime is OK.


I'm not trying to dismiss the value there is in having highly scalable cloud architectures available to the projects that need them, but a huge proportion of projects will never need them.

Most projects don't even actually suffer for modest downtime, although you can achieve AWS-comparable downtime even without AWS/Google/Azure-style "cloud" architecture.

It can make a huge difference in operating and engineering costs to know which bucket your project fits in. Good architectural foresight can even let you smoothly move from one to the other if unexpected growth or a pivot indicates that it's warranted.

Engineering is about understanding the scope/scale of problems, not about being dogmatic or getting caught up in problems that don't apply.


I was thinking more of when the unpatched Ubuntu 12.04 is exploited via an SSH 0day and the server is used to host a Citibank phishing website.


Most people don't need to scale and even running 5x redundant servers is cheaper than a comparable cloud solution.


Running Lambda, I get a million calls per month for free. Then it's 20 cents per million calls.

Just curious - have you really researched cloud solutions or did you just compare the price of hosting EC2 instances in AWS vs having your own server? Because that's not what cloud is about.


> Running Lambda, I get a million calls per month for free. Then it's 20 cents per million calls.

And you are overpaying by a significant chunk. My raspberry pi - the old cheap one - can handle 10 million requests a day without breaking a sweat. If I push it, I can get up around 90 million requests a day without too much effort (it's only about 1 request/ms)
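
To put rough numbers on that, using only the figures already in this thread ($0.20 per million Lambda requests after the free tier, and ~90 million requests/day from the Pi when pushed), a back-of-envelope sketch:

    // Back-of-envelope request fees only, ignoring GB-second duration charges
    // and the 1M-request free tier. Figures come from the comments above.
    const requestsPerDay = 90_000_000;   // the "pushed" Raspberry Pi number
    const pricePerMillion = 0.20;        // USD per million Lambda requests
    const monthlyRequests = requestsPerDay * 30;
    const monthlyRequestFees = (monthlyRequests / 1_000_000) * pricePerMillion;
    console.log(monthlyRequestFees);     // ~540 USD/month in request fees alone

And duration charges would come on top of that, so the real gap is even wider.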

I really don't think most devs understand how fucking cheap hardware that's comparable to these services is.

Now - you might be using a host of other valuable features that your cloud provider gives you (things like edge servers near your customers, or truly significant outbound network traffic flows, or a very robust multi-region setup, or disk backing of some sort, etc).

But generally speaking - you ARE overpaying for cpu cycles in the cloud. It's not really up for debate.


Of course you are paying more for AWS than for a damn raspberry pi.


You say this like it's obvious that AWS is better.

There are certainly cases where AWS can be better (ease of edge networks and multi-region availability come to mind)

But outside of a very small set of cases (most of which are when companies are victims of their own success - which is actually a wonderful problem to have) what's the compelling reason to actually use AWS if "Of course I am paying more" for it?

In my opinion, if you can still run your db on a single machine - you don't fucking need the cloud yet. That covers a pretty large chunk of businesses.

Most of the "cloud" is convincing business that could self-host their entire stack on a raspberry pi in a basement that they should be spending thousands on cloud compute costs a year.

Fuck - the most damning evidence is simply how much fucking money these companies are making from upselling you cpu cycles that they're getting mostly for free. For the 4th quarter of 2021, amazon reported a profit of 5.2 billion on ~18 billion in revenue from AWS.


So I'm actually in the process of doing Lambda versus persistent costing right now for a new project, where it's heavily load-based and very spiky, but the work on each packet of information is actually very lightweight. The tricky part here in AWS is not Lambda, which is pretty reasonable in general--the pitfalls I'm seeing are around data storage. DynamoDB is stealthily very expensive, either provisioned or on-demand. Right now I'm converging on lambdas for compute and RDS/S3 for storage, but what really makes AWS shine for this use case more than anything else is SQS.

SQS is so good, and the fact that there isn't a great, easily deployed option (sorry, RabbitMQ) for exactly what it does is a real push towards AWS. (GCP Cloud PubSub is close enough, but I know AWS way better than I do GCP.) If I didn't need this, and I felt confident managing RabbitMQ as a queueing solution, I don't think AWS would be a compelling solution, because $60 in Hetzner nodes, in a HA configuration, could do a lot of work.
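
For anyone wondering what the Lambda side of that looks like, here's a minimal sketch of an SQS-triggered handler (processPacket is a stand-in for the real per-message work, and the partial-batch response only takes effect if ReportBatchItemFailures is enabled on the event source mapping):

    import type { SQSEvent, SQSBatchResponse } from "aws-lambda";

    // SQS delivers batches of messages; returning batchItemFailures retries
    // only the records that failed instead of the whole batch.
    export const handler = async (event: SQSEvent): Promise<SQSBatchResponse> => {
      const batchItemFailures: { itemIdentifier: string }[] = [];
      for (const record of event.Records) {
        try {
          const packet = JSON.parse(record.body); // lightweight work per packet
          await processPacket(packet);
        } catch {
          batchItemFailures.push({ itemIdentifier: record.messageId });
        }
      }
      return { batchItemFailures };
    };

    async function processPacket(packet: unknown): Promise<void> {
      // placeholder for the actual lightweight per-message work
    }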

(If this thing works I'll need an on-prem solution anyway, so I'll probably have to build that too--but that's "good problems to have".)


DynamoDB is not required to use cloud functions. You can use regular RDS like you said, Aurora, or even your own EC2-based cluster (all of which you can attach to the same VPC the functions are attached to), and there are a lot of nice developments going on, like Cloudflare D1.

Totally agree on SQS, knowing PubSub as well I'd say they're pretty much on the same level. All the interconnectedness is where cloud platforms shine


To clarify, I mentioned DynamoDB because it has a unit-costs model rather than an amortized-over-time cost model, and when you can harness that (and then focus on a simplified COGS model) you can do very well.


NATS and NSQ are better than SQS if you can deploy them yourself. ZeroMQ is a great option as well, but it probably will require some major brain changes, since it doesn't fit most people's mental model of what a queuing server looks like.


ZeroMQ is too weird for me, yeah. NATS JetStream and NSQ are both promising but their durability options are unclear and while I have a background in devops/system architecture, I'm looking at building something whose entire staff is literally only me and so while I'm sure I can deploy it, I'm not sure I can effectively run it. SQS will do what I expect it to do, and that's pretty powerful.


disclosure: I work at Synadia

We do have a hosted version of NATS called NGS that is multi-cloud, multi-geo and is really easy to set up https://synadia.com/ngs


Your pricing is really very fair, especially since it's based on resource consumption and not per-request. One suggestion: put your pricing right on the home page. I'm starting to wonder why I am running it myself!

And, you probably know a little bit about NATS, too, since, you know, you wrote it! :)

https://nats.io/support/


I think NATS is excellent, and very easy to run and deploy. It pretty much has the "just works" attribute I like to assign to boring technology.


And? I don't think you are aware of how cheap metal is these days.

Plus the OP has already decided that free tiers don't work.


If you're putting your service behind CloudFront, you're probably aiming for DDOS protection and low latency or something that you can't easily get from a VPS. Of course, you can put your VPS behind CloudFront, but you would run into the same issue (albeit the fixed capacity of your server would effectively cap your bill).

A single VPS is great if you don't need reliability or scale, but if you care about those things you probably need at least a few of them running behind load balancers, and you probably don't want humans manually poking at them (but rather you want some reproducibility). Moreover, there's a bunch of host management stuff you're taking upon yourself--configuring metrics collection and log collection and SSH and certs and stateful backups/replication/failover and application deployment and process management and a whole bunch of other things that come for free with serverless.

Implying that serverless/Kubernetes/whatever is just hype ("cool kids") while ignoring the pretty significant discrepancies in requirements is pretty silly.


CloudFront is terrible for DDoS protection though, given that there is per-request pricing (and no, AWS WAF doesn't help, as it just imposes additional pricing on top; the only thing it does is prevent CloudFront from processing those requests).

However, if your statement is generally about CDNs such as Cloudflare or Bunny.net that don't have per request pricing, it makes sense.


E.g. Hetzner will give you DDoS protection for free with every dedicated server you host with them. If you need more security you can also put another provider in front of it.

CloudFront on the other hand will bleed you dry with their pricing.


With almost any service like hosting, where there could potentially be unknown costs, I always select services with fixed fees and pay annually via PayPal so no card info is stored.

If you have to pay by card and you have to subscribe, always use a disposable virtual card. Most of the time these companies won't pursue you for payment because it's simply not worth the effort. The most likely outcome is that they'll just suspend your account. What you don't want happening is them taking money from you before you're even aware of what is happening, because that way you won't get it back and there's no room for negotiation.


Another not-cool kid here. I've been running my little SaaS for 12 years on fixed-price cloud VMs, €18 total per month (initially shared hosting). I'm sure I will never hit my provider's 20TB monthly traffic limit.

I settled with running lxc containers. I just couldn't bring myself to be bothered about docker. Before I left my last job we started using k8s. I wouldn't use k8s unless in an enterprise environment. Even then I would consider the need. "But Mom, all the cool kids have k8s!"


> To be honest I don't really understand the sentiment that developers can get away with not knowing basic sysadmin stuff

It's not that. I used to own a VPS ISP. I know how the infrastructure stuff works.

It's that I don't care. I want to write code and deploy it to a new machine.

The time savings as a developer is enormous.


Standing up a service isn't only about compute. Where is your logging, metrics, alerting? Even with a server, you need methods for access control, hardening, log rotation, updates/patches (what happens when the OS goes out of support?).


Where would I learn how to set this up, this 'basic sysadmin stuff'?


Cloud functions are not used to look cool, nor are they used solely by people that “don’t know basic system admin”. Their use cases don’t line up with your needs and that is okay.


One benefit is compliance. If my SaaS runs on Lambda and managed DB it offloads a bunch of OS related security requirements.


Also, $100 a month can get you a business line with a static IP for self hosting.


I never understood people who are in such a rush that they never survey what's available.


Powerful, yes, but what about getting good peering and routing?


the cloud means you get to write sexy code and call yourself an engineer

not the cloud means you write boring configuration files and call yourself a sysadmin


Where can you get a dedicated server for 40€?


Check https://www.lowendstock.com

Also Rasmus (of PHP) wrote a series of posts https://toys.lerdorf.com/low-cost-vps-testing a while ago comparing VPS providers in the $10/mo range.

It was discussed on HN https://news.ycombinator.com/item?id=21725853


Thank you


Hetzner, OVH. Hell, if you go with Kimsufi (OVH's budget brand) you can get dedicated for a dozen bucks.


I once committed my private AWS keys to a public github repo. A bot scooped it up nearly instantly and spun up many, many ec2 instances that were (probably) mining bitcoins.

I received an automated email from Github telling me that I had committed a private key, but it came in the middle of the night.

In the morning, when I learned what had happened, my bill was over $3k.

I fixed the issue and emailed AWS asking for some relief, and they called me and let me know they were waiving all the charges.

So, perhaps you too can beg for mercy?


The difference between his situation and yours is that you didn't create the charges. Legally you're not liable for something someone does while impersonating you, even if you walked around with your private key on a t-shirt. They may or may not be nice to him but for you they didn't have a choice.


I don't think that's true? I mean sure, you might not legally be liable when someone impersonates you in the real world. But I'm absolutely certain the AWS terms say somewhere that you agree to take care of your creds and are liable for whatever is done with them, etc?


Both could be true. A contract can say anything, but it's going to be bound by the legal framework it operates in, and in this case I don't think there's much of a distinction between the digital and real world, except for physical resources not changing hands.

Hypothetically, the contract could say Jeff Bezos will come to your house and personally kill you, but there's no consensual murder in most places.


You can’t sign away your legal rights.


It doesn't matter what the terms say. The charges would be the result of a violation of Title 18 Code 1030 - it's the digital equivalent of someone stealing your car and writing the title over to someone else. You're entitled to keep your car (or your money spent on AWS) regardless of the receiving party's expectation of claim to it, even if they incurred loss in the process.

Now, Amazon would be entirely within their rights to cancel your account and refuse to do business with you after this, but they would not have the right to collect that money from you, or to keep that money had it already been charged to you.


Title 18 Code 1030 says it is illegal to commit computer fraud but it is not a responsibility of your service provider to eat/pay for fraud committed against you.

Your only legal recourse under Title 18 Code 1030 is against the "violator". Amazon did not violate your computer systems and commit these offenses.

> Any person who suffers damage or loss by reason of a violation of this section may maintain a civil action against the violator to obtain compensatory damages and injunctive relief or other equitable relief.

On that basis, your contract stipulates who is responsible for fees associated with use of your AWS key by "any other third party".

> You are responsible for all applicable fees associated with use of the Services in connection with IAM, including fees incurred as a result of any User Credentials. You are responsible for maintaining the secrecy and security of the User Credentials (other than any key that we expressly permit you to use publicly). You are solely responsible, and we have no liability, for any activities that occur under the User Credentials, regardless of whether such activities are undertaken by you, your employees, agents, subcontractors or customers, or any other third party. You are responsible for the creation, distribution, and security (including enabling of access) of all User Credentials created under your AWS account, including credentials that you have used IAM to create or disclose to other parties.


You have a fundamental misunderstanding of the positions of the parties in this scenario.

The computer fraud in this case was not committed against you. It was committed against Amazon. Amazon grants you access to their services, the account does not belong to you. The damages here are not made against you, they are made against Amazon.

Just like in my example, the violator committed fraud against the "buyer" of the car. Neither Amazon or the "buyer" have recourse against you for the supposed owed property/bill, they have to extract damages from the violator. You are not responsible.

On your second point, I will repeat myself: it doesn't matter what the terms or contract say. Such agreements commonly hold terms that are in direct opposition to US law and have no legal basis. Their entire purpose is to dissuade you from pursuing your legal rights at a cost to the company.


I mean, for all we know it could have been him mining the bitcoins, with committing the private key by accident being the cover up story.


The legal burden of proof for that lies on Amazon, not him, at least in the US.


Legally you're not liable for something someone does while impersonating you

This unfortunately isn't true. It also sounds like he created an app key from his root account that enabled anyone to literally impersonate him.

A typical use case is to create a user that has only the specific rights that are needed and generate an app key for that user. For example, I have a user that can only read S3 buckets. If it were to leak, the worst that would happen is I would leak some encrypted backup data.


I don't think Amazon is going to evaluate this on the legal merits.

In a situation like this, Lambda is almost pure profit. Their actual spend here was negligible.

They are almost certain to waive the fee, because they don't want the perception among developers that AWS is a time bomb.


Consider yourself lucky. It happened to a client I know after he left root keys on the server, and he ended up with a $146k bill over 3 days.


I think every developer has an AWS billing horror story.

My horror story is that my site allows users to upload videos and share them to a limited number of colleagues. When a user requests a video, a CloudFront URL is created that lasts a few hours.

I had not thought much about hotlinking because the link only lasts a few hours - what would be the point? Well, those few hours make a big difference when it’s linked on a high traffic website.

Turns out someone paid for the cheapest plan ($7) and uploaded two multi-GB files. They hotlinked them on a Vietnamese porn site and ran up charges of almost $10k.

I was alerted by Cost Anomaly Detector but it had already run up most of those charges (and the totals CAD listed were much smaller and made it seem like less of a problem, thus delaying my reaction). AWS, to their credit, waived the charges.

I had WAF already set up, but it wasn't very helpful for this type of thing. I could only block sites that I already knew about. I ended up going with a Lambda@Edge solution that validates the source site before allowing access.
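
For anyone wanting to do something similar, a minimal sketch of that kind of viewer-request check is below; it's not the exact implementation, just the general shape, and the allowed domain is a placeholder (a real version would probably also handle signed cookies, missing headers, etc.):

    import type { CloudFrontRequestEvent, CloudFrontRequestResult } from "aws-lambda";

    // Viewer-request Lambda@Edge sketch: only serve the object when the Referer
    // points at an allowed site; otherwise return 403 before hitting the origin.
    const ALLOWED_REFERERS = ["https://example.com"]; // placeholder

    export const handler = async (
      event: CloudFrontRequestEvent
    ): Promise<CloudFrontRequestResult> => {
      const request = event.Records[0].cf.request;
      const referer = request.headers["referer"]?.[0]?.value ?? "";

      if (!ALLOWED_REFERERS.some((site) => referer.startsWith(site))) {
        return { status: "403", statusDescription: "Forbidden" };
      }
      return request; // pass the request through unchanged
    };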

Lessons learned:

1. Customers may abuse things in ways you didn’t predict.

2. Cost Anomaly Detector has a delay and only kicks in once charges have accrued. It can save you from an insane bill but won’t save you completely from large bills.

3. AWS can be reasonable about this, but the ball is entirely in their court.


How can two multi-GB files = $10k USD of data transfer in a few hours?

How popular is that vietnamese porn site!!


Worth noting that if your distribution is set to use every region, Asia-Pacific CF pricing is actually more expensive than raw S3. $0.12 vs 0.09 (for S3) or 0.085 (for North America CF). It's easy to accidentally increase your costs by 33%, since you'd only encounter this with S3 if you put the bucket in an asian region, versus CF where distributions are more hand-wavey about locations.
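
Rough arithmetic on the parent's question, assuming most of the traffic landed in that $0.12/GB Asia-Pacific price class and that the files were around 3 GB each (both assumptions):

    // How much transfer does a ~$10k CloudFront bill imply?
    const billUsd = 10_000;
    const pricePerGb = 0.12;                      // APAC CloudFront rate quoted above
    const gbTransferred = billUsd / pricePerGb;   // ~83,000 GB
    const assumedFileSizeGb = 3;                  // "multi-GB" files, assumed
    const downloads = gbTransferred / assumedFileSizeGb; // ~28,000 full downloads
    console.log({ gbTransferred, downloads });

So it doesn't even need to be an especially popular site; a few tens of thousands of plays of a large video file is enough.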


> How popular is that vietnamese porn site!!

Getting to the important question here!


I want to offer two counterpoints to common sentiments here regarding AWS billing.

1. Don't be afraid of playing around with AWS (and even spending some money). AWS is really good at refunding you if you accidentally rack up a couple grand in surprise bills. Also even if you legitimately spin up big servers to try a kubernetes cluster for a couple of days, that $20 you spent is almost certainly great bang-for-buck for the benefit of learning that experience and getting your hands dirty with AWS.

2. AWS billing is actually really good for what it is. If you've ever run any non-trivial operational system (in the real world), you would know how hard it is to collate all expenses and get them tallied up. AWS collates all billing data with ~24h lag and you can slice and splice it to your heart's content. After all it's a complicated distributed system that they've managed to build that doesn't slow down your services or otherwise get in the way!


Azure supports hard stops on services with billing maximums. It does mean that stuff gets turned off if you enable that. Then again, as an individual, that's a superb way to control costs.

And since Scamazon doesn't do that and INSTEAD "gives" you a 1 month unlimited credit, there's no telling just how stratospheric your bill can be.

> AWS is really good at refunding you if you accidentally rack up a couple grand in surprise bills.

If there were hard limits, there'd be no need to beg AWS support for leniency, which they can capriciously choose you don't deserve.


Azure doesn’t support spending limits on all subscription types though.

https://azure.microsoft.com/support/legal/offer-details/


Completely true. However, if what you're doing is highly price-sensitive and you're willing to accept downtime over a ScaryBill, then this is the option you need. And it's completely (and I bet intentionally) not available on AWS. AWS's message is "bend over and we'll tell you how far and long".

Larger companies see $4k as a nothing; pay and move on. Household budgets, not so much.


If there were hard limits, that would also mean that the billing system is on the critical path for all systems, and not just an after-the-fact ETL.


Not necessarily. A cloud provider could retroactively cap charges at the hard limit but only cut access to resources asynchronously. That's effectively what happens now when you complain to AWS.


If AWS wanted to, they could absolutely implement a hard cap. It's not like letting some services run for a few hours until billing catches up costs them a lot of money.

What is true - not necessarily in order - is:

- I suspect AWS in aggregate probably makes a fair bit of money on overages that users eat but would have avoided with a hard circuit breaker if they could have set one up, and

- Even reasonably designed hard circuit breakers (e.g. we cut off access to your stateful data unless you pay your bill but we won't delete it for 30 days) are still giving developers a potentially well-hidden foot-gun for a production environment that management might not actually want.


Just like in real life, if you run out of money you have to stop doing things.


> Just like in real life, if you run out of money you have to stop doing things

In real life when you hit your card's limit, your transactions get declined. Straight away.

I had this last month in a supermarket after my (personal) checking account didn't have enough money to cover my purchase, I'd completely forgotten to transfer money from my business account.

My bank wasn't prepared to let my account go overdrawn, not even by the equivalent of $20, which is absolutely their right.

Amazon, OTOH, benefits in lots of ways by not implementing this mechanism.


A better analog to the store and Scamazon's implementation would be:

You put stuff in your cart.

You go through the register. You agree to "prevailing price".

No prices show up because they are "calculating".

You leave the store.

A day later, you're hit with a surprise bill that's 10-100x more than you thought. A $100 transaction ends up being $1000-$10000.

There's no refunds.

The dispute procedure is to beg and hope they "let you ignore it".


> In real life when you hit your card's limit, your transactions get declined. Straight away.

Except for when it doesn't. I've got two primary current accounts, one with a "legacy" bank in the UK and one with a modern bank. The legacy bank is happy to let me go into an unplanned overdraft, and charge me for the privilege of doing so.


> a "legacy" bank

"Been there, done that". Not there any more!


For dev systems, I can't see why there are no hard limits.


For the companies I've worked for, having HARD limits on devs would put the C-levels minds at ease.

At this moment, any dev with AWS keys has an unlimited month-to-month credit line that the company is on the hook for paying. And at best there's the hope and prayer that the billing notifications aren't utter shit.


With regards (1), I feel like there are two different worlds (and given how non-transparent Amazon is, that's very believable).

I've never run up a huge AWS bill accidentally, but I personally know 2 people who have, and neither was refunded, even after asking. In both cases we are talking $400-$800, enough to really hurt someone, but not bankrupt them.


Interestingly, AWS won't refund service credits provided by an accelerator if you've accidentally blown through those instead.


Possession is 9/10ths of the law. If they have already been paid, they are not likely to refund you.


How the fuck do you blow through $150k at an early stage startup???


Storage, compute and networking, mostly. I'm confused by your question - are you suggesting that early-stage startups couldn't possibly generate a workload to warrant that cost? Maybe if your idea of a startup business is limited to running a basic CRUD app. But other data-intensive projects have engineering needs that can easily reach that scale, especially in the early days when you're trying to bootstrap your data catalog or ML models or whatever.


Those cases are rare as hell, $150k should be enough for nearly anyone. You are making bad engineering decisions if you go through that in under a year.


Ha, well tell that to the hard drive manufacturers and cloud providers.

The storage bill alone exceeded that at four startups I've worked with. When your job is to manage X PB of data with Y TB arriving daily, you're fairly constrained on the cost floor of your operations.

> You are making bad engineering decisions if you go through that in under a year.

Fine. Either you lack the imagination to conceive of such a use case, or you lack the knowledge and skills to do the job. Either way, this comment is rude and ignorant of the diversity of software challenges that exist in the real world.


Obviously. Lmao.


Not so obvious to those of us with experience in data-intensive startups.


I've had "it needs the RAM" while staring at a 15% memory usage plot on a very expensive CI VM. Choosing a 20x cheaper machine had no effect on build time and stability, but this was the PR comment. This overprovisioned machine was possibly $20k.

The truth is that developers sometimes require expensive things to get their job done efficiently. The other truth is that we often vastly overestimate the SKU. And we also leave things running.


As someone who worked at an early stage video intelligence startup, surprisingly easy. Redshift + Elastic Transcode + Cloudfront makes it extremely easy to spend thousands and thousands of $ per month.


> AWS is really good at refunding you if you accidentally rack up a couple grand in surprise bills.

I wouldn't bet my bank account on that always holding true. If it's not in the terms and conditions that they'll refund you for accidental mistakes that lead to high billing, then you're gambling if you assume they will.


I just want my hobby server to shut off at if I spend over $XX in a billing period. Is that so insane?
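
Not insane at all, and the closest you can get today is wiring a billing alarm to something that shuts things off yourself. A rough sketch of the alarm half, assuming you've enabled "Receive Billing Alerts" in the billing preferences (the SNS topic ARN is a placeholder you'd point at a "stop everything" Lambda or your phone):

    import { CloudWatchClient, PutMetricAlarmCommand } from "@aws-sdk/client-cloudwatch";

    // Billing metrics only exist in us-east-1, and the EstimatedCharges metric
    // itself is updated only a few times a day, so this is still not a hard cap.
    const cw = new CloudWatchClient({ region: "us-east-1" });

    await cw.send(
      new PutMetricAlarmCommand({
        AlarmName: "hobby-spend-kill-switch",
        Namespace: "AWS/Billing",
        MetricName: "EstimatedCharges",
        Dimensions: [{ Name: "Currency", Value: "USD" }],
        Statistic: "Maximum",
        Period: 21600,              // 6 hours
        EvaluationPeriods: 1,
        Threshold: 50,              // the "$XX" for the billing period
        ComparisonOperator: "GreaterThanThreshold",
        AlarmActions: ["arn:aws:sns:us-east-1:123456789012:spend-alarm"], // placeholder
      })
    );

As the rest of the thread points out, the data behind that metric lags by hours, so this limits the damage rather than truly capping it.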


I had an ECS cron job hang one time, so instead of a 30-second compute charge it ended up being almost a full month of continuous runtime. My usual $30 bill was $800, and the estimated charges for the next month based on 2 days of use were $1300. I didn't have a billing alert set up (definitely set up a billing alert!).

It could not have been easier to get AWS to remove the charge. A quick email to support with a brief explanation and it was immediately accepted. The hardest part was that they wanted a very specific request for how much I was asking to be refunded. So I had to go back and calculate my average costs per service and compare that to the charged costs. After that it was immediately refunded.
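
If anyone else has to produce that kind of per-service comparison, the Cost Explorer API will give you monthly totals grouped by service; a rough sketch (dates are placeholders, and note the API has a small per-request charge):

    import { CostExplorerClient, GetCostAndUsageCommand } from "@aws-sdk/client-cost-explorer";

    // Monthly unblended cost per service, to compare a normal month
    // against the month with the incident.
    const ce = new CostExplorerClient({ region: "us-east-1" });

    const result = await ce.send(
      new GetCostAndUsageCommand({
        TimePeriod: { Start: "2022-05-01", End: "2022-07-01" }, // placeholders
        Granularity: "MONTHLY",
        Metrics: ["UnblendedCost"],
        GroupBy: [{ Type: "DIMENSION", Key: "SERVICE" }],
      })
    );

    for (const period of result.ResultsByTime ?? []) {
      for (const group of period.Groups ?? []) {
        console.log(period.TimePeriod?.Start, group.Keys?.[0], group.Metrics?.UnblendedCost?.Amount);
      }
    }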

They aren't just handing these things out though. They made me read and acknowledge I had read their service agreements and basically swear that I know what happened and it won't happen again. Really painless process overall, all things considered.


Off-topic maybe, but I'm very curious as it's hard to find hard numbers:

> my usual bill is $200/month

How many req/second are you serving? What kind of things are happening?

It seems like the bills are outrageously expensive when it comes to various cloud services, as I'm personally hosting a service that does between 10-100 req/second on average during a month, and my monthly bill end up being closer to $40/month, including traffic and everything. I'm running a database on the same server, and 20% of the requests writes stuff both to disk and to the database, while the rest just reads from DB or disk.

The whole setup took around 5 hours in one day, and it has been running flawlessly from day one; we haven't had to migrate servers yet after ~6 months of production usage. Probably one day we're gonna have to upgrade the server to an $80/month one, but that's a one-time thing and our revenue easily covers that.


In my consulting days I helped people with issues like this quite often, and what I found tended to come down to severe inefficiency caused by people not fully understanding how services need to perform at scale and what causes them to perform poorly. You hear a lot of “oh, it seemed to work really well on my machine”, which doesn’t necessarily translate well to performing or scaling smoothly in the wild.

With a bit of refactoring and simplifying it tended to cut out a lot of issues. I suppose the issue is that a lot of people don’t really know what to look for or how to anticipate issues in complex infrastructure (which is totally fair, I only learned via trial by fire and have done some really stupid stuff).

I don’t think I ever encountered a case where the code was correct and bills were too high. Some AWS/GCP configurations would be pretty bad, but the code would also tend to be incredibly inefficient.

I always encourage people to respect people who are great at dev ops and to either hire them if they’re big enough or just consult them if they’re uncertain about things. Throwing a day of consulting rate at someone smarter than you is a great way to learn and it could easily save you money in the not-so-long term.

I'd say that because they'd be crazy to rely on me for infrastructure, but I'd make things better if I were already around and they asked me to help out. But I'm no substitute for someone who actually knows what they're doing.


I run 1.2 million uptime checks per week, my total AWS bill was $150/mo before I migrated to permanently running VMs - it's definitely doable without trying too hard.


> my total AWS bill was $150/mo before I migrated to permanently running VMs

I know I keep harping on about this, but [if you have a vague idea what you're doing] you can squeeze an awful lot out of a cheap VM.

I'm currently working on a project that we're deliberately prototyping using ultra-cheap VMs. If it takes off we know how to scale it up, if it doesn't the costs stay very very low.


Definitely - with startups offering 3x free firecracker VMs (of 256MB RAM) these days, there's almost no reason to start ideas with serverless.


Yes, moving to a VM is definitely doable,

but as a one-person dev team, the maintenance is challenging.

My fear is being on vacation and suddenly this VM dies. It might take too much time to bring it back online, and I might be out of good network coverage.


Back in the days when our services ran on metal or VMs, if we got paged during off-hours and had something else going on, a simple reboot almost always fixed the issue.

We developers never liked the reboots, though; we always wanted to find the root cause so that we wouldn't be paged again. So, I guess, we moved to the cloud. Now we don't get paged in the middle of the night.

But yeah, if it were my own company and I were on vacation, I'd rather take 30 seconds to reboot the server than worry about paying thousands of dollars.

As for lack of network coverage, you could do scheduled daily/hourly reboots while on vacation, if that makes sense for your service. Or if an outage will cause massive disruption to your users, then perhaps hire a part-time sysadmin while you're on vacation.


Self-healing VMs - look into fly.io; they just restart when out of memory, etc.

It took about a day to rewrite, and a week of 30 minutes per day to figure out how to optimise my code for self-healing.

Knowing my bill is capped by the number of VMs * memory saves a lot of stress.


In case you're curious, I wrote about my experience here: https://onlineornot.com/on-moving-million-uptime-checks-onto...


It depends on how efficient the dev is.

For comparison, I've been running a paid Slack app (a few $K per year) that manages to run entirely within the free tier on Google Cloud Platform.


A paid Slack app just receives webhooks from a Slack organization, right? So the amount of traffic it handles amounts, at most, to the number of messages that get sent around in Slack, which tends to be very low (in comparison to other applications where user interactions can trigger many requests, for example).


True, but in my app's case each webhook ends up making a few dozen Slack API calls.

In any case the free tier will definitely allow you to handle enough traffic to figure out if you've got a viable business model, which I guess is the point.


It depends on the user activity, but typically it is not that many requests. Around 20 req/second, with a reply in 100-400 ms.

These recursive requests were going at a rate of 17k/second, with a 30s timeout each.

The benefit of the current approach is that I don't need to manage any servers, and I get different environments for free. Also, sizing is not an issue; I just tune the AWS Lambda limits to be able to serve a single request.

I will need to invest some time to understand how big the instance should be (how much memory), because I struggle to measure the right size without running out of memory or CPU.
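
One low-effort way to take the guesswork out of the memory sizing: every invocation ends with a REPORT line in CloudWatch Logs that includes both the configured memory and the peak actually used. A rough sketch of pulling those out (the log group name is a placeholder):

    import { CloudWatchLogsClient, FilterLogEventsCommand } from "@aws-sdk/client-cloudwatch-logs";

    // Scan recent REPORT lines for a function and print configured vs. peak memory.
    const logs = new CloudWatchLogsClient({});
    const out = await logs.send(
      new FilterLogEventsCommand({
        logGroupName: "/aws/lambda/my-ssr-function", // placeholder
        filterPattern: "REPORT",
        limit: 100,
      })
    );

    for (const event of out.events ?? []) {
      const size = /Memory Size: (\d+) MB/.exec(event.message ?? "")?.[1];
      const used = /Max Memory Used: (\d+) MB/.exec(event.message ?? "")?.[1];
      if (size && used) console.log(`configured ${size} MB, peak ${used} MB`);
    }

If peak sits well below configured, you can drop the memory setting (and with it the GB-second rate) with some confidence.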


AWS may promote these technologies as prototype-friendly, but at the end of the day it's built to be an enterprise-grade production tool. A company will not even bother with a $4,000 mistake; it's just the price of doing business, so there is little incentive to address these types of problems. Playing around with AWS for side projects is like using a chainsaw: it can really accelerate your work, but if you make a mistake you may lose an arm and a leg :).


That's not really fair because while AWS does have a lot of issues, their refund policy isn't one of them. It's usually really easy to present a case for refunding accidental charges.


I've heard they have a generous foot gun billing policy and thankfully I've never had to find out, but we shouldn't be that grateful, because ultimately the cloud providers do this in their own rather dishonorable self interest.

It would be fairly simple for them to allow users to set up hard billing limits. Yes, it wouldn't be accurate to the second. And yes, it would mean that deployments would fail with data loss or in unpredictable ways, but in most cases that would be preferable for these users as opposed to a couple orders of magnitude increase in billing costs.

But the cloud providers don't support hard billing limits because they like people fucking up and accidentally running up their bill. After all, it's probably only a small fraction of users that go through all the humiliating rigamarole of unwinding a provisioning mistake.

So yeah, good on Amazon for being so generous with the band aids, but maybe they should try a little harder at helping their users not shoot off their toes...


> It would be fairly simple for them to allow users to set up hard billing limits.

former AWS SDE here

I don't believe it would be "fairly simple" to build a completely new off switch into 150+ services, likely with multiple integration points in each service. In addition, the mere existence of an off switch introduces new failure points, where failure directly turns into downtime.

The effort to implement this is far from trivial, removes resources from implementing other features that the really large accounts are asking for, and adds complexity with direct availability risks. It's not at all surprising they don't implement this.


IMO Google Cloud has the solution for this - access to APIs is off by default and you must enable API access before anything will work. Their portal is pretty good at estimating costs in the first place, so resources created there aren't much of an issue, but having to use the portal to enable programmatic access is a great way to avoid mistakes.


Strictly speaking, even the simplest feature spanning all of the services wouldn't be simple.

That's got zero to do with why the feature doesn't exist, though.


Can confirm. When I was learning the basics of EB, I accidentally spun up a bunch of EC2 instances in a region I didn't mean to that ran for ~3 weeks and racked up a $2.4k bill on the company's account.

They wrote it off ~6hrs after we filed a support ticket about it.


Chainsaw is the perfect metaphor for this, well done.


Using AWS for sideprojects is a great way to make sure you have AWS skills the next time you interview, assuming you aren't using it at work.


For small projects, why do you need the scale? I feel like once you do need the scale, serverless is way more expensive than even managed Kubernetes. I still think serverless is the hosting providers' way to make far more money off the illusion that it is easier, when it really isn't. Logging is normally a huge pain. Local dev is usually a huge pain. Managing versions is a pain compared to just using git branches, especially across multiple environments. It is a pain to set up different environments and full CI/CD. In the end they might be OK for prototypes, but for real, big systems they are a huge pain. That is just my real-life experience, though.


To expand on this, OP: I've done the AWS-full-stack approach in a mid-sized startup. Modern Serverless problems require modern serverless solutions. That ecosystem is simply not as developed as "traditional" web-server CI/CD. Here are some things that you will eventually need to optimize for.

- After crossing a certain threshold in scaling needs, Lambda costs more than regular EC2 on ELB

- Lambda cold-start times can be a deal-breaker when users first visit your website. If you contact AWS they will tell you to set up a simple cron job that keeps lambdas "warm" (a minimal sketch of the handler side of that follows this list). But AWS provides no visibility into what's warm or cold, or which endpoints link to which lambdas.

- Dealing with Cloudwatch logs of various lambda runs (IMHO) is objectively a bad dev experience. Query insights is getting better, but is still a pain to work with.

- To reduce deployment and development times, you'll eventually want to deep-dive into lambda layers. Modern problems modern solutions.

- One lambda calling and awaiting another lambda is not a supported first-class use-case. There's no API that allows you to get the status of a lambda run. There's a hack around this where you use AWS Step-Functions. Modern problems modern solutions.
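
As mentioned above, here's a minimal sketch of the handler side of the keep-warm trick. The assumption is that an EventBridge scheduled rule invokes the function every few minutes with a constant payload like {"warmup": true}; the payload shape is a convention you choose, not anything AWS-defined:

    // Keep-warm sketch: warm pings short-circuit immediately so they stay cheap,
    // everything else falls through to the normal handler logic.
    export const handler = async (event: { warmup?: boolean } & Record<string, unknown>) => {
      if (event.warmup) {
        return { statusCode: 200, body: "warm" }; // scheduled ping, do nothing
      }
      // ...normal request handling goes here...
      return { statusCode: 200, body: "real work" };
    };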

We're still on AWS full-stack "serverless" for our webserver and realtime stream processor. At the time I didn't know what I was getting my company into. I wish I just made a Flask webserver instead.


Serverless isn't just about scale, it's about deploying code without having to touch any infrastructure. The lambda free tier is also very generous (1M free requests per month).


It's simple: they don't.


I used to work for Firebase; this is a common problem. For my own developer-focused startup I have prevented functions from calling each other to an unbounded depth, exactly so this footgun is removed.

The technical detail is that outbound requests are given a role encoded in the user-agent, and then I can easily filter out incoming requests by user-agent [1].

[1] https://observablehq.com/@endpointservices/webcode-docs#opti... (see loop prevention flags)
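
Not the implementation linked above, just a generic sketch of the same loop-prevention idea, using a depth header instead of the user-agent (the header name and limit are arbitrary choices):

    // Every outbound call carries a depth counter; handlers refuse anything
    // that has already bounced between functions too many times.
    const MAX_DEPTH = 3;

    export function callDownstream(url: string, incomingDepth: number): Promise<Response> {
      return fetch(url, { headers: { "x-call-depth": String(incomingDepth + 1) } });
    }

    export function guardDepth(headers: Record<string, string | undefined>): number {
      const depth = Number(headers["x-call-depth"] ?? "0");
      if (depth > MAX_DEPTH) {
        throw new Error(`Refusing request: call depth ${depth} exceeds ${MAX_DEPTH}`);
      }
      return depth;
    }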


I rarely use AWS for smaller projects, and prefer to either use Digital Ocean or bare metal from a local data center (well local when I lived in NY).

After a surprise bill like this, I would re-evaluate what serverless is actually giving me.


Cloud vs. bare metal costs should always be thoroughly calculated. Fragment from an article from current "FreeBSD Journal"[1]

> We compared the three-year total cost of ownership of a VPS, such as a DigitalOcean Droplet, against two equivalent leased or purchased bare metal servers. We estimated that the leased option costs about half as much compared to equal resources in the cloud, and owning the servers would cost less than a quarter of the pure cloud options.

[1] https://freebsdfoundation.org/our-work/journal/

[1] https://freebsdfoundation.org/wp-content/uploads/2022/06/Jou...


I'm only using AWS for my domain (too lazy to move), and even then I use an external DNS manager because AWS charges something like 50 cents per hosted zone per month.

Everything on aws is a clusterfuck designed to suck money out of enterprise businesses


DigitalOcean is amazingly simple and I have been a big fan since they launched.


Think about it this way:

You're a small, three person group. Are you more worried about having a modest colocated server or a small VPS run out of resources because you suddenly became really, really popular, or are you more worried about who'll go without salary for a few weeks because you misconfigured something and got a bill for multiple thousands of dollars?

Amazon takes full advantage of the fact that you can get in to deep billing waters very quickly. They can't collect your billing info quickly enough? Bull - and I mean this emphatically - shit. If a company that runs some of the largest datacenters in the world can't communicate metadata in real time, they're either hopelessly inept, or they're failing on purpose to make more $.


While reading through these threads I always get a feeling that everyone's working at YouTube or stuff like that, and they need to serve millions of users per hour. Meanwhile I'm in my corner here with an old-school $20 VPS that does just fine for my 10k users.


In 2010 I served a peak of a million users per hour with an $80 dedicated server, managed at my pleasure with Plesk and my PHP framework.

I could serve up to 20k requests per second.

Scaling is for the worldwide top 250 websites, or for sites designed according to the stupid contemporary fashions.


It's really difficult for AWS, or any other serverless provider for that matter, to achieve a kind of "bulletproof and safe user experience" across different offerings that encompasses everything to do with billing/monitoring/alerting and also covers all kinds of potential customer scenarios (like the function calling itself, as one example).

For example, it's totally understandable that the alarms are specified per region; why shouldn't it be like this?

Also the global AWS billing $300 alert seems to have worked but you were asleep as far as I understand. If it was a call-out style alert, then you would've noticed in the middle of the night and could've stopped it.

The only thing I agree is frustrating is this:

> CloudFront includes the AWS Shield Standard feature, but somehow, it was not activated for this case (Lambda@Edge calling itself via CloudFront).

Maybe you can argue that you weren't made aware of this but idk... keep us updated


What’s the case for not implementing an optional “shut down all my services at $spend and stay shut down until I intervene” ?


Honestly? Many people would enable it, forget about it, and footgun themselves on the other side.

Perhaps AWS should have "personal/developer" accounts that have this enabled by default and continually warn you about it, whereas "company/enterprise" don't have them.


> Many people would enable it, forget about it, and footgun themselves on the other side.

Yeah, but I figure as long as Amazon doesn't immediately remove stored data, the damage of the footgun would be minimal. Speaking for myself, of course, I'd rather have a short outage than an unexpected thousand dollar overnight expense. It seems so trivial that it's unclear why AWS would not implement this feature. The only explanation that makes sense is that they want these surprise bills to occur.


Because the downside for a company isn't "oh it was off overnight" it's "we finally hit it big and made zero sales because AWS shut us off".

Given how easily they reverse the bills, I suspect that they have a policy of doing it (perhaps a few times per account, something to prevent abuse) because they really don't want to trigger the above scenario.


Alternatively it's "we would have made a profit this month but a bug in this one service chewed through our budget in one hour". Sure, you might be able to get a refund, but that's no way to plan a business.


I’d think if your rate of spending is >$50/hour then that’s nearly-always a bug. The only reason this conversation is taking place is because serverless “infinitely scales”. Autoscaling physical instances has a max limit for similar reasons.


I've experienced plenty of scenarios where costs have quite legitimately spiked.

Ultimately whatever solution you put in place, someone is going to complain about it. At least with the system they currently have in place they can reimburse customers. Whereas it is a lot harder to fix their reputation after they've automatically stopped production services.


Given how easily they reimburse customers, I suspect it's intentional - one can be "fixed after the fact" and the other can't - if your site goes down during a slashdotting and you lose sales, etc, there's no getting those back, but if you inadvertently run costs high, they can just refund/cancel those costs.


Azure HAS this hard limit feature already.

I've seen nobody on HN, Twitter, or Reddit complain about "my site was down during heavy business since I turned on the hard billing setting". Not a single person.

However, I see frantic post after frantic post of "I was testing something on AWS and it caused me a $X000 or $X0000 bill."

But as the posts in here are apt to suggest - you can always beg AWS support for a reversal. Great plan there.


The first is obviously customer error and unless you're posting to get laughed at, you're likely not to gain traction.

(Also one could make the "nobody uses Azure" joke here.)

Personally I think that much of AWS is "way overpowered" for the normal person/business, and you shouldn't be playing with it if a $X0k bill would be impactful (as likely other solutions are much better tuned to your needs and money).


If you drop a laptop, that's customer error. You break something or do something unintended that damages it, that's customer error.

When you are handed a tool that has multiple hidden guns and explosives inside of it, and it ends up blowing your foot off, that's malfeasance on the part of the people who handed the tool to you.

AWS is that tool. And the fact that Azure can implement these guard-rails while AWS chooses not to tells me all I need to know.

> Personally I think that much of AWS is "way overpowered" for the normal person/business, and you shouldn't be playing with it if a $X0k bill would be impactful (as likely other solutions are much better tuned to your needs and money).

Please compare and contrast this with "Learn AWS for furthering your career".


They 'might' reverse those fees, they might not. You are at their mercy and mercy is finicky.


That’s literally the point I made :)


It's not really difficult. They just need a way to set hard spending limits. Probably on by default.

Unless you're a big company, "we stopped your function in the middle of the night" is a whole lot better than "we ran your function all night and you owe us $4k".


In my 25 years of running production services, I honestly cannot think of one company I've worked for that would have accepted their function being stopped in the middle of the night.

AWS already has a recourse for incidents like these: refund the spend. That is far more reliable than trusting an organisation can tolerate an outage.


The difference is none of the little sideprojects I work on are worth $4k in a month, let alone in a day. Obviously most companies want to spend the cash, but they should want as many programmers using AWS for sideprojects as possible.


It was more the "big" part of the "big companies" statement I was disagreeing with, rather than the "companies" part.


Sorry, I might not have been very clear, but the AWS billing alert for $300 was only triggered once the charges had already reached $1,484.

If that alert had triggered earlier (the $300 threshold was crossed within an hour), my total accumulated charges would have been only around $400, not $4,500.

So the hard lesson here is that CloudFront charges take time to appear on your bill, up to 24 hours.


24 hours delay in 2022 is egregious.


AWS support has historically been pretty good about removing these charges. Just be careful next time.

I racked up a $8k AWS bill for my university when I was leading a club. A few emails to AWS support and it was all resolved. Although there might've been more leniency since I was a student.


I wish cloud providers had a nuclear option. Like “if my monthly spend hits $X, then just stop everything immediately”. Often these billing issues happen on little hobby projects and things that the owner would clearly be fine taking offline to avoid thousands in fees.


> then just stop everything immediately

What does stop everything immediately mean for things that aren't compute? Backups and storage, for example.


I was only thinking about this for compute related services, which is where most of these surprise charges come from. I suppose you could have some fine grained rules for other services.


This is partly the reason why I would never use anything besides VMs or baremetal I provision myself. I'd rather have scaling problems I can solve by provisioning more hardware than billing problems because I fudged the setup. Yes, AWS might be good refunding "oopsies" but when trying to bootstrap a business I have better things to do than recover from heart palpitations.


What is this kind of sh1t!? Why do people put their software on a platform where you don't know what it's going to cost upfront?

Maybe I don't understand. Maybe there are legitimate use cases for AWS and other 'vague clouds'. And it's not only these kinds of bills you get; I've also heard you even have to pay to get your own stuff off of these platforms.

What's wrong with good old webhosting (in whatever shape, size or form)? I understand there are use cases - probably for big big apps - that will benefit from cloud hosting, but that can't be for every project or business. Right?

Please enlighten me.


One of my biggest fears. What's to prevent trolls and competitors from just spamming your endpoints in a loop? How do people using pay-per-use infra deal with these problems?

I really want to use Lambda for public endpoints but it just scares me.


There are a lot of tools in AWS to deal with this: API Gateway rate limiting, WAF, etc.

That said, it does take a level of awareness to set these things up. Some of what we try to do at SST is turn these on for you automatically, so you're not being punished for not knowing something.
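
One cheap, concrete knob on top of those: reserved concurrency, which hard-caps how far a single function can scale and therefore how fast it can burn money (as far as I know this applies to regular Lambda, not Lambda@Edge). A sketch with a made-up function name:

    import { LambdaClient, PutFunctionConcurrencyCommand } from "@aws-sdk/client-lambda";

    // Cap a public-facing function at 10 concurrent executions; excess requests
    // get throttled instead of scaling the bill.
    const lambda = new LambdaClient({});
    await lambda.send(
      new PutFunctionConcurrencyCommand({
        FunctionName: "public-api-handler", // placeholder
        ReservedConcurrentExecutions: 10,
      })
    );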


What's even more worrying are the numerous accidents hiding in a $700k-a-month bill, which is our problem.


That is a whole new class of problem. Better start parsing those detailed billing CSV logs for some gems!


I started off using Lambda as well and made the same sort of mistake. I can't remember exactly how much my bill was, but it was enough that it would have drained all my savings and effectively killed my startup. AWS was kind enough to write off most of it.

We now use Lambda only for simple cron / background tasks, or for consuming from Kinesis. We use ECS for everything else. ECS is nice because it's relatively simple compared to K8s, but still gives the full benefit of running multiple containers on one box.

I wrote a blog post last year about our migration over to ECS and experimenting with various ways to cut costs: https://blog.bigpicture.io/how-we-cut-our-aws-bill-by-70/


On a side note, I don't know if you've already acquired any AWS credits. If not, Product Hunt Founders Club is a decent deal that will give you $5k in AWS credits. Between that and the no Stripe fees for 1 year, it paid for itself in no time.

https://www.producthunt.com/founder-club


Thank you!

I recently applied for AWS Activate credits because our startup was part of Y Combinator Startup School.


Thank you for the ECS suggestion.

I am definitely considering it, but struggling to choose between ECS, Elastic Beanstalk or EC2.

My past experience with ECS was a bit frustrating because I was forced to use CodeCommit to deploy, and I didn't like that. I would prefer to deploy directly from CI, for example from GitHub Actions.


ECS runs on EC2. You basically register available servers and ECS automatically puts containers onto the instances where space is available.

We have it set up with GitHub Actions to automatically deploy to ECS as well.


I think I should definitely consider it. I was using ECS Fargate on another project and it was not using EC2.

Why did you choose ECS with EC2 instances instead of Fargate?


Good question. Tbh I haven't looked into Fargate too much. For our use case, we're processing millions of requests every day. So we needed more control for performance and cost. We're also fairly comfortable with lower level stuff and use Terraform to manage the infrastructure.


Reminds me of when I set up an S3-triggered Lambda function that also wrote into the same directory. What ensued was millions of files and folders generated recursively.

Fortunately, AWS was kind enough to reverse the billing. Had this been Google Cloud, I would've gotten the cold shoulder and a low-key threat that if I reversed the transaction I would lose access to my other paid Google products outside GCP :/


With S3 you don't even need lambda to do something like this

S3 has some setting where you can log activity on a bucket into another bucket

But that setting allows you to set the destination bucket to be the same bucket that you're monitoring. So ~30s after something happens on the monitored bucket, S3 writes a log into the same bucket. And then that activity triggers the logging again. So every ~30-60s, forever, there's a little log written into the bucket.
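
If you set this up with CDK, a minimal sketch of the safe version (inside a cdk.Stack constructor) is to point the access logs at a separate bucket; the prop names are from aws-cdk-lib/aws-s3, the bucket names are made up:

  import * as s3 from "aws-cdk-lib/aws-s3";

  // Keep access logs in their own bucket so logging activity can never
  // re-trigger logging (or any event notification) on the data bucket.
  const logBucket = new s3.Bucket(this, "AccessLogs");

  new s3.Bucket(this, "DataBucket", {
    serverAccessLogsBucket: logBucket,
    serverAccessLogsPrefix: "data-bucket/",
  });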

It takes a while to add up to something noticeable if your monthly AWS bill is already a few digits long. It's super fun to sift through the bucket a few months later when you're trying to figure out if there's any real data in the bucket or just endless logs.


I've had a horror story with Google Cloud, and they have been very helpful.

After I explained my situation (a $5k bill for an inactive side/toy project is extremely painful), and I provided great details about what I think happened to cause it, they wrote off the charges.


Sounds like AWS might refund you based on other responses, but always be prepared in case they don't. Make sure your DNS provider is not the same entity as your hosting provider, so an unresolved bill doesn't result in your domain being held hostage.


At work, my development team is contracted with a company that uses AWS, and for better or worse, we have also become the devops team. We have been burned by AWS before, and we have a rule of thumb: if you are deploying new functionality/service communication, after deploy, monitor for 10-15 minutes, with a wide enough window to see if there is a noticeable/unexpected change from before the deploy. It always feels like wasted/burned time, but better to waste time than money. AWS is good about reversing accidental charges, but life is always easier if you don't even have to contact support.


> if you are deploying new functionality/service communication, after deploy, monitor for 10-15 minutes, with a wide enough window to see if there is a noticeable/unexpected change from before the deploy. It always feels like wasted/burned time, but better to waste time than money.

... have you considered automating this? Alarms are pretty straightforward across all cloud platforms. Since you're using AWS: CloudWatch has anomaly detection. I haven't used it personally but perhaps it's worthwhile to look into: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitori...
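
Even without anomaly detection, a plain static-threshold alarm is cheap insurance. A rough CDK sketch (inside a cdk.Stack constructor), assuming `fn` is the Lambda in question and the threshold is just an example:

  import * as cdk from "aws-cdk-lib";
  import * as cloudwatch from "aws-cdk-lib/aws-cloudwatch";

  // Fire if the function is invoked suspiciously often in one minute;
  // wire the alarm to an SNS topic (email/pager) or automation from there.
  new cloudwatch.Alarm(this, "RunawayInvocations", {
    metric: fn.metricInvocations({ period: cdk.Duration.minutes(1) }),
    threshold: 10000,
    evaluationPeriods: 1,
    comparisonOperator: cloudwatch.ComparisonOperator.GREATER_THAN_THRESHOLD,
  });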


We have alarms, but with things like CloudWatch it can take up to 5 minutes before an alarm goes off.


Really wish the major clouds had an “I’m experimenting. Kill it if it goes over $1k” mode. Not some crappy alerts; an actual halt (and delete if necessary).

I bet that would increase profits too, by encouraging people to experiment more.


According to other threads here Azure has this mode.


Not to my knowledge. Certain types of accounts that come with credits have that as a hard cap, but it's not something that can be enabled as such.

https://docs.microsoft.com/en-us/azure/cost-management-billi...


This is a well known but poorly publicized issue with Lambda -- that they can get stuck in infinite loops and run up your bill.

I advise anyone I work with that if you are calling one lambda from another anywhere in your system, you should generate a request ID with every inbound request and then pass it along with each call as part of the context, and then error out if you see the same request again.
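
A minimal sketch of that idea in a Node Lambda; the event shape and the downstream function name are made up for illustration, only the SDK calls are real:

  import { LambdaClient, InvokeCommand } from "@aws-sdk/client-lambda";
  import { randomUUID } from "crypto";

  const FUNCTION_NAME = process.env.AWS_LAMBDA_FUNCTION_NAME ?? "unknown";
  const lambdaClient = new LambdaClient({});

  interface TracedEvent {
    requestId?: string;   // generated once at the entry point
    callChain?: string[]; // every function that has handled this request
    payload?: unknown;
  }

  export const handler = async (event: TracedEvent) => {
    const requestId = event.requestId ?? randomUUID();
    const callChain = event.callChain ?? [];

    // If this function already appears in the chain, we're in a loop:
    // fail fast instead of re-invoking and burning money.
    if (callChain.includes(FUNCTION_NAME)) {
      throw new Error(`Recursive invocation detected for request ${requestId}`);
    }

    // ... do the real work here ...

    // Pass the request ID and the updated chain along with each call.
    await lambdaClient.send(new InvokeCommand({
      FunctionName: "downstream-function", // hypothetical target
      Payload: Buffer.from(JSON.stringify({
        requestId,
        callChain: [...callChain, FUNCTION_NAME],
        payload: event.payload,
      })),
    }));
  };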

The good news is that AWS is aware of this and that their alarms are delayed, and will almost always waive the fees for you if you ask.


Open a support ticket with them and they will likely forgive this charge as a one-time gratuity


I’m not sure if Lambda@Edge supports reserved concurrency like standard Lambdas? For standard Lambdas I always set a low reserved concurrency to prevent these situations, deliberately throttling my max executions. Start with the lowest number and if you frequently see throttling you can then increase it.
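
For a regular Lambda that's basically a one-liner in most IaC tools; e.g. a CDK sketch inside a cdk.Stack constructor (the cap of 5 is just an example):

  import * as lambda from "aws-cdk-lib/aws-lambda";

  new lambda.Function(this, "ApiFn", {
    runtime: lambda.Runtime.NODEJS_16_X,
    handler: "index.handler",
    code: lambda.Code.fromAsset("dist"),
    // Hard cap: at most 5 concurrent executions; anything beyond that is
    // throttled instead of scaling (and billing) without limit.
    reservedConcurrentExecutions: 5,
  });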

For a three-person startup… I find Lambdas/Serverless/Cloudflare Workers fantastic if you are clever with how you architect things; they can really bring the costs down to trivial amounts. However, you need to be defensive (i.e. reserved concurrency etc.), and being clever normally means lots of extra steps/hoops to work around restrictions or use things ingeniously. It's fun if you like that, but if we are honest with ourselves, it's hard to beat the simplicity of a VPS and a full-stack framework like Rails (or whatever alternative in your language of preference) when just starting out, then break specific functionality out into other architectures as required.


Self-host and stop using the cloud, especially when you clearly don't know what you're doing.

> Now I am waiting on a response from AWS Support on these charges; maybe they can help me waive part of that.

Honestly why should they? They're very clear about their billing policy and pricing, the only reason they might is good PR/karma from yc posting...


> the only reason they might is good PR/karma from yc posting

Not really; they've been known to do it before. I personally got a much smaller charge (~$12) waived just by asking nicely.


That's directly akin to getting the good stuff until you're hooked...

Throw away the account and rent a bare-metal server (or even a self-hosted VM if that's too expensive), and you have far more freedom to learn things and do them properly.


I deleted the account right after I got that waived.

And I do have a server (or well two, lol)


First, based on my experience, they are usually able to waive this if it's an honest mistake from a small company.

Second, you can apply for credits: you can get at least $1,000 in founder credits via AWS Activate, and if you work with a VC, you can get up to $100,000.

Lastly, note that Lambda@Edge is (much) more expensive than regular Lambda. The tech stack I personally pick for any new product is plain old Lambda, serverless-express, and good old React/Vue for the frontend with static assets on S3 (with CloudFront and Route 53 for a custom domain, ACM for TLS). I'm not that familiar with NextJS, but I assume it's mostly used to support SSR, right? I would look into the tradeoffs of using it with non-edge Lambda vs. ditching SSR and using Lambda for API calls only.
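
For reference, the Lambda side of that stack is roughly this small (assuming the @vendia/serverless-express package; the route is just an example):

  import express from "express";
  import serverlessExpress from "@vendia/serverless-express";

  const app = express();
  app.get("/api/hello", (_req, res) => res.json({ ok: true }));

  // API Gateway (fronted by CloudFront) invokes this handler; the same
  // Express app can also be run locally with app.listen() in development.
  export const handler = serverlessExpress({ app });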


I keep my fingers crossed!

We also applied for AWS Activate some time ago, as a member of Y Combinator Startup School.

I would really like to deploy to regular Lambdas; they are regional and easier to monitor, but unfortunately there is no support for that in serverless-nextjs right now.

The good thing about it is that it's almost automatic: static frontend, SSR, and API are deployed appropriately and work perfectly, without too much fuss.

I try to balance between bringing customer value and doing infrastructure work, and now it is clear I should have spent more time on making a better architecture.


So the whole point of AWS is that you can scale up on demand. There are no capital expenditures, so you can scale up crazy fast. More scale, more bills. Bills on demand.

Now, there's two sides to the second part of this. The following is a tiny bite of the fruit of the Tree of the Knowledge of Good and Evil.

The good is that Amazon has a policy of leniency. So writing them personally, explaining, and asking will probably work; for publicly viewable problems I've seen them come through, and for me personally too. Now, I've also held a bad impression of them because they sold me some counterfeits. But one of the items I was sure was a counterfeit, and which they ended up forfeiting after a whole crazy back-and-forth by mail and phone, turned out to be genuine when compared against the genuine item from the maker itself. I could never distinguish the items in any way other than context.

But the ill part, and I considered posing this the other way, ill before good, but in this case it's good before ill, is that they actually have to make a profit at some point. Now they make a lot of profit, but this doesn't change that, the basic thing in business is that it's not just granting, it's charging for what you grant. Both. It's not only about survival, it's also about integrity, because if you never charge you starve, meaning in practice you debase your values until you can eat.

Which brings me back to the main point, which is this: they used real energy they had to really pay for, and real hardware that depreciated, and you got the code wrong. Now Amazon I've heard doesn't have training wheels, you want training wheels. Like if you play with assembly, you can fuck up your computer, that's $2000, and since here you don't know what you're doing, they will tell you to like replace the logic board, do this whole thing, computer repairmen for sure screw people.

Just like riding a bike, you ride with training wheels until you go through the pain of learning to ride for real, and you fall, and you scrape your knees, again and again until you finally ride for real. And you keep falling off your bike indefinitely, just less.


Totally agree. I think about it compared to the cost of a semester of useless college classes and the guaranteed loss of thousands of dollars.

I opened a separate bank account with $2k in it; that is the cost of learning cloud computing, and it's there for peace of mind if I do something crazy.

That said, I never leave anything running when I log out. No way am I ready to run a Lambda function in production.


Now that my bill was finalized for June 2022, I finally got a one-time refund from AWS.

Here is a follow-up story about this and how I decided to change my cloud stack to avoid this in the future: https://medium.com/@ruslanfg/aws-refunded-after-i-ddosed-mys...

Based on comments here, I decided to go with AWS ECS in the future.


I very strongly believe AWS needs dead simple, dummy-proof budget controls. 15 years in and it still does not have them.

And by simple I mean this: “when this service hits my daily/weekly/monthly budget cap of $x, STOP IT AUTOMATICALLY.”

Individuals, freelancers, hobbyists, and small startups need this. When I started in this industry and wrote an endless loop, I'd hang the server. Now you hang your future.


Try filing a support ticket with them with this information. I had something vaguely similar happen on GCP and they refunded the full amount


Hi, author here!

Thanks, already done that, and waiting for it to be reviewed by AWS. Support was very responsive, though.


Make sure to set up budgets too! Support will gladly help you with that since it helps prevent this situation in the future :)
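
If your infra is already in CDK, the budget itself can live there too. A rough sketch with the L1 construct (the prop names mirror the CloudFormation AWS::Budgets::Budget resource as far as I recall; the amounts and the address are placeholders):

  import * as budgets from "aws-cdk-lib/aws-budgets";

  new budgets.CfnBudget(this, "MonthlyBudget", {
    budget: {
      budgetName: "monthly-cap",
      budgetType: "COST",
      timeUnit: "MONTHLY",
      budgetLimit: { amount: 300, unit: "USD" },
    },
    notificationsWithSubscribers: [{
      notification: {
        notificationType: "ACTUAL",
        comparisonOperator: "GREATER_THAN",
        threshold: 80, // percent of the limit
      },
      subscribers: [{ subscriptionType: "EMAIL", address: "me@example.com" }],
    }],
  });

Keep in mind this only notifies; it won't stop anything by itself.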


Good. Now you have learnt a great lesson, and you also realize that production mistakes can cost quite a lot in a short amount of time.


Consider it an expensive lesson on not giving your credit card to those scammers.

AWS's model is incredibly predatory. All the companies I've seen move to AWS ended up spending way more than before for much less gained. And they still needed an ops team, just a more expensive one than the sysadmins they had before to manage actual servers.

I'll stick to VPS


I stumbled upon this post the other day, maybe you'll find it helpful: https://www.lastweekinaws.com/blog/an-aws-free-tier-bill-sho...


Speaking from my own experience, with a much, much larger bill: AWS support will most likely ask you to set up the CloudWatch alarms and billing alerts properly, and to provide a plan to mitigate further issues. Once you've done that, they will be more inclined to reduce the charges.


If you contact support via your Amazon account and explain your error they will often remove some (but usually not all) of the bill.

Sorry that happened, always one of the scarier parts of using AWS. This sounds like an especially tricky one with the standard billing alerts not even catching it.


Thank you!

I already contacted them; my past experience with AWS support was pleasant. It's just that this delayed CloudFront billing should be better clarified in the docs, I suppose.


For the future, I’m not sure if it suits you but I find you can get a far easier, cheaper, and more predictable dev and deploy experience without using something like CloudFront.

I once moved a project from AWS to Digitalocean for a small team (AWS was just all they knew, so it was what they used) and I was able to cut the monthly bill down quite a bit.

It isn’t that DO is inherently cheaper or better. It’s just dead simple so it’s easy to deploy with only what you actually need, with easy limits and visibility on what gets spun up. In some cases it’s arguably less cost efficient, but it’s really hard to mess up.

For the team I was supporting, simply having the visibility and a simple tool was worth a lot in saved time. They previously spent way too much time on AWS, and couldn’t even get the right infrastructure with the time they invested.

So, maybe something worth considering at least. Good luck with the bill! You’re certainly not alone (I got an $800 charge for a db I forgot to kill a few years ago).


Given that you thought you were doing your due diligence and had set up billing alerts, but their billing alerts are incomplete -- they should be on the hook to give you a one-time pass on that particular failure mode.


This is the exact reason I fear AWS. How should I go about learning it when mistakes like this can basically ruin me for the next few months? Not sure if there are safe resources or free sandboxes somewhere.


Last weekend I went on a code marathon doing a bunch of CDK AWS stuff.

As soon as I saw this headline, I was completely shocked and dropped everything to check my usage. Luckily I had run 'cdk destroy' on the big projects, but I had dozens and dozens of Lambdas and S3 buckets it failed to clean up. I spent probably an hour clicking through the web interface deleting them, in fear of what happened to OP.

I think I'll just stick to LocalStack and a VPS... Actually, my datacenter friend said I could bring my deep learning station up to his DC, so I might just abandon the AWS idea.


AWS gives you free resources to start with. They just aren't capped, so you could bankrupt yourself. They also seem to be good about fixing honest mistakes, but the process is scary and slow.


You might want to try our CloudMarshal App. You can use it on a free trial here: https://app.cloudmarshal.co/


You could just as easily have a "standard" setup recursively calling itself, and scaling up multiple instances as a result. This isn't an issue specific to serverless.

AWS will hopefully issue a refund.


It is not easy to catch self-calling functions with static analysis :(


You might want to try CloudMarshal - you can trial it for free here: https://app.cloudmarshal.co/


> Would you recommend to use to build a new product if you are bootstrapped, 3-person startup?

No. A DigitalOcean droplet + scp is all you need. Hell, even SFTP would work if you are fewer than 10 people.


It's a website. No matter what you call it, the end result is it's a website. Some HTML and a gif or two. Maybe a bit of javascript.

Why are you making it so ludicrously complicated?


The cloud is way over-used and over-relied upon. Most companies could run on a cluster of Raspberry Pis or an equivalent single computer before they need to worry about the cloud.


With all these stories about unexpected bills from cloud providers, are there cloud providers offering services with pre-paid credit instead of ex-post invoicing?


I feel like there is a cautionary tale once a day of some absolute dumbass getting reamed by AWS.


I once made a lot of web requests with Lambda in a VPC through a NAT. It cost me a similar amount...


Maybe doing recursive calls on Lambda was not the best course of action?

I've been on teams building dozens of Lambda-based apps without issues; Lambda itself is the smallest item on the bill. API Gateway is multiple times more expensive than the compute (Lambda) fees.


At my wife's company they had a similar issue with a $25k bill. Ouch...


We're looking at moving to a NextJS frontend setup soon, and maybe Lambda for SSR... Would Netlify be a good option to avoid situations like this and keep the hosting bill down? Netlify CMS has really caught our eye after chatting with the Plausible.io folks.


Wow, cloud. The future!!

I bet you're glad you gave AWS that money and didn't just invest it in your own hardware.

Serverless is just somebody else's servers.


Yeah, lesson learned.

Don't ever use AWS unless you have a limited card (e.g. Privacy.com) and are willing to burn that bridge.


Just RTFM next time.



