Intro to Node on AWS Lambda for S3 and Kinesis

icpmacdo · on May 11, 2015

Is there a way to automatically kill Lambda if the cost is going out of control because of an infinite loop? As a solo dev I can't afford to risk losing hundreds or thousands of dollars because of a bad piece of code I wrote. If that's possible, it seems like a nice service that will save a lot of time by not having to worry about server administration.

My Ask HN post from earlier today, basically asking the same thing: https://news.ycombinator.com/item?id=9520964

nicksergeant · on May 11, 2015

As far as I know, there's no way you can automatically kill it. Of course, you can set up billing / execution alerts, which would probably be very helpful in at least notifying you that there's a problem (I personally have an alert set up to notify me if my estimated charges are anything out of the ordinary).

A small anecdote: Lambda processes are _very_ cheap most of the time. When I was writing the initial version of this starter module, I accidentally threw the lambda into an infinite loop. It was triggered about 30k times within a minute or so and ended up costing around 13 cents if I remember correctly.

icpmacdo · on May 11, 2015

So it seems like Lambda is too risky an option because of the potential for catastrophic failure. It would probably make more sense to try to work with something like Elastic Beanstalk or a lone EC2 instance, but I'm trying to avoid a large amount of time being taken up by server administration. As I mentioned in the other post, a new hobby tier Heroku Dyno and DB at around $19 a month is within my budget if they make launching a Node backend for my app significantly easier. I know this is getting off topic from the OP, but any guidance from HN would be appreciated.

rattray · on May 11, 2015

How is that your conclusion? If you set up proper monitoring, you'll be notified and able to turn things off by the time it has cost you $10. I would hardly call that catastrophic...

icpmacdo · on May 11, 2015

If I missed the notification that an infinite loop was costing 13 cents a minute, by the end of the day I would be out almost $190. I know the rationale Amazon has is that you would want to scale your service if you were adding hundreds of people in a single day, but that's a lot less likely than an error in my code flushing a relatively considerable amount of money down the toilet. If that was not a risk I would be using Lambda; it otherwise seems like my best option. Backend is not my forte though, so if the point seems crazy I'm willing to consider any advice.
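As a rough check of those numbers, here is the arithmetic written out. It assumes the single data point from the anecdote above (about 13 cents for roughly one minute of runaway executions) holds as a constant rate, which is an assumption and not a pricing formula; the function names are made up for illustration.

```javascript
// Figure taken from this thread: roughly 13 cents per minute of runaway executions.
// Treating it as a constant rate is an assumption.
const CENTS_PER_MINUTE = 13;

// Cost in dollars if the loop runs unnoticed for the given number of minutes.
function runawayCostDollars(minutes) {
  return (CENTS_PER_MINUTE * minutes) / 100;
}

// Minutes until a given dollar budget is spent at that rate.
function minutesUntilBudget(dollars) {
  return (dollars * 100) / CENTS_PER_MINUTE;
}

console.log(runawayCostDollars(24 * 60).toFixed(2)); // 187.20 (a full day)
console.log(minutesUntilBudget(10).toFixed(0));      // 77 (minutes to reach $10)
```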

mryan · on May 11, 2015

AWS will send billing alert notifications via SNS. Theoretically, you could write a script to listen to the alert queue and shut down processing if you breach a pre-defined threshold.

Of course, you then need to make sure your billing script successfully terminates any processes that are costing money.
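A minimal sketch of that idea, assuming the notification arrives in the shape SNS uses when it invokes a function (`Records[].Sns.Message` holding the CloudWatch alarm JSON with `AlarmName` and `NewStateValue`). The `shutDown` callback is hypothetical: it stands in for whatever actually disables the thing that is costing money, and this sketch does not implement it.

```javascript
// Return the names of alarms that are in the ALARM state in an SNS-delivered event.
function alarmsFiring(event) {
  return (event.Records || [])
    .map(function (record) { return JSON.parse(record.Sns.Message); })
    .filter(function (message) { return message.NewStateValue === 'ALARM'; })
    .map(function (message) { return message.AlarmName; });
}

// Call the (hypothetical) shutDown(alarmNames, cb) only when an alarm is firing.
function handleNotification(event, shutDown, done) {
  const firing = alarmsFiring(event);
  if (firing.length === 0) {
    return done(null, 'nothing to do');
  }
  shutDown(firing, function (err) {
    done(err, err ? undefined : 'shut down after: ' + firing.join(', '));
  });
}
```

The second paragraph above still applies: if `shutDown` fails, `done` receives the error, and something has to notice it.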

I'd love to see AWS implement hard cost limits for services like this. e.g. specify you want to spend a maximum of $10/day, and once you reach that limit your service is temporarily suspended.

rattray · on May 11, 2015

Fair. A response time of 1h15m really isn't something one can reasonably rely on =) I should have thought that one through a bit more.

untog · on May 11, 2015

IMO Lambda is too risky an option because it's complete vendor lock-in. But I love the concept so much.

kondro · on May 11, 2015

Lambda functions can execute for a maximum of 60 seconds.

nicksergeant · on May 11, 2015

But, if you have more than one Lambda working on a single S3 bucket, for example, it's possible to trigger executions infinitely.

Edit: my mistake - you still can't have multiple lambdas operating on a single S3 bucket. I had triggered an infinite loop by manipulating files in the same bucket that it was triggered on (triggering another execution, infinitely).

helper · on May 11, 2015

Last time I checked, you could only have one Lambda function attached to bucket events at a time.

nicksergeant · on May 11, 2015

Yep, but if your lambda creates a new file in that bucket, it'll get called again (as an example).

Edit: realized my mistake in the wording of my comment above - _how_ I made the infinite loop had escaped me for a moment.
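One way to guard against that kind of self-triggering, sketched under the assumption that the function writes its output under a dedicated key prefix (`processed/` here is a made-up name). The handler filters out records for objects it wrote itself, so the invocation caused by its own write returns without writing again and the chain stops after one extra call.

```javascript
// Hypothetical prefix under which this function writes its own output.
const OUTPUT_PREFIX = 'processed/';

// Object keys in S3 event notifications are URL-encoded, with spaces sent as '+'.
function decodeKey(rawKey) {
  return decodeURIComponent(rawKey.replace(/\+/g, ' '));
}

// Keep only the records for objects this function did not write itself.
function recordsToProcess(event) {
  return (event.Records || []).filter(function (record) {
    return decodeKey(record.s3.object.key).indexOf(OUTPUT_PREFIX) !== 0;
  });
}
```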

res0nat0r · on May 11, 2015

I believe when you upload your code it will throw an error that your source and destination bucket are the same, which will prevent this type of looping.

gamache · on May 11, 2015

This is not the case. I've written a Lambda which quite purposefully writes to a different path in the source bucket.

I have a Lambda for processing large files; the first phase is a single Lambda splitting the file into chunks which can be processed within the 60 second limit, and then the second phase is a bunch of the same Lambdas processing those chunks.
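How the splitting is actually done isn't described here. As an illustration only, assuming fixed-size byte ranges, the first phase could compute ranges like this and hand each one to a second-phase invocation (the `header` string uses the standard HTTP `Range` format with inclusive offsets):

```javascript
// Split a file of totalBytes into inclusive byte ranges of at most chunkBytes each.
function chunkRanges(totalBytes, chunkBytes) {
  const ranges = [];
  for (let start = 0; start < totalBytes; start += chunkBytes) {
    const end = Math.min(start + chunkBytes, totalBytes) - 1;
    ranges.push({ start: start, end: end, header: 'bytes=' + start + '-' + end });
  }
  return ranges;
}

console.log(chunkRanges(10, 4).map(function (r) { return r.header; }));
// [ 'bytes=0-3', 'bytes=4-7', 'bytes=8-9' ]
```

Plain byte ranges can cut a record in half, so a real splitter would probably need to align chunks to record boundaries.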

(Disclosure: I work with nicksergeant at Localytics.)

nicksergeant · on May 11, 2015

This wasn't the case when I was uploading mine, but maybe they've done some sort of smart checking on the execution role (it'd be pretty hard to detect just by looking at the code, I would think).

dorfsmay · on May 11, 2015

Huge drawback... I wanted to use AWS Lambda for some long-running task and ran into that limitation.

bpicolo · on May 11, 2015

So use them to launch the long running task

robbles · on May 11, 2015

Since the Lambda functions are written in Node.js, you can probably take advantage of the way exceptions are handled and give your code a default timeout:

  setTimeout(function() { throw new Error('timeout'); }, 1000)

raingrove · on May 11, 2015

No. JavaScript is single-threaded, so if the code is stuck in a synchronous loop, this won't help. (Of course there are ways to make the loop yield and not be completely synchronous.)

robbles · on May 11, 2015

Ah, didn't realise this comment meant a synchronous loop and not an async deadlock. You're right, this won't help for that.
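The point in this exchange can be checked directly. The sketch below first shows that a timer cannot fire while synchronous code is spinning, then shows one cooperative alternative in which the loop checks the clock itself. `runWithDeadline` is a made-up helper name, not part of Node or Lambda.

```javascript
// 1. A timer scheduled for "now" cannot run while synchronous code holds the thread.
let timerFired = false;
setTimeout(function () { timerFired = true; }, 0);

const spinStart = Date.now();
while (Date.now() - spinStart < 20) { /* busy-wait for about 20 ms */ }
console.log(timerFired); // false: the callback only runs after this script yields

// 2. Cooperative alternative: the loop itself enforces a deadline.
// step() does one unit of work and returns true while there is more to do.
function runWithDeadline(step, maxMs) {
  const deadline = Date.now() + maxMs;
  let iterations = 0;
  while (step()) {
    iterations += 1;
    if (Date.now() > deadline) {
      throw new Error('timeout');
    }
  }
  return iterations;
}
```

A loop that never returns from a single `step()` call would still hang, so this only helps when the work is broken into small steps.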

kennu · on May 11, 2015

I wish Lambda could listen to arbitrary TCP ports and respond e.g. to HTTP requests or MQTT messages. In the current form it seems pretty limited for any scenarios where generic clients initiate the requests, such as collecting device data and metrics. As I understand, currently the clients have to speak AWS-specific protocols to submit events / data.

justincormack · on May 11, 2015

The impression I get is that it is not really realtime. I am guessing all requests get queued and then scheduled onto a suitable host soon, but perhaps not within the time needed to respond to a TCP request.

(Although custom events http://docs.aws.amazon.com/lambda/latest/dg/walkthrough-cust... suggest it's pretty close to real time; should measure the latency).

cddotdotslash · on May 11, 2015

I really wish AWS would open up Lambda to more events than just S3 uploads and SQS / Dynamo queues. There are so many workloads I could move to Lambda if I could just invoke it via an API call.

cherioo · on May 11, 2015

You mean this? http://docs.aws.amazon.com/lambda/latest/dg/walkthrough-cust...

nicksergeant · on May 11, 2015

You could just write some code to manually invoke the lambda with specific data using the SDK, no?
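Something along these lines, as a sketch. The Lambda client is passed in rather than created here, so the block does not depend on the `aws-sdk` package being installed; `invokeWithData`, the function name, and the payload in the usage comment are all placeholders.

```javascript
// lambdaClient is anything exposing invoke(params, callback), such as an
// AWS.Lambda instance from the aws-sdk package (not loaded in this sketch).
function invokeWithData(lambdaClient, functionName, data, callback) {
  const params = {
    FunctionName: functionName,
    InvocationType: 'Event',       // asynchronous: queue the event and return
    Payload: JSON.stringify(data)  // the event object the function will receive
  };
  lambdaClient.invoke(params, callback);
}

// Example wiring (assumes aws-sdk is installed and credentials are configured):
//   const AWS = require('aws-sdk');
//   invokeWithData(new AWS.Lambda({ region: 'us-east-1' }), 'my-function',
//                  { deviceId: 42 }, console.log);
```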