This is a great idea. I set up something similar for my brother's website contact form (the site is hosted on S3, and Lambda shoots him an email when someone submits their data on the contact form). The costs for running his website are just a few cents a month (above the $0.50/month charge for hosting on Route 53).
For those who work in Python, there is a project called Zappa (https://github.com/Miserlou/Zappa) that allows you to take an existing project in a framework such as Flask and "host" the whole thing with Lambda. The project is still young, but I imagine this sort of thing could be very useful to people doing a hobby project who don't want to pay for full-time hosting but want their site to be available full-time. Hopefully we get to a point soon where this sort of framework is ready for production use, though I don't know that it's there yet.
Glad you found it useful. The great thing about Lambda, I think, is the cost, which is so cheap as to be effectively free for a lot of uses (according to https://aws.amazon.com/lambda/pricing/, the first million requests each month are free). So if you can design a web service to be deployable on Lambda (via Zappa or whatever framework), you can effectively host a website for close to no cost, assuming you don't have a lot of traffic. On the other hand, you lose some of the control that comes with having your service run on an EC2 instance. For example, I like being able to SSH in to the server to troubleshoot things when they go wrong. But for a hobby project that I put together in a few hours, I might not care enough to pay for the EC2 instance.
Don't you need to use the API gateway on top of lambda? The pricing there seems like it doesn't go low enough for a lot of uses (1 million requests for $3.50 is fine, but if I only have 10,000 requests a month I am paying 100x what I need to). Or am I interpreting the pricing wrong and that 1 million requests could take 2 years to happen? Amazon pricing is way too confusing.
'The Amazon API Gateway free tier includes one million API calls received per month for up to 12 months. If you exceed this number of calls per month, you will be charged the API Gateway rates.'
Same as Lambda. So together, the first million requests are free. After a year, it looks like you'll pay ~$4 a month for Lambda (40 cents for the first million requests if you stay under 100 ms per request and 128 MB of RAM: 20 cents for the function and 20 cents for the processing), API Gateway ($3.50 for the first million calls), S3 storage (a gig is 3 cents), and traffic (I'm assuming small amounts of traffic, obviously). That becomes $4.50 if using Route 53, and $5.50 ($12 per year, paid up front, but I'm averaging it out) if you get your domain through AWS. That's not bad for site hosting + comments.
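To make the arithmetic above easier to check, here is a rough Python sketch that adds up the same figures. The rates are the ones quoted in this comment, not verified against current AWS pricing, and the function name and parameters are made up for illustration. It also assumes the per-million rates scale linearly with the number of requests.

```python
# Figures quoted in the comment above (USD per month unless noted).
# They are assumptions taken from the thread, not current AWS prices.
LAMBDA_REQUESTS_PER_MILLION = 0.20  # "20 cents for the function"
LAMBDA_COMPUTE_PER_MILLION = 0.20   # "20 cents for the processing" (100 ms, 128 MB)
API_GATEWAY_PER_MILLION = 3.50
S3_PER_GB = 0.03
ROUTE53_HOSTED_ZONE = 0.50
DOMAIN_PER_YEAR = 12.00


def monthly_cost(requests, storage_gb=1.0, route53=False, aws_domain=False):
    """Estimate the monthly bill once the free tier has expired.

    Assumes the per-million rates are charged pro rata per request.
    """
    millions = requests / 1_000_000
    total = millions * (
        LAMBDA_REQUESTS_PER_MILLION
        + LAMBDA_COMPUTE_PER_MILLION
        + API_GATEWAY_PER_MILLION
    )
    total += storage_gb * S3_PER_GB
    if route53:
        total += ROUTE53_HOSTED_ZONE
    if aws_domain:
        # The yearly domain fee is averaged over twelve months.
        total += DOMAIN_PER_YEAR / 12
    return round(total, 2)
```

With one million requests this gives about 3.93, then 4.43 with Route 53 and 5.43 with the domain as well, which lines up with the rounded ~$4, $4.50 and $5.50 above.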
I do wish they'd enable AWS Certificate Manager to be used on the API Gateway though.
For hosting a few dozen API requests a month? Yes, massively too high. Compared to the rest of it, that $3.50 is more than everything else involved combined. For that price you can just run WordPress on a VPS.
We use Lambda for our contact form; it's great. Our website is completely hosted on S3/CloudFront, so it's much cheaper and more stable than running a dedicated machine.
If I get some time this weekend, I'll clean it and push it up to github.
[EDIT]
Here it is. This has been working for us for almost a year. If there is interest, I can clean it up.
I've been considering adding comments to my static website for a while. I already deploy the website over IPFS, so that [in theory] the entire website can be distributed, and it can be duplicated/cached, etc when I have a spotty internet connection. Ideally, I would also have the comments published directly on IPFS as well, rather than having the comments be retrieved from somewhere external to the IPFS network. Your code should let me deal with that - just have the comment system fire off an email to me when a user posts a comment, and a daemon running on my PC can publish the comment next time it's online. Plus, I get easy email notifications :)
Such an approach seems a bit convoluted though. I'm not used to web programming, wherein it seems like every popular approach to hosting something as simple as a blog comes with pretty significant drawbacks or limitations. I'm much more fond of working in environments which offer an objectively "right" way to do every task.
How does it work? You POST to the lambda service, and it emails the url parameters with the specified formatting? This requires a separate HTML form to be served from your website first, right?
Yeah, I didn't really make that part clear. You have to create an API Gateway.
Luckily, AWS lets you share API definitions so I added that to the repo. It's the swagger.json file. I'll try to get this all automated to make it super simple, but for now you just have to replace all those placeholders.
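For anyone trying to picture the flow described above (the form POSTs to API Gateway, Lambda formats the fields and sends an email), here is a minimal Python sketch. It is not the code from the repo. The event shape (a URL-encoded form in `event["body"]`) and the `send_email` hook are assumptions for illustration. In a real deployment the hook would call a mail service.

```python
import json
from urllib.parse import parse_qs


def format_submission(fields):
    """Render form fields as a plain-text email body, one 'name: value' per line."""
    return "\n".join(f"{name}: {value}" for name, value in sorted(fields.items()))


def handler(event, context=None, send_email=print):
    """Lambda-style entry point: parse the POSTed form and hand it to a mailer.

    Assumption: event["body"] holds the URL-encoded form body passed through
    by API Gateway. send_email is a hypothetical hook; it defaults to print
    so the sketch runs without any AWS dependency.
    """
    parsed = parse_qs(event.get("body") or "")
    # Keep only the first value for each field name.
    fields = {name: values[0] for name, values in parsed.items()}
    if not fields:
        return {"statusCode": 400, "body": json.dumps({"error": "empty form"})}
    send_email(format_submission(fields))
    return {"statusCode": 200, "body": json.dumps({"sent": True})}
```

And yes, under these assumptions the HTML form itself is still served separately from the static site and only posts to the API endpoint.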
Love the idea. For those who aren't as Lambda savvy you might want to provide some "real world" information on pricing. Ex: This code takes roughly 500ms to run, which means you should expect to pay $0.10 per 96,000 comments on your blog.
I know it's stupidly small costs (Lambda is wonderful for that), but some people still don't understand how to translate the pricing structure to "the real world".
Looking at the CloudWatch Logs, each comment seems to take approximately (Billed Duration): 400ms for the initial request, and 800-1000ms for the worker to concatenate the request into the published JSON.
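As a rough way to translate those billed durations into dollars, here is a Python sketch. The rates are assumptions based on the Lambda list prices around the time of this thread ($0.00001667 per GB-second, $0.20 per million requests, billing in 100 ms increments), and the 128 MB memory size is a guess, since the thread doesn't say what the functions are configured with.

```python
import math

# Assumed rates (USD), based on Lambda list prices at the time of the thread.
PRICE_PER_GB_SECOND = 0.00001667
PRICE_PER_REQUEST = 0.20 / 1_000_000
BILLING_INCREMENT_MS = 100


def invocation_cost(duration_ms, memory_mb=128, include_request_fee=True):
    """Cost of one invocation, with the duration rounded up to the billing increment."""
    billed_ms = math.ceil(duration_ms / BILLING_INCREMENT_MS) * BILLING_INCREMENT_MS
    gb_seconds = (memory_mb / 1024) * (billed_ms / 1000)
    cost = gb_seconds * PRICE_PER_GB_SECOND
    if include_request_fee:
        cost += PRICE_PER_REQUEST
    return cost


# Compute-only cost of 96,000 runs at 500 ms each (the example given earlier).
compute_only = 96_000 * invocation_cost(500, include_request_fee=False)

# One comment: a 400 ms initial request plus a worker run of up to 1000 ms.
per_comment = invocation_cost(400) + invocation_cost(1000)
```

With these assumed rates, `compute_only` comes out at roughly $0.10, matching the "$0.10 per 96,000 comments" example, and `per_comment` is a few millionths of a dollar.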
Lambda doesn't destroy the instance at the end of the response. It keeps it running just in case other requests arrive. If that happens, you might want GC, or a worker will eventually run out of memory.
What could happen with Lambda and similar services is that costs will bring back into the spotlight "old" languages that run with little memory and few CPU cycles. I'm even thinking about C. I'm not into Go, but that looks like another candidate.
Scenarios: those ubiquitous web services running on a single VM because it's more than enough. The customer has to pay for it all the time and could save money if the app can run on Lambda. Enough to pay more in developer time?
What surprised me with Lambda is that it 'freezes' the VM your code is running in, and 'unfreezes' it when more requests come in. Very clever - they can store a lot of frozen VMs on an SSD. It surprised me when I discovered that a cache I had written in my code was already populated!
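The already-populated cache is easy to reproduce in miniature. Here is a small Python sketch, assuming that module-level state survives between invocations while the container stays warm. `expensive_lookup` is a made-up stand-in for whatever slow call is being cached.

```python
# Module-level state is created once per container, not once per request,
# so it is still there when a frozen container is thawed for the next call.
_cache = {}


def expensive_lookup(key):
    """Hypothetical stand-in for a slow call such as a network request."""
    return key.upper()


def handler(event, context=None):
    """Lambda-style entry point that reuses cached results while the container is warm."""
    key = event["key"]
    cache_hit = key in _cache
    if not cache_hit:
        _cache[key] = expensive_lookup(key)
    return {"value": _cache[key], "cache_hit": cache_hit}
```

Calling `handler` twice with the same key in one process reports a miss and then a hit. The flip side is that a cache like this is never emptied, so it can grow for as long as the container is reused.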
I think most code running on Lambda is not going to be CPU bound. In lambda-comments' case it's mostly going to be idling as it waits for TCP/IP traffic to/from Akismet, S3, DynamoDB, Slack notification webhooks, etc.
Have you been able to quantify the exact pricing per month?
From the looks of the git repo it's at least 65 cents per month but then there's a number of other services with no price reference.
What would you estimate it would cost per month to host a blog that gets 50 comments per month, and what would the cost look like at the end of the year, given that costs would rise over time due to needing to store more info?
For argument's sake, let's assume no AWS free tier.
I don't have "real life" numbers yet, since I just set it up yesterday. I've used up my free tier, so I should be able to get real numbers from Cost Explorer. I'm not sure if everything is tagged nicely for that.
On paper, it should cost under a dollar per month. The main cost is just the DynamoDB instance. I don't think it's going to consume much S3 storage.
I guess if your blog has a lot of traffic, the network transfer might add up -- but it's just little JSON files.
The numbers might be skewed upwards a bit because somebody posted a gigantic blob for one of the comments to test if I had implemented a limit on the length (I hadn't).
Thanks for the solid numbers based on real data. At the time of this posting there are 103 comments too, which IMO seems waaaaay high given 8k visitors, due to the nature of the post.
Hard to say for sure but I wouldn't be surprised if a normal blog received 5-6 comments per 10,000 views.
Great work! The fact that data is stored in my account (to be read: under my control) instead of outsourcing to 3rd parties is a huge plus.
If you can make it less "technical" and more "plug-and-play", I think this will be a cool, viable solution to integrate (to be read: for customers like ours, who use serverless computing in enterprise apps -> https://github.com/MitocGroup/deep-framework).
The statement on your website "On some level, the era of personal websites is over" made me sad and nostalgic. I'll keep it old-school by maintaining my own personal website.
I don't think that's true. It might at least be accurate to say there's an end to open comment sections and other abuse magnets.
Anybody can still make a site in a few minutes on a number of places, they just aren't going to solve the abuse problem themselves without using a hardened tool.
I just finished setting up my personal site with Hugo+S3+Lambda yesterday. Lambda runs Hugo when the S3 bucket detects a file change, takes about 5000 ms per blog post which is fractions of a penny.
Although this adds a bit of complexity compared to Disqus it sounds like a fun solution to implement, potentially with the addition of a way to moderate comments.
I think the serverless(ish) space is headed in an exciting direction overall, especially if we can drastically reduce wasted compute time. I'm watching your repo and hopefully will get inspired! Thanks for the reply Jim
Did you follow a tutorial by chance? Moved one of my sites over to GitHub and Hugo recently and the ease of Hugo has made me seriously consider moving as much as I can of my other stuff.
Of course there are servers out in the cloud somewhere. But you're not renting them 24/7.
I'd love to experiment with a completely server-free P2P system, perhaps using WebRTC and WebTorrent (although in that case, there are still some servers involved).
"Serverless" is the trendy term that the community is using to describe this particular architecture.
Describing it extremely poorly, quite frankly. Also - this project is not just consuming resources from the totally not server based resources it inherently relies on, a key component of it runs on those totally not servers.
After reading a little on the pages you linked to, "Server-less" is just a 'cute' way to say "our app is vendor-locked to AWS and their Lambda/API gateway/hosted DB/hosted storage/etc services"
> Of course there are servers out in the cloud somewhere.
Trying to explain a trendy misused term with another trendy misused term doesn't help your case.
Have you ever heard the phrase "there is no cloud, it's just someone else's computer"?
> But you're not renting them 24/7.
So, any application where the server-side component is scaled up-and-down dynamically, based on some kind of workload based schedule, is server-less?
Yes, it's a service that uses its servers to run your code. Just like how most of us have somebody else run our email for us. There is still a mail server sitting out there somewhere masquerading for our domains, but we don't run mail servers. Now there is a service that is running my node code for me.
Well, it's the first time I've come across this usage of the term. It's pretty misleading when there are serverless technologies around, e.g. P2P networks which don't make a client/server distinction, JS applications which can run from a local HTML file, etc.
Okay, let's try another analogy. Let's say I don't own a car. Or lease. Or rent. Or borrow. That makes me "carless", right?
Moving on: I only take taxis or Lyft or whatever to get to my destinations. That would still make me "carless", right? I don't manage the fleet of cars on call for me. I don't have to worry about their maintenance. I am not responsible for a car, so I am carless.
Claiming that the comments are "serverless" is the equivalent of claiming that the ride is "carless".
You, the person being driven around, can be "carless" and you, as a site owner, can be "serverless". For the ride to be "carless", it would have to not involve a car. For the comments to be "serverless", it would have to not involve a server. This system involves a ton of different servers, so it can't be "serverless".
"I don't have to deal with managing servers because I'm paying for a service which abstracts the underlying servers" has been the value proposition of PaaS systems from day one. It doesn't make them "serverless", which is a silly new empty buzzword.
Why do you feel the need for yet another marketing buzzword? IMHO saying that makes one sound like someone who doesn't know what a server is, but maybe it's just me. Just because you're not managing a server doesn't mean it doesn't exist. It obviously does: the code isn't running on each client but remotely.
> Just because you're not managing a server doesn't mean it doesn't exist
Wireless internet is still connected to the rest of the internet via a wire somewhere; the obvious point being made is that your phone or laptop doesn't need a wire. Serverless here doesn't mean that servers no longer exist, but that you are not working at the server level to accomplish what you're trying to do.
Wireless is accurate for handsets, since you're not using a wire to exchange data, the communication is wireless.
Serverless is totally inaccurate for scripts uploaded ON a server, since they are executed on a server. The scripts are executed, and their results are potentially served, from A SERVER. The fact that you don't manage it yourself doesn't change that.
That's a horrible example to use to try to make your point; it just doesn't work.
Understood, but that means that Rails has been serverless almost forever because of Heroku. The console is simpler than the one of AWS and you deploy with git push. You didn't have to worry about the db because you only had PostgreSQL for a long time. And you could buy add on services to do almost everything. It just happened before the buzzword was born.
> Wireless internet is still connected to the rest of the internet via a wire somewhere
Wireless refers to a specific component of the connection - the connection between your device and the relevant wireless base station.
No one technical is claiming "wireless internet" - and if they are, they're as guilty as the server-less crowd - they're claiming a wireless local area network (wifi), or wireless wide area network (3g/4g/WiMax, etc) - they're not inherently "the internet" they're just a radio based method for transferring packets of data.
Here, you have developers literally uploading the code they've written ( the javascript code that runs on AWS' 'lambda' service ), configuring 5 separate rented services and then claiming "look ma, no server".
We don't choose the buzzwords - the market (or marketing people) do so. The word has already been chosen - "serverless". I agree there could have been a better word that didn't clash with our knowledge that there are still servers involved.
After someone posted links, I went and did 30 seconds of cursory reading.
The three referenced pages are:
Server-less conf - a conference about building apps on AWS lambda
Serverless blog - a blog about building apps on AWS lambda
Serverless framework - a framework for building apps on AWS lambda.
Every single "server-less" thing I can find, comes back to "using AWS Lambda". Which then makes me wonder why the need for a buzzword like "Server-less" - it's not like it's trying to describe a range of approaches and technologies (like say Web 2.0 was doing).
Ohhh of course. How stupid of me. It's deliberately vague and non-specific. You can sell a client on "we use a server-less architecture" because they think it means literally that - no servers, just computer guy magic. Is it as easy to sell a client on "we built the app to rely completely on 5 different services from a single company that has a history of anti-competitive behaviour"? I doubt it.
I see it all the time now when referring to Lambda or Azure Functions or Google Cloud Functions. I'd claim, in fact, that there is no other one buzzword which is currently used.
It's a straw man to assume that "serverless" refers to physical servers. The argument is triggering my irony detector, since that usage is itself a marketing category so broad as to be meaningless. Only in the client-server model does "server" have a definition worth reasoning about, where "server" refers to the passive listening endpoint in that model.
Lambda's published reference architecture is asynchronous invocation via queues listening to pub/sub events, which in integration terms is the antithesis of the client-server model. There is no listening server in the client-server sense.
So whilst the term "serverless" has obviously been chosen as a catchy marketing term for the mouthful that is "asynchronous event-driven self-scaling platform-as-a-service", it's not wrong.
> Just because you're not managing a server doesn't mean it doesn't exist
Of course that's true, but "serverless" implies that you don't have to care about the "server" part of your application -- that is the stuff that isn't related to solving your problem and you probably don't care about anyway.
I like the term as it embodies what PaaS is supposed to be about: not paying a set fee per month for a server that you only use a couple of hours here and there, but rather paying on a usage basis.
"Shared" web hosting services haven't needed patching or maintenance from the developer since the mid 90s. Is that serverless?
Hell, think about companies where infra is managed by a dedicated team of skilled experts. The developers there aren't patching or maintaining the servers.
Lambda is great. A few friends and I just put up what we call LDAN (Lambda, DynamoDB, Aurelia, and Node), aka Lieutenant Dan. It's a pretty cool, refreshing change that is in stark contrast to our larger EC2+MongoDB stack.
I went for the demo and I stopped right away when it asked me to login with Google or Facebook. I went to the repo and it's very light on documentation. At least I learnt about aurelia.io
One step closer to my dream of "Wordpress on Lambda"!
What other projects exist to run a blog on Lambda? Apart from going full in on a locally-compiled static site that I just upload to S3, that is? (Hugo, Hexo, Jekyll, etc) I'm imagining a full admin web interface that uses a Lambda-driven API, themes, a front page with dynamically generated content from the Lambda-driven API, etc.
Zappa is pretty cool, I haven't had a chance to use it myself but want to find a project to apply django-zappa ( https://github.com/Miserlou/django-zappa ) on django-cms or wagtail... not too sure how that'd work out though.
Zappa author here - Wagtail works great on Django-Zappa. I have also been slowly writing my own CMS, https://github.com/miserlou/zappa-cms, although that's not ready for public consumption yet!
As you describe the goals and features of your system, I think the AWS solution is too complicated, even overkill, for what you are trying to achieve. I use AWS extensively for the 'big projects', and I also use it as a CDN for all projects, but for the small projects (in terms of complexity and capacity), I find it much simpler to implement a two-tier service. For example, for comments (form submission in general), it is much simpler to write a Node app, host it on a free tier, and persist the data on a free tier as well (for example Heroku+Firebase). Parse.com was great because you could do it all in one tier, since they offered server-side logic.
Granted, if you ever pass the free tier limits, the price breaks of these PaaS providers are normally high, so cost does not scale 'by the cent' like AWS; however, if you pass the free tier you must be getting good traffic, which should translate to good income, so infra cost shouldn't be a problem.
One of my motivations for doing the project was to just learn how to provision a bunch of AWS services and make it reproducible.
I might write a little shim so you can also just run the API as a standard Node.js app. At the end of the day it's just a JavaScript bundle and a single REST API method.
Even with a service like Heroku, a choice needs to be made about what service to use to store the data (e.g. Postgres). Heroku's free tier is almost like a coarse-grained Lambda because the VMs spin down after a period of inactivity.
I'm actually really excited about https://zeit.co/ - check that out.
In the long run, the hosting cost for this type of application should be pretty close to zero. People will choose their cloud/hosting solution based on other factors.
Downvote all you want but IMO this is a Rube Goldberg machine that uses 5!!! AWS services to persist what? 5k comments for the entire lifecycle of the website?
The modern programmer is quite good at playing Legos with the myriad of micro webservices out there (which is good play/training) but has forgotten how to think like an 'engineer' must think.
Any working solution actually has thousands of components.
I'd defend this particular Rube Goldberg machine by pointing out that 99% of the time, none of my code is running on the server side - it's all Amazon S3. The only time my code runs in the cloud is when a comment gets submitted.
I'd argue that it's pretty easy to reason about. I've got some more blog posts coming. :-)
Don't get me wrong, what you've done is how we will build web stack solutions from now on: orchestrated microservices. My hopefully constructive critique is that as engineers we ought to find the best solution for a given problem, even at the expense of such a solution not being universal. Take the myriad of tiny to medium websites that use WP: it is like renting an 18-wheeler to move one tiny box. The correctly engineered solution would be a static site. Now, for a site with just a handful of posts and comments, we should think of a much simpler solution for comment persistence.
In fact, after reading about your software I am inspired to think and try to come up with such a solution, let's see :)
Very interesting! I'd like to migrate off Disqus, but I'd like to not leave my old comments orphaned. Do people know of any export services? (This may be an ill-posed question since comments arguably belong to their authors..)
Even smaller established websites get a ton of comment spam these days. I'd not even consider setting up any commenting system that doesn't have robust spam protection.
I have a little bit of code similar to yours for comments, I just wrote some glue to require Google or Facebook authentication through AWS Cognito to make comments. Some people will hate that, but it worked.
I've already had to edit the JSON file on S3 to remove/edit a couple of comments. That was pretty easy to do from the command-line, but I think most people will want a web interface.
Pricing for AWS Lambda is complicated, but reading the examples clarifies things... a bit.
Is the Lambda free tier something separate from the free tier you get on first signing up for AWS? I used up my AWS EC2 free tier years ago and I don't think you are eligible again once you use your free tier. I wish they'd put regular pricing at the top instead, i.e. for someone who doesn't qualify for the free tier you start paying from request #1.
I wish there was a way to get stuff like this to work that didn't take an all-or-nothing approach to noscript-filtering. As it is now, I have the option of no comments, or allowing scripts from "s3-us-west-2.amazonaws.com", which almost might as well be "anywhere and everywhere". FWIW I see a similar problem with CloudFront and CDNs in general.
Yep, I started it before that. I'd love to lessen the dependency on Babel/Webpack as much as possible.
I'm still using async/await, ES6 modules and some other more experimental ES-next features, so I think it will be a while before Babel/Webpack can be completely removed. Of course, I don't really have to use those features. :-)
If the babel/webpack builds happen before deployment to Lambda, I'm not sure it really matters at this point... once we see Node > 7 or so drop, with ES6 module support, it might then be time to consider dropping babel, but it's going to be around a while, and still nicer in a lot of ways than alternatives.
I'm not really using DynamoDB to store data. I'm using it to sequence parallel requests going into the 'worker' that concatenates things together -- so two simultaneous incoming comments don't clobber each other.
At the moment, I think only DynamoDB Streams and Kinesis event sources will batch events together and deliver them to a single function (instead of many parallel functions). It's hard to understand just by reading the docs.
I think the platform is evolving quickly, so I wouldn't be surprised if they come up with some more options in the future. I'm using DynamoDB streams since it's cheaper than Kinesis.
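To illustrate why the batching matters, here is a simplified Python sketch of the clobbering problem. It is not the project's actual worker. `apply_batch` and the record shape (`id`, `text`) are made up for illustration, and the published JSON is just a string in memory rather than a file on S3.

```python
import json


def apply_batch(published_json, records):
    """Fold a batch of new comments into the published JSON document in one pass."""
    comments = json.loads(published_json) if published_json else []
    seen = {comment["id"] for comment in comments}
    for record in records:
        # Skip duplicates in case a record is delivered more than once.
        if record["id"] not in seen:
            comments.append(record)
            seen.add(record["id"])
    return json.dumps(comments)


snapshot = "[]"
first = {"id": "1", "text": "first"}
second = {"id": "2", "text": "second"}

# Two parallel workers each start from the same snapshot. Whichever write
# lands last replaces the other, so one comment is lost.
worker_a = apply_batch(snapshot, [first])
worker_b = apply_batch(snapshot, [second])

# A single worker that receives both records as one batch keeps them both.
merged = apply_batch(snapshot, [first, second])
```

Here `worker_b` contains only the second comment, while `merged` contains both, which is the point of funnelling everything through one sequenced worker.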
The problem with DynamoDB is it has minimum provisioned throughput requirements that would make this rather expensive for a blog that was not free-tier eligible even if it got no traffic at all.
In contrast, SimpleDB seems to only charge for actual throughput used.
You could have scheduled events -> lambda poll SQS.
Though now I wonder why not just do what you need to do right from API->lambda instead of intermediate batch.
The Concurrent Executions section of this page suggests that anything other than Kinesis and DynamoDB Streams will cause parallel work, which is not what I want:
I think it might be possible to embed lambda-comments into an iframe that could then appear on GitHub README.md files, etc. It's just an untested theory for now. ;-)
In the blog post, I do link to some projects you can run on your own server. Actually, it wouldn't take much work to modify the code in this project to run in a Node.js daemon. I might just do it anyways to make testing/dev easier.
I've been working, slowly but surely, on writing my own blog engine (static generator) and then publishing to gh-pages, and was considering using Disqus for comments... thinking of doing something similar as a Docker-contained service that I can throw up on my dokku server.
TBH, the thought of managing the attack surface of comments is kind of scary... Will definitely be referring back to this.
My approach was to just pump the entire comment through the "markdown-it" markdown processor, which promises to emit safe HTML. It's a popular project, so I'm banking on the fact that they do a good job of sanitizing in their pipeline.
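For readers wondering what "safe HTML" means at its most basic, here is a tiny Python sketch of the underlying idea: escape anything HTML-significant in the comment before it is wrapped in markup. This is not markdown-it and does not render Markdown at all. `render_comment` is a hypothetical helper using only the standard library.

```python
import html


def render_comment(raw_text):
    """Escape HTML-significant characters, then wrap each paragraph in <p> tags."""
    paragraphs = [p.strip() for p in raw_text.split("\n\n") if p.strip()]
    return "\n".join(f"<p>{html.escape(p)}</p>" for p in paragraphs)
```

A comment containing a script tag comes out with `&lt;script&gt;` in place of the tag, so the browser displays it as text instead of running it. A real pipeline needs more than this (links, attributes, and so on), which is why leaning on a well-tested library is the sensible choice.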