janoszen · on June 9, 2018

Hey folks, I'd really like some feedback on this before I push the final version to Maven Central.

janoszen · on Feb 15, 2018

A POP or edge location is a server (or multiple) that the user traffic is being routed to, hopefully close to the user. A CDN consists of multiple POPs, one in each region, with intelligent traffic routing added (as described in the article).

janoszen · on Feb 14, 2018

If you are fine with having slashes at the end of your URLs and you do not want to do anything too complicated like content negotiation for image types, S3 and CloudFront are fine. The moment you turn on Lambda@Edge to do the magic, things get slow after a period of no traffic.

I plan on expanding the feature set, so no S3 for me. :)

gunzel · on Feb 15, 2018

Did you consider using periodic calls to keep the Lambda@Edge functions "warm"? I've been playing with Zappa (https://www.zappa.io) for standard Lambda and it sets this up by default.
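
To make the idea concrete, here is a minimal sketch of a keep-warm handler for standard Lambda, assuming the function is also invoked by a scheduled CloudWatch Events rule. This is not Zappa's actual implementation, and the response bodies are placeholders. Lambda@Edge cannot be targeted by such a rule directly, so there the periodic call would have to be a real request through each edge location.

```python
def handler(event, context):
    """Return early on scheduled pings so they only keep the container alive."""
    # Scheduled CloudWatch Events invocations carry "source": "aws.events".
    if event.get("source") == "aws.events":
        return {"warmed": True}  # placeholder response for the warm-up ping

    # Placeholder for the real request handling.
    return {"statusCode": 200, "body": "hello"}


if __name__ == "__main__":
    print(handler({"source": "aws.events"}, None))
    print(handler({"path": "/"}, None))
```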

janoszen · on Feb 15, 2018

Yes, but it's kind of a whack-a-mole since their reuse times are not public AFAIK, so it would constantly need tuning as they develop the service.

janoszen · on Feb 14, 2018

It depends on the scale. Running a personal blog with sub-1MiB/s traffic is not a problem. I've seen some larger projects though where detailed data analysis had to be employed to debug bad connections... that's not a one-man job, and it was a serious headache to work around some of the less... neutral providers.

janoszen · on Feb 14, 2018

(I'm the author.) This whole setup is built for a comparatively low-traffic blog, so DNS caching won't help much. (On normal days I get ~100 visitors.) This is compounded by the TTL, which is 60s to account for node failures.

The optimization level is in the sub-1-second range, so not having to pay one large RTT penalty for a DNS lookup is quite important. I've measured 300+ms RTT to Australia on the previous box I was using, which impacted the load times quite severely.
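
As a rough back-of-the-envelope sketch of why one extra round trip matters at this scale: the round-trip counts below are simplifying assumptions (one RTT for an uncached DNS lookup, one for the TCP handshake, two for a full TLS 1.2 handshake, one for the HTTP request), not measurements from this setup.

```python
# Assumed round trips per connection phase (simplified model, not measured).
ROUND_TRIPS = {
    "dns_lookup": 1,     # assumes an uncached lookup costing one full RTT
    "tcp_handshake": 1,
    "tls_handshake": 2,  # assumes a full TLS 1.2 handshake
    "http_request": 1,
}


def time_to_first_byte_ms(rtt_ms, skip_dns=False):
    """Estimate time to first byte as RTT times the number of round trips."""
    trips = sum(
        count
        for phase, count in ROUND_TRIPS.items()
        if not (skip_dns and phase == "dns_lookup")
    )
    return rtt_ms * trips


if __name__ == "__main__":
    rtt = 300  # ms, the RTT to Australia mentioned above
    print(time_to_first_byte_ms(rtt))                 # uncached DNS lookup
    print(time_to_first_byte_ms(rtt, skip_dns=True))  # lookup avoided
```

Under these assumptions, avoiding the distant lookup saves a fifth of the time to first byte, which is significant when the whole budget is under a second.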

janoszen · on Feb 14, 2018

I think SNI is fine, all modern browsers seem to support it: https://caniuse.com/#search=sni

janoszen · on Feb 14, 2018

Yes, but you would need a crawler that does so in every region, or at least know the IPs of the edge nodes on that CDN. You would probably also hit some rate limit / DDoS protection with the CDN itself.

janoszen · on Feb 14, 2018

Thank you to both of you, I've edited the article to clarify that point.

janoszen · on Feb 14, 2018

Interesting, although I specifically wanted to build a push CDN (where I can push the content) rather than a pull CDN (that works with an origin) to avoid the added latency with cache misses.

dzolvd · on Feb 14, 2018

Makes sense. I am enjoying looking through the source, as we are moving to an Ansible and hopefully dockerized deployment model.

janoszen · on Feb 14, 2018

Of course it's dockerized, it has to be cool, right? :)

Ansible is running docker-compose up -d on deployment and Traefik is doing the magic. I want to extend it to host multiple sites in the future. (Btw, Ansible run from a central location is painfully slow because of the high latency to the edge nodes.)

The content itself is deployed using rsync; Ansible was just too slow for that.
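
A minimal sketch of what such an rsync push to several edge nodes could look like. The hostnames, paths, and flag choices are hypothetical placeholders, not taken from the actual deployment.

```python
import shlex

# Hypothetical edge node hostnames; substitute your own.
EDGE_NODES = ["edge-eu.example.com", "edge-us.example.com", "edge-au.example.com"]


def rsync_command(node, source_dir="public/", dest_dir="/srv/www/"):
    """Build an rsync invocation that pushes source_dir to one edge node."""
    return [
        "rsync",
        "-az",       # archive mode, compress data in transit
        "--delete",  # remove files on the node that no longer exist locally
        source_dir,
        f"{node}:{dest_dir}",
    ]


if __name__ == "__main__":
    for node in EDGE_NODES:
        # Printed rather than executed; pass the list to
        # subprocess.run(cmd, check=True) to actually push.
        print(shlex.join(rsync_command(node)))
```

Because rsync only transfers changed files over a single connection, it pays the latency to each node far fewer times than a task-by-task tool would.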

janoszen · on Feb 14, 2018

"When it comes to picking a solution, I often choose the less traveled road"

I forgot to add that this applies only to R&D and hobby projects; for production setups I'm a bit more careful. :)

(I'm the author.)

nathan_f77 · on Feb 14, 2018

Ah, that makes sense!

janoszen · on Feb 14, 2018

Thank you for pointing that out, I've updated my bio to reflect that. Hopefully this way it's a little less ambiguous. :)