How Uploadcare Built a Stack That Handles 350M File API Requests per Day

nametube · on Aug 28, 2017

User-generated content (especially images) is a great attack vector. What do you do to isolate or mitigate attacks like that?

BillinghamJ · on Aug 28, 2017

How so? Obviously they won’t be executing (or likely even analysing) any of the uploaded content.

Similarly, browsers should not generally be particularly vulnerable to malicious content being loaded with appropriate MIME types in appropriate containers (e.g. <img>)

It sounds like you should be asking how browsers protect users from malicious content. Perhaps you could elaborate?

koolba · on Aug 28, 2017

You can perform a denial of service attack on a naive server with a maliciously crafted PNG. Just send a zip bomb and see what happens when the server decompresses it. The naive approach will crash the server when it tries to malloc successively larger buffers.

https://www.bamsoftware.com/hacks/deflate.html
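
One common mitigation is to cap how much output the decompressor may produce. A minimal sketch using Python's standard `zlib` module (PNG image data is a zlib/deflate stream); the 1 MiB cap and the function name are assumptions for illustration, not anything Uploadcare has described:

```python
import zlib

# Hypothetical cap on decompressed size; tune it to the largest payload you accept.
MAX_DECOMPRESSED_BYTES = 1 * 1024 * 1024


def bounded_inflate(data: bytes, limit: int = MAX_DECOMPRESSED_BYTES) -> bytes:
    """Inflate a zlib stream, refusing to produce more than `limit` bytes."""
    inflater = zlib.decompressobj()
    # Ask for at most limit + 1 bytes, so an oversized stream is detected
    # without ever allocating the full decompressed output.
    out = inflater.decompress(data, limit + 1)
    if len(out) > limit or inflater.unconsumed_tail:
        raise ValueError("decompressed size exceeds limit")
    return out
```

A real image pipeline would also want to check the declared image dimensions before decoding, but the same idea applies: decide the budget first, then decompress.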

rpearl · on Aug 28, 2017

They say right in the article that they're doing image resizes on the server, for instance. With a customized library too... Hope it's sandboxed well!

nametube · on Aug 28, 2017

Image and video codecs come under attack quite often; see https://blog.sucuri.net/2016/05/imagemagick-remote-command-e...

In this context, the image manipulation they do with Pillow and the underlying libjpeg would be a potential source of vulnerabilities.

acdha · on Aug 28, 2017

Significantly, it's not just libjpeg but every format supported by Pillow (http://pillow.readthedocs.io/en/3.4.x/handbook/image-file-fo...) — many of those vulnerabilities have historically been in obscure formats where the implementation has had far less attention than the mainline JPEG or PNG support.
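
One way to shrink that attack surface is to reject anything outside a short allow-list before it ever reaches the decoder. A minimal sketch that checks only the leading magic bytes; the allow-list and the function name are assumptions for illustration, and a signature match is a first gate, not proof that the file is well-formed:

```python
from typing import Optional

# Hypothetical allow-list: only the mainstream formats are let through.
ALLOWED_SIGNATURES = {
    "PNG": b"\x89PNG\r\n\x1a\n",
    "JPEG": b"\xff\xd8\xff",
    "GIF": b"GIF8",
}


def sniff_allowed_format(head: bytes) -> Optional[str]:
    """Return the allow-listed format whose signature starts `head`, else None."""
    for name, signature in ALLOWED_SIGNATURES.items():
        if head.startswith(signature):
            return name
    return None
```

Anything that returns `None` here (BMP, TIFF, and the more obscure formats) would be refused up front rather than handed to a rarely exercised parser.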

mintplant · on Aug 29, 2017

Yep, I remember multiple smaller art-centric sites getting hit in a wave by an ImageMagick RCE vulnerability. Database dumps, full source leaks, the works. Unsure whether it was the one you linked; it seems more recent than I thought.

jasiek · on Aug 28, 2017

Here you go.

https://www.theregister.co.uk/2016/10/24/apple_security_upda...

coldcode · on Aug 28, 2017

I find it interesting that these sorts of stacks have tons of moving parts. Maybe it's the nature of highly scalable systems? Or does it come from starting with one particular technology and then having to drag in lots of other things to make it work?

dmitrymukhin · on Aug 28, 2017

In the article we tried to convey the main idea behind that: take the best tool for the job at hand. There's no "one size fits all" framework or product to put your money on. It's much easier to handle this zoo than to make something do what it's not supposed to.

Furthermore, to get high scalability, you have to make things as loosely coupled as possible. This means you have to make some choices.

Hope that makes sense and answers the question :)

otakucode · on Aug 30, 2017

It wouldn't need many moving parts if you didn't want to mess with the data. Like if you just functioned essentially as a blob store. But as soon as you start touching data formats... things get fun and you should prepare to be a hacker playground.

lukb · on Aug 28, 2017

That's a great read. I've always been interested in learning how such tech-oriented companies found their initial traction. Are there any blog posts / articles / podcasts about Uploadcare's early days and the search for the product/market fit?

notrheadagain · on Aug 28, 2017

I just got wind of it, we at Uploadcare will soon be releasing an article with more info about the early days :) And, I believe, a podcast or two. Thanks for this question, btw. Would you elaborate on what you would like to know? It'll help us compile a great article, thanks :)

lukb · on Aug 28, 2017

Great to hear that!

The reason why it's particularly interesting to me, i.e. to someone with a dev background, is that the lean startup wisdom says you should be very specific about the customer you're after, and Uploadcare seems like a solution targeted at a broad spectrum of customer segments. Of course, I'm happy to be proved wrong if there are one or two dominant customer segments that you address Uploadcare to. Also, you might as well have started out with a very specific customer persona and spread to other segments. Whatever it was, curious to know.

I guess many developers dream up products targeted at developers like themselves. Selling to fellow developers is hard. It would be great to read a success story for a change.

dmitrymukhin · on Aug 29, 2017

Short answer is:

- believing in your own service
- perseverance
- fanatical customer support

This helped us to stay focused and our happy customers brought us new happy customers. We didn't know much about marketing in the beginning.

copyconstruct · on Aug 28, 2017

I wonder what the breakdown is between unique files delivered and files delivered from the CDN cache. Also, what's the breakdown for file uploads, manipulation and delivery? The 350M API requests per day would make more sense if we got this breakdown.

dmitrymukhin · on Aug 28, 2017

Cached/uncached file delivery is close to the universal 80/20 ratio. Cached operations are not included in that number.

Unfortunately, I can't say anything more than that.

copyconstruct · on Aug 28, 2017

Curious - does that mean you serve close to 1.75 billion requests per day, out of which 350M are unique requests that exercise your stack instead of being served from a CDN? It'd be interesting to know more about the number of transformations you do at peak, if you can talk about it.
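
The arithmetic behind that 1.75 billion figure, as a small sketch. It assumes the 350M are exactly the uncached 20% of an 80/20 split, which is this comment's reading and not something confirmed in the thread; the function name is hypothetical:

```python
def total_requests(origin_requests: int, cache_hit_percent: int) -> int:
    """Estimate total requests when `origin_requests` are the cache misses."""
    miss_percent = 100 - cache_hit_percent
    if miss_percent <= 0:
        raise ValueError("cache hit percentage must be below 100")
    # Integer arithmetic keeps the result exact for round percentages.
    return origin_requests * 100 // miss_percent


print(total_requests(350_000_000, 80))  # 1750000000
```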

dmitrygr · on Aug 28, 2017

350M/day = just about 4K QPS. Is that considered impressive nowadays?
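
The conversion, spelled out as a quick sketch. It gives only the daily average and says nothing about peaks; the function name is made up for illustration:

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_qps(requests_per_day: int) -> float:
    """Average queries per second implied by a daily request count."""
    return requests_per_day / SECONDS_PER_DAY


print(round(average_qps(350_000_000)))  # 4051
```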

siddhant · on Aug 28, 2017

It's important to define what the "Q" is.

4K QPS where Q = file uploads -> Definitely.

4K QPS where Q = HEAD request? -> Not so much.

tyingq · on Aug 28, 2017

Assuming most transactions are a largish file transfer it seems impressive. And I assume the transactions aren't evenly spaced, so the peak is likely much higher.

4k qps of DNS, for example, would be less interesting.

RX14 · on Aug 28, 2017

In the case of large responses, bandwidth out is a much more interesting metric. I'm sure their number sounds more impressive though.

rak00n · on Aug 28, 2017

If it's a natural distribution the peak QPS will not be 4k QPS.

cagenut · on Aug 28, 2017

Impressive or not I just wish people would stop using monthly averages in the headline like this. You can't really make the case that this in-depth stack dive is for a layman audience, so you have to know what a meaningless metric it is.

dmitrymukhin · on Aug 28, 2017

I totally agree, but this was one of the requirements of the editor to have a "good marketable" headline. And we have to admit that this worked quite well.

On the other hand, I feel that the article is very light and is indeed more for a layman audience :) An in-depth one would be 10-15 times longer (and 100 times harder to write).

z3t4 · on Aug 28, 2017

The QPS is just a number that doesn't say much. Impressive or not, it's still interesting to read stories like these. My nitpick is that they, like many others, are only using one cloud provider. Don't put all your eggs in the same basket.

hlieberman · on Aug 29, 2017

It looks like your certificate expired.

Since it's just a DV certificate from Comodo, have you considered switching to Let's Encrypt? Its automated systems could have helped you automatically update.
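
Even without automated renewal, an expiry check is cheap to script. A minimal sketch, assuming you already have the certificate's `notAfter` string (for example from `ssl.SSLSocket.getpeercert()`); the function name is hypothetical and any alerting threshold is left to the caller:

```python
import ssl

SECONDS_PER_DAY = 24 * 60 * 60


def days_until_expiry(not_after: str, now: float) -> float:
    """Days between `now` (epoch seconds) and a certificate's notAfter time.

    `not_after` uses the format found in getpeercert(),
    e.g. "Aug 31 00:00:00 2017 GMT". A negative result means it has expired.
    """
    expires_at = ssl.cert_time_to_seconds(not_after)
    return (expires_at - now) / SECONDS_PER_DAY
```

Run from a daily cron job with `time.time()` as `now`, something like this could raise an alert once the result drops below a chosen number of days.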

yonasb · on Aug 29, 2017

Yeah, totally on me for letting this expire. Sorry everyone :/ Lesson learned. We may switch to Let's Encrypt once they add wildcard support, which seems to be next year.

RKearney · on Aug 29, 2017

It's a wildcard certificate, which isn't supported by Let's Encrypt yet.

powvans · on Aug 29, 2017

Site appears to be hosted in AWS. Lots of free SSL goodness to be had from Amazon as well.

King-Aaron · on Aug 29, 2017

It's stretching into off-topic land here, but could you suggest any AWS SSL resources that are worth investigating?

manigandham · on Aug 29, 2017

AWS has Certificate Manager which provisions free certificates and manages renewals automatically. Usable across ELB, Cloudfront, etc.

https://aws.amazon.com/certificate-manager/

dmitrymukhin · on Aug 29, 2017

It's Heroku for Stackshare: https://stackshare.io/stackshare/stackshare

Uploadcare does not have issues with certificates and we're indeed going to switch to ACM for some of the endpoints.

sigi45 · on Aug 28, 2017

By using AWS.

dmitrymukhin · on Aug 28, 2017

It's not that hard to get high numbers with AWS, indeed :p

The hard thing is to make it cost effective. To that end, I can proudly say that the AWS bill is not among Uploadcare's top expenses.

notrheadagain · on Aug 28, 2017

AWS is an infrastructure. It could be Google Cloud or something else. When writing the article, we wanted to convey how frameworks are interconnected and try and estimate the number of "Moving Parts" (I liked this one) :)

drchaim · on Aug 28, 2017

All I can say is they don't seem to donate to the Django project.

notrheadagain · on Aug 28, 2017

Uploadcare contributes to open source, however. For instance, the fast and production-ready Pillow-SIMD fork is a great example. You can search for "Pillow-SIMD" here for the discussion or check out the original article: https://blog.uploadcare.com/the-fastest-production-ready-ima...

orf · on Aug 28, 2017

If that's all you can say, don't say it at all.

dmitrymukhin · on Aug 28, 2017

Oh the irony.