BlurHash: Algorithm to generate compact representation of an image placeholder (blurha.sh)
363 points by goranmoomin on Feb 20, 2020 | 110 comments



For comparison, Instagram sends ~200 byte base64 thumbnails in its API responses. It's a low-res JPEG with a fixed header (shared across images, so it doesn't have to be sent) to save a bit more space. This is also used to blur sensitive images. You can get even better results with webp.

It requires minimal JavaScript support-- just enough to build an img tag with the appropriate data URL, and renders just as fast as any other tiny image. Upscaling blurs away the imperfections.

https://stackoverflow.com/q/49625771


Yes, this was inspired by that, and the fact that our database engineer went "I'm not putting that much garbage in my database"!

And I figured, yeah, that's still a lot of data, and it's massively misusing several different tools. Why not create something that is actually designed for the task?


the fact that our database engineer went "I'm not putting that much garbage in my database"!

This is really disingenuous-- a DBA should not be a gatekeeper of what gets put into a database. Storing 200 bytes of extra data per row costs you absolutely nothing. Even if you assume 100 million photos, that's 20GB of data. That's a little over $2 of EBS cost per month. Who cares.
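
(Back-of-the-envelope, assuming roughly gp2-class EBS pricing of ~$0.10/GB-month:)

    // 200 extra bytes per row across 100 million photos, at an assumed EBS rate.
    const bytesPerRow = 200;
    const rows = 100_000_000;
    const totalGB = (bytesPerRow * rows) / 1e9;           // 20 GB
    const assumedPricePerGBMonth = 0.10;                  // assumed gp2-style pricing
    const monthlyCost = totalGB * assumedPricePerGBMonth; // ~$2/month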


> a DBA should not be a gatekeeper of what gets put into a database. Storing 200 bytes of extra data per row costs you absolutely nothing.

To be fair, isn't it the DBA's job to gatekeep the database? The application programmers and business managers don't have the in-depth understanding to know how every individual change will affect the database. Of course they should follow best practices, but it's not really their job to know it inside and out - that's what the DBA does.

You don't know that 200 bytes per row would cost nothing - it's the DBA's job to understand its cost. And it isn't just about the amount of data: without an in-depth understanding of which tables they were adding it to, and how those are indexed, cached, etc., you don't know the real cost. That 200 bytes might mean less can be stored in memory, and if this is a frequently fetched table then that could be bad.

That's not to say the DBA made the right call, but that we don't have enough information to know if it was.


Unless you have some clever storage mechanism (such as a column-oriented database or a separate table with a join), 200 extra bytes directly in each row of a frequently used table makes that table use disk and memory caches less efficiently. Consider if the entire rest of that row takes up 100 bytes, and now it takes up 300. You can now fit a third as many rows in each disk block or memory page.
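
To put rough numbers on that (a sketch assuming 8 KB pages and ignoring per-row and per-page overhead):

    // Rows per 8 KB page before and after adding ~200 bytes to a 100-byte row.
    const pageSize = 8 * 1024;
    const rowsPerPage = (rowBytes: number) => Math.floor(pageSize / rowBytes);

    rowsPerPage(100); // 81 rows per page
    rowsPerPage(300); // 27 rows per page -- a third of the cache density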


This discussion reminds me of when I read in our database server's documentation something along the lines of "for very wide tables (more than 10 columns)" and I literally had to laugh out loud, as the main table at work is over 500 columns wide...

Gotta love organic db schemas with long history...


> the main table at work is over 500 columns wide...

This is horrifying, but at least not as horrifying as the public sector database I once had to work with that predated proper support for foreign keys, so there was a relationship table that was about 4 columns wide and must have had more rows than the rest of the database combined.

Even the database they had moved this schema to struggled with that many join operations.


This is so true. Fitting rows in the page was a big win for query times in my case.


Hence columnar stores.

But the parent is right that the choice of what to store shouldn't be coming from your DBA; they should be more interested in how.


It is the DBA's job to advise, and that includes advice like "this isn't a good idea", or "could you do something else instead", or "this will require XYZ rearchitecting, which will affect these other factors/queries/etc". The DBA isn't the final decision-maker, but they're a critical advisor.


Yes of course, but it's the business need that should drive that discussion, not the DB architecture. "I'm not putting X in my database" is very different from "Well, if we need to do Y, I think we'll have to change A, B & C". It's up to others to decide if it is worth it or not.


It's also completely reasonable to say "We shouldn't do Y". The response to everything shouldn't be to go immediately into solution space and list off trade-offs on the assumption that Y is happening. With people you trust for both their integrity and their expertise, "we shouldn't do Y" is a reasonable shorthand for "there are a pile of valid reasons we shouldn't do Y, and you and I both know I can list them for you off the top of my head, and it might well be possible anyway but it would take a lot more work and trade-offs than you might expect".

That isn't the end of the conversation; if the response to "we shouldn't do Y" is "it'll be a pain if we can't do Y", then you can explore solution space together, where that solution space includes both trade-offs to support doing Y and alternatives to doing Y. Use experts for their expertise, rather than the equivalent of "I don't care, make it happen".

Or, in short: yes, of course the DBA doesn't unilaterally dictate the architecture, but maybe listen to their expertise anyway.


As far as I can tell, we are agreeing.


You should also consider that that was a loose, humorous description of what happened, and not something really worth analysing to the last letter.


Fair. I have multiple times run into that attitude seriously, otoh.


It is still my very firm belief that the primary motivation for the use of ORMs is backlash against gatekeeping DBAs. People keep looking for (literal) tools that keep them from having to deal with a (figurative) tool who won't let them use the damned database that belongs to the business to support business initiatives.

[edit: and that almost all network traffic is now L7 routed because of gatekeeping firewall administrators in the late 90's, so everything had to go over :80 or :443]


But that tool is also paid by the business to ensure that the database stays up and performant.

Seems kinda rude to ignore the motivations and incentives of the people who are responsible when problems happen.

Yes devs and infrastructure are often at odds professionally by design. Making it personal ignores the reason that different parts of the app have different owners that must agree to make changes.

“Those damn developers won’t just let me push my hotfix to master and make me have to do a code review when our severs are getting hammered.”


This only indicates that you have worked with a better class of DBA than I have.

"Why are we creating new table and foreign key relationships for this single bit of data?"

"Oh, because we aren't allowed to add any columns to <the appropriate table>"

"Wait. None?"

"None."

(Only the last conversation I had of this sort involved MySQL, which was notoriously bad at adding columns to live tables)

The cost of 5 extra joins for all business activities is going to outweigh whatever you think you have going on with that original table.

I think in some ways the situation was improved by expanding developer responsibilities to include these tasks rather than having a dedicated role. When you have to deal with the consequences of your own decisions instead of someone else being responsible, some activities won't be done at all, while others will go more smoothly. Without the database being personified, it's more a case of doing something for somebody (else) instead of doing it to somebody.


> disingenuous

Don't think that's the correct adjective here.


> It's a low-res JPEG with a fixed header to save a bit more space. This is also used to blur sensitive images. You can get even better results with webp.

That is not my experience at all. On images less than 100kb, webp adds an overhead, and doesn't reduce the file size at all. A fixed header is also harder to achieve.


Can you clarify the significance of the fixed header?


JPEGs were never meant to be <1KB, so the header can be rather large-- looking at one example, the truncated JPEG data Instagram sends is 248B, and the additional fixed header information is 607B, resulting in an 848B thumbnail (the truncated data includes dimensions).

That 607B of fixed header includes some compression settings that don't need to change between images:

- 132B of quantization tables, saying how much to divide every coefficient by-- for subtle details the difference between "128" and "129" doesn't matter, and you can just send "2 * 64".

- a 418B Huffman table, which stores the frequencies of different coefficient values after quantization-- small values are much more common than large ones, given the divisors. You can have smaller image data with properly tuned tables, but that only matters for large images. For 250B of image data, it's a waste.

- Miscellaneous other details that are fixed, like the file type magic headers, the color space information, and start/end markers.

You can dig in more by putting a thumbnail like https://i.imgur.com/M0EsJH1.jpg into this site: https://cyber.meme.tips/jpdump/


If the header never changes, it doesn't have to be stored or transmitted. The client-side code can simply prepend the header after getting the data.
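
A minimal sketch of that client-side step (the header bytes below are a stand-in; the real ~600 bytes would come from whatever fixed encoder settings you standardise on, and this assumes plain concatenation is enough, as described above):

    // Rebuild a renderable JPEG by prepending the shared fixed header to the
    // truncated payload from the API, then hand it to an <img> as a data URL.
    const FIXED_JPEG_HEADER = new Uint8Array([/* SOI, quantisation + Huffman tables, ... */]);

    function thumbnailDataUrl(truncatedJpeg: Uint8Array): string {
      const full = new Uint8Array(FIXED_JPEG_HEADER.length + truncatedJpeg.length);
      full.set(FIXED_JPEG_HEADER, 0);
      full.set(truncatedJpeg, FIXED_JPEG_HEADER.length);
      let binary = "";
      full.forEach(byte => { binary += String.fromCharCode(byte); });
      return "data:image/jpeg;base64," + btoa(binary);
    }

    // Usage: img.src = thumbnailDataUrl(bytesFromApi); upscaling plus a CSS blur hides the artifacts.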


Hello, I am the main author of this. Also make sure to check out the README on Github for more technical details! https://github.com/woltapp/blurhash


Very nice work. Appreciate you open sourcing this


On the demo site (https://blurha.sh), if you go down to Components (where there are two input fields, separated by an X) and type '99' and '99', the website becomes noticeably unresponsive.

Besides this very small bug, very useful piece of technology!


Judging by the example string on the website, `LEHV6nWB2yk8pyoJadR*.7kCMdnj`[0], you cannot use these hash strings as filenames because they can have disallowed characters.

You could save a step by just sending which image to download and using the filename as the hash you use to render the placeholder, but this algorithm requires you to store the relationship between the hash and the image somewhere (unless I am missing something obvious).

[0] source code: https://github.com/woltapp/blurhash-python/blob/7469c813ea64...


1. Using the hashes as principal indices seems unsafe in general (which includes image upload) anyway: it looks pretty simple to generate collisions. The suggested use case (save the hash in the database next to the original image or a reference to it) sounds fine.

2. Which disallowed characters? The dictionary appears to be "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz#$%*+,-.:;=?@[]^_{|}~". No slashes or null bytes, which are the common disallowed characters on server platforms. It does have characters your shell cares about, but that's only a problem if you're not quoting.


`:` is disallowed on MacOS and several of those characters are disallowed on Windows. Generally I view images on a frontend, not on my servers.


Fair! I would not expect the image to be saved to a Windows path when displaying a blurred image in the example use cases (mobile apps and web pages). You're thinking Electron, perhaps?

Point of order though: I don't think that's right for MacOS, though you're certainly right for Windows. HFS+ lets you use any Unicode symbol including NUL (because the filenames are encoded Pascal-style). I don't know the details of how APFS does it, but it also appears to support any Unicode code point, though it additionally mandates UTF-8. [0]

[0] https://developer.apple.com/library/archive/documentation/Fi... grep "filenames"


Interesting, I was not aware about MacOS. If I run `touch :.:` it will run fine and create the file, but Finder disallows this for whatever reason:

    The name “:.:” can’t be used.
    Try using a name with fewer characters, or with no punctuation marks.


Finder exchanges : and / in filenames, for compatibility with classic MacOS, which used : as a directory separator and allowed / in filenames. You can create a file named / in Finder, and it will show up as : on the file system.


A file name with an embedded NUL character couldn't be passed to fopen(), or to any other function that takes a pointer to a C-style string. (Not directly relevant here.)


: was disallowed on classic MacOS. On modern macOS, it is allowed in the POSIX layer just fine. However, Finder and other parts of the UI will translate a : in the file name to a / on display, and vice versa.

This is because classic MacOS used : as a directory separator instead of /, and this behaviour is preserved in the UI, but obviously not in the underlying layers.


Is it not present in the underlying layer anymore? Applescript displays paths with : separators unless you specifically tell it you want a “POSIX path”. (This is actually really annoying.)


No, the underlying layer is POSIXy, and uses /. The highest UI layers do the mapping between : and /, and apparently also AppleScript.


Simple solution: run a POSIX-compliant OS on the frontend. All Linux distributions, all BSD distributions, modern macOS, and Windows Subsystem for Linux all support ':' in filenames.


':' works fine in MacOS file names for me.


They mention it's a custom version of base83: https://github.com/woltapp/blurhash/blob/master/Algorithm.md...

With this charset: 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz#$%*+,-.:;=@[]^_{|}~.

I imagine you could tweak that to replace problematic chars with safe substitutes.
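
Something like this, perhaps (a sketch; the substitute characters are arbitrary choices outside the base83 alphabet, so the mapping stays reversible -- they're not part of BlurHash):

    // Derive a filesystem-safe name from a BlurHash string by swapping the few
    // characters Windows rejects. Map them back before decoding the hash.
    const SUBSTITUTES: Record<string, string> = { "*": "(", ":": ")", "|": "!", "?": "'" };

    const toSafeName = (hash: string) =>
      hash.replace(/[*:|?]/g, ch => SUBSTITUTES[ch] ?? ch);

    toSafeName("LEHV6nWB2yk8pyoJadR*.7kCMdnj"); // "LEHV6nWB2yk8pyoJadR(.7kCMdnj"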


These are the safest I could find. They are perfectly fine for use in filenames.


Windows "A file name can't contain any of the following characters":

    \ / : * ? " < > |

which means "*:|" would need to be removed if you want to support that use case


That depends very much on the platform ...


Well, Windows is very picky, yes.


> you cannot use these hash strings as filenames because they can have disallowed characters.

You can't really use them as filenames because they won't be collision-free, either. They're not meant to be an identifier, but instead a compact representation of the image that can be stored in a database.


This is a tempting idea, but as was already pointed out, the problem is not the character set, but that similar images will encode to the same string.

Anyway, chances are you are already storing a filename in your database, so you would need a second field for the blurhash string.


  assets/example.png/LEHV6nWB2yk8pyoJadR*.7kCMdnj
  assets/example.jpg/LEHV6nWB2yk8pyoJadR*.7kCMdnj


s/g\//g#/

    assets/example.png#LEHV6nWB2yk8pyoJadR*.7kCMdnj
    assets/example.jpg#LEHV6nWB2yk8pyoJadR*.7kCMdnj
No need to send the extra data to the server for each request.
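
A sketch of that convention (hypothetical; note the base83 alphabet itself contains '#', so a real scheme would want to percent-encode the hash portion):

    // The fragment (#...) never leaves the browser, so the server still sees a plain
    // image URL while the client reads the placeholder hash without an extra request.
    const src = "assets/example.jpg#LEHV6nWB2yk8pyoJadR*.7kCMdnj"; // hypothetical convention

    const cut = src.indexOf("#");
    const imageUrl = src.slice(0, cut);         // "assets/example.jpg"
    const placeholderHash = src.slice(cut + 1); // "LEHV6nWB2yk8pyoJadR*.7kCMdnj"
    // decode(placeholderHash, 32, 32), draw it, then swap in imageUrl once it loads.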


Hm, that's an interesting choice. I also wonder why they didn't use an encoding with a 2^n-character alphabet (that way you can directly map 1 character to n bits when decoding).

Coincidentally, I've just finished some work on a project[1] that is in the same space (identifiers for images).

For the reasons you pointed out, I found Douglas Crockford's base32[2] encoding to be a good fit.

[1] https://github.com/pablosichert/ciid

[2] https://www.crockford.com/base32.html


That choice is explained in the README: https://github.com/woltapp/blurhash

(83 is about as many safe characters as you can reasonably find, and it allows some nice ways of packing values together.)
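
For the curious, a sketch of the positional decoding and the packing trick alluded to there (alphabet as quoted upthread; the 19-level AC quantisation is from the spec):

    // Base83 is decoded positionally, most significant digit first.
    const ALPHABET =
      "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz#$%*+,-.:;=?@[]^_{|}~";

    const decode83 = (s: string) =>
      [...s].reduce((acc, ch) => acc * 83 + ALPHABET.indexOf(ch), 0);

    decode83("LE"); // 21 * 83 + 14 = 1757

    // Two characters cover 0..83^2 - 1 = 6888, just enough to pack the three 19-level
    // quantised channels of one AC component (19^3 = 6859) -- the sort of packing a
    // power-of-two alphabet wouldn't give you as neatly.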


Could you explain what "AC components" refers to? I couldn't figure that out just by reading your README.

I wonder how the efficiency compares to just encoding on the bit level.


It is a term often used for DCT-transformed data. DCT, in this case, breaks the image down into basically an average colour of the whole image, referred to as the DC component, and a bunch of waves that make up the detail of the image, referred to as AC components.

https://github.com/woltapp/blurhash/blob/master/Algorithm.md
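
A tiny 1-D illustration of that decomposition (a sketch; BlurHash's actual basis functions and normalisation differ slightly -- see Algorithm.md):

    // The k = 0 term is just the average of the samples (the "DC" component);
    // k > 0 terms are cosine waves describing progressively finer detail ("AC" components).
    function dctComponent(samples: number[], k: number): number {
      const n = samples.length;
      let sum = 0;
      for (let x = 0; x < n; x++) {
        sum += samples[x] * Math.cos((Math.PI * k * (x + 0.5)) / n);
      }
      return sum / n;
    }

    const row = [10, 12, 200, 205]; // a dark-to-bright row of pixels
    dctComponent(row, 0); // 106.75 -- the average brightness
    dctComponent(row, 1); // ~ -63  -- negative because brightness increases left to right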


> Could you explain what "AC components" refers to?

All but the first component of the Fourier transform. (The first component is the average of the data.) The term comes from electrical engineering, but the Fourier transform has lots of applications outside of electrical engineering as well.


I quite like that idea. Maybe you could run base64/base32 on the blurhash. That brings up two potential issues I can think of. Is the size of the result less than the filesystem's max file name length? And what about collisions, where two slightly different images have the same blurhash?


As I mentioned upthread: none of those are disallowed on common server platforms, but ironically canonical Base64 _will actually_ break, and it'll break rarely enough that you're likely to miss it in testing! (The last character in canonical Base64, for symbol value 63, is /, which is disallowed on (*nix) filesystems and does not occur in the native BlurHash dictionary. Unless you're suggesting encoding the encoded format, I guess.)


I accidentally made a folder (in my pinephone's / folder) the other day called '', and I was about to delete it with rm -rf .

Maybe that shouldn't be allowed, but it is :P


If you want to do that just use a different encoding which outputs a string safe to be used for filenames.


This one is picked very carefully to be as safe as reasonably possible with this number of characters. As was pointed out elsewhere, it is actually safe for filenames.


This looks great. For browsers, it would have been even better if the result was an SVG or pure CSS gradient. Canvas can still have a performance hit on mobile.

The JS implementation of Gradify [1] did something similar but by using CSS linear-gradient which is even better in terms of performance. I wonder if the same could be implemented here.

[1] https://github.com/fraser-hemp/gradify


You don't have to keep it displayed on canvas given it isn't "animated". You can use temporary canvases for the initial render then convert it to a Blob (and from a Blob to an Object URL). Most browsers at that point use the GPU's transforms to quickly convert the image data to a basic compressed JPG or PNG and then the usual browser performance mechanics of an image apply (and as a URL you can use it in regular IMG tags or CSS things like background-image). Presuming you remember to revoke the Object URLs when you are done with the hashes to avoid Blob leaks, performance is improved over the "simpler" canvas techniques that just display the rendered canvases.

If code helps, here's how it looks as a React Hook in Typescript:

https://gist.github.com/WorldMaker/a3cbe0059acd827edee568198...

(I offered the code to the react-blurhash repository in its Issues.)
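
For anyone skimming, a condensed sketch of the flow described above (assumes the `decode` function from the blurhash npm package; the React/hook plumbing and error handling live in the gist):

    import { decode } from "blurhash"; // assumes the blurhash npm package

    // Decode to a small offscreen canvas, compress to a Blob, and hand back an Object URL
    // that plain <img> tags and CSS can use. Revoke the URL once the real image has loaded.
    function blurhashToObjectUrl(hash: string, w = 32, h = 32): Promise<string> {
      const canvas = document.createElement("canvas");
      canvas.width = w;
      canvas.height = h;
      const pixels = decode(hash, w, h); // Uint8ClampedArray of RGBA values
      canvas.getContext("2d")!.putImageData(new ImageData(pixels, w, h), 0, 0);
      return new Promise(resolve =>
        canvas.toBlob(blob => resolve(URL.createObjectURL(blob!)), "image/png")
      );
    }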


The cost is in the initial render to a canvas. Once on a canvas, there is no advantage to converting it to a blob for display in an IMG tag as far as performance goes.


That's not what my testing has shown in practice. Memory use drops dramatically in the conversion from canvas to blob when the browser compresses the render buffer to JPG/PNG under the hood. GPU usage seems to drop in the browsers I had under performance tools, but it's subtle enough that it could just be margin of error (as memory pressure was my bigger issue, that wasn't something I was keeping that good an eye on).

But yes, the biggest performance gains aren't in pure, static IMG tag usage scenarios; the biggest gains I saw were in combination with CSS animations, and that was something important to what I was studying. As Blurhashes are useful loading states, this seems a common use case to me: having a Blurhash shown/involved in things like navigation transitions. And it seems pretty clear browsers have a lot more tricks for optimizing static images involved in CSS animations than they do for canvas surfaces.

I was looking for a way to better amortize or skip the initial render as well. It should be possible to take the TypedArray `decode` buffer and directly construct a Blob from it, but I couldn't find a MIME type that matched the bitmap format Blurhash produces (and Canvas putImageData reads) in the time I've had to poke at the project so far. As I mentioned, memory pressure was a performance concern in my testing, so I'd also be curious about the performance trade-off of paying for an initial canvas render and getting what seems to be a "nearly free" compression to JPG or PNG from the GPU when converting that to a blob, versus using a larger bitmap blob but no canvas render step.


The gradify example site[1] has been taken over by a weight loss promotion. Here's a fork of gradify[2] with examples.

[1] http://gradifycss.com/ [2] https://github.com/QueraTeam/gradify


You can decode using a very small canvas to minimise performance issues, and then scale the resulting image up.


Forgot to mention one important detail: How big is the JavaScript source to the BlurHash decoding algorithm?

I'm guessing that it's approximately the size of a single image, possibly quite a bit more.


1.9kB, minified and gzipped. Well within the parameters for cost savings.

https://bundlephobia.com/result?p=blurhash@1.1.3


The algorithm is tiny, and will easily fit in less than an image's worth of data, especially if you minify it. Also, it is likely you have more than one image on any given website.


This is a great point! I wonder if it could be made much smaller by instead "vectorizing" the blur. Right now it appears to draw individual pixels, but perhaps it could draw a few large blurred gradient circles instead?

(I also appreciate that this is probably me getting overexcited about this, since odds are the JS is cached, and in the case of mobile apps it doesn't matter much.)


There's a pretty awesome project that does this with SVGs.

https://pkg.go.dev/mod/github.com/fogleman/primitive https://primitive.lol/

I've used it in a project to generate vector thumbnails that are under 1KB that I can just base64 encode and send along with the general text of a page - so the page renders with the text and low-fi images, and then I can load the hi-fi ones as necessary.

The lo-fi images are art in themselves, sometimes I preferred them to the actual images :-P


The image animation (with randomized recreations) gives an amazing effect. I've never seen this technique before.

Like cinemagraphs, but unique and alive in a nice way.

http://i.imgur.com/Cb4ecUC.gifv


That looks great!

You could probably devise a much smaller data representation too, to get the size down.


Yeah, most of the size is the SVG XML cruft. The encoder uses only one kind of svg element with a standard set of attributes, so yeah, absolutely.


22kb

Edit: Nevermind, derp, https://blurha.sh/blurhash.01ae00eea611ada5e9c7.js contains all of the JS of the entire website.


Ironically jpegoptim is pretty happy to knock img1.jpg (woman eating donut) down from its original 55kB to 33kB with no perceptible-to-me quality loss though ;)


It is a lot smaller than that; that is probably the size of all the JS code on the web page?

The algorithm itself fits into a few k.


Yeah, you're right. I found the first JS file in my network tab and judged by its blurhash.js filename that it was the algorithm: https://blurha.sh/blurhash.01ae00eea611ada5e9c7.js

Of course, my first clue should have been that it was also the only Javascript file.


I implemented something similar when I was playing around on a personal page. It's built by Jekyll and inlines a base64 version of the jpeg with a CSS blur filter, then lazyloads the real image and transitions it in via opacity 0->1. You can see it in action here: https://www.road-beers.com/
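
Roughly this shape, sketched out (the class names and data-src attribute here are hypothetical):

    // The inline base64 placeholder sits in the page with a CSS blur filter; this loads
    // the real image, then fades it in over the placeholder via an opacity transition.
    document.querySelectorAll<HTMLImageElement>("img.placeholder[data-src]").forEach(ph => {
      const real = new Image();
      real.className = "full-image";                       // CSS: opacity 0, transition to 1
      real.onload = () => {
        ph.insertAdjacentElement("afterend", real);
        requestAnimationFrame(() => real.classList.add("loaded")); // triggers the fade-in
      };
      real.src = ph.dataset.src!;                          // hypothetical data-src attribute
    });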


I noticed that it only inlines the base64 for the first pictures/"above the fold", and then loads the rest async. Clever.


Cool!


The examples discussed seem to assume a dynamic site where you load the images (and the blurhash placeholder) from a database, but how easy would this be to use for a static site with large images, perhaps embedding the blurhash directly in the HTML?


You could put the hash in a custom HTML attribute, and have the JS load it from there.

<canvas width="100" height="100" imgsrc="hirez.jpg" blurhash="abcd..." />
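
And the matching loader could look something like this (a sketch assuming the `decode` function from the blurhash npm package):

    import { decode } from "blurhash"; // assumes the blurhash npm package

    // Decode each placeholder at the canvas's declared (small) pixel size and draw it;
    // CSS can scale the element up to its layout size. Then load the real image.
    document.querySelectorAll<HTMLCanvasElement>("canvas[blurhash]").forEach(canvas => {
      const pixels = decode(canvas.getAttribute("blurhash")!, canvas.width, canvas.height);
      canvas.getContext("2d")!.putImageData(new ImageData(pixels, canvas.width, canvas.height), 0, 0);
      // ...then create an <img> from canvas.getAttribute("imgsrc") and swap it in on load.
    });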


Should work fine, if you integrate it into your toolchain somehow.


I'm surprised this isn't more web-focused, i.e. a Node.js library which encodes an image as a base64 data URL which any browser can render immediately while the real image lazy loads


It's a lot more efficient to do it once, and store it.

Also, it's really simple! Go ahead and create a library for it! The algorithm is tiny, and easily ported, so it's quite fun.


100% agreed. A base64 encode would be a lot more browser friendly. Something like `background-image: url('data:image/jpeg;base64,....')` means you only need the encoder; the browser would decode without any JS being needed (JS that would otherwise block the initial page render).



Funny coincidence that I am working on an iOS app and was looking for something similar just a couple of days ago. I ended up building something, but your Swift solution seems much better than what I have, so I will switch to yours. Thank you!


Nice!


How does this compare with progressive JPEG?


I think the biggest difference is in visual effect. A progressive JPEG looks pretty terrible at first, whereas a blurred image may look preferable even though it technically has less information.


Agreed, +1 for browser makers to make progressive JPEG loading not look like a dog just barfed the original image


It'd be neat if there was an easy way to combine progressive JPEG loading with aggressive CSS blur filters, maybe. I think onload fires when the progressive JPEG is done, so perhaps you could have the onload JS disable the blur?
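
Something like this, perhaps (a sketch of that suggestion; as noted, load only fires once the whole JPEG has arrived):

    // Blur the progressive JPEG with CSS while it streams in, and drop the blur once
    // the complete image has decoded.
    const img = document.querySelector<HTMLImageElement>("img.progressive")!; // hypothetical class
    const clearBlur = () => { img.style.filter = "none"; };
    img.style.filter = "blur(20px)";
    if (img.complete) clearBlur();
    else img.addEventListener("load", clearBlur, { once: true });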


Do modern browsers even support progressive display of JPEGs? I have not seen it happen in years and years.


In the Blink rendering engine, partial rendering of a progressive jpeg is pretty much the lowest-priority task for CPU time. That means unless you have a very slow network and a very fast CPU, you will never see half-rendered progressive jpegs.

That's the right design decision, because rendering a progressive jpeg repeatedly is rather expensive - as well as redecoding the whole jpeg, you also need to re-render any text or effects overlaying the jpeg, and recomposite the frame. And then you're gonna have to do all that work again when a few more bytes of progressive jpeg arrive from the network...


I expect they do. It might just be that people aren't often encoding JPEGs such that they can take advantage of the support (or that your link is generally fast enough that you don't see an intermediate render of the data).


Yes.


If the full images are large, and you have a number of them on the page (a gallery page, for instance), some will remain empty boxes until others are completely loaded, as the browser limits the number of concurrent connections to each host.

IIRC the limit is most commonly 6. So if you have 7 or more images, the 7th will not even partially download until one of the first 6 has downloaded in full.

With this method, the small blurred images will load before any of the larger ones, so you should see them all almost immediately (absolutely immediately if encoded in the HTTP stream as data rather than retrieved with an extra HTTP request), before any of the larger images start to transfer.


In addition to the aesthetic effect, this is available immediately, while progressive JPEG will still require additional data to be loaded from the server.


It's cool, I have no use for it but it's cool.


The accompanying blog post is unfortunately unreadable with an ad blocker on that blocks third-party trackers from Google Analytics, Facebook, and New Relic. It hijacks scrolling and continually scrolls back to the top. (Mac Safari + 1Blocker + PiHole.)


Interesting bug on the demo page:

- Upload any image

- Now upload a transparent png

- It does not clear the previous image, layering them together

- It hashes the combination picture. Super neat!


Any ideas how this would impact SEO/PageSpeed? If you are doubling the number of pictures being loaded, albeit with a tiny placeholder for each thumbnail, surely this would double the number of image fetches - not from a size perspective, but from a network perspective.


The whole idea here is that you do not need to fetch an extra image. The image is generated in code on the client side from the short string. So there is no network connection needed, and the placeholder is there instantly.


then there must be some computational cost for the client


Of course. The cost is fairly small, though, since you only need to generate very tiny thumbnails, and can then just scale them up.


What's the advantage of this over using a Gaussian?


The amount of data to send is minuscule.


> Your client then takes the string, and decodes it into an image that it shows while the real image is loading over the network.

That's not a "hash", that's an encoding. Hashes are one-way.


This /is/ one-way by virtue of throwing information away: you can't rebuild the original image. If you're thinking of preimage resistance (it must be hard to find an input that hashes to a given output) that's a property of cryptographic hashes, but not hashes in general (canonical example FNV).


A hash is any function that produces a fixed size output. It’s not required to be one-way, or to even change the input. You might be thinking of a “good cryptographic” hash. For that, it’s desirable to be one-way and to scramble the input, but that’s a specific subset of hash functions.

Edit: to the downvoter, please have a look through https://en.wikipedia.org/wiki/Hash_function


Can you show me then how you decode this string to the original image?



