
This is interesting for a lot of reasons. It’s almost like a compression scheme for text as opposed to an image. The question is, is it lossless or lossy? What is the Shannon entropy for this set of symbols?



This doesn't compress the text any more than reducing font size does. The Shannon entropy is about the frequency with which each symbol occurs, not about what the symbols look like. This is just an alternate font and does not actually change symbol frequency, so the Shannon entropy will be the same as any other font.

It's lossless unless you consider the possibility of symbol confusion, as it's pretty easy to produce strings that appear identical but are different, like how O and 0 are easily confused in many fonts.
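To make both points concrete, here is a minimal sketch. The glyph table is a hypothetical stand-in (Braille code points picked arbitrarily), not the actual symbol set under discussion, and `shannon_entropy` is just the empirical per-symbol entropy of one string. Under those assumptions, a one-to-one substitution round-trips exactly and leaves the symbol frequencies, and therefore the entropy, unchanged.

```python
from collections import Counter
from math import log2


def shannon_entropy(text):
    """Empirical entropy in bits per symbol, from symbol frequencies only."""
    counts = Counter(text)
    n = len(text)
    return sum(c / n * log2(n / c) for c in counts.values())


# Hypothetical "alternate font": each Latin letter maps to exactly one glyph.
# Braille patterns are used purely as placeholder glyphs (an assumption).
LATIN = "abcdefghijklmnopqrstuvwxyz"
GLYPHS = [chr(0x2800 + i) for i in range(1, 27)]
ENCODE = dict(zip(LATIN, GLYPHS))
DECODE = {glyph: letter for letter, glyph in ENCODE.items()}


def encode(text):
    # Characters outside the table (spaces, digits) pass through unchanged.
    return "".join(ENCODE.get(ch, ch) for ch in text)


def decode(text):
    return "".join(DECODE.get(ch, ch) for ch in text)


message = "the quick brown fox jumps over the lazy dog"
encoded = encode(message)

# Lossless: decoding recovers the original letter for letter.
print(decode(encoded) == message)

# Same symbol frequencies, so the same entropy before and after.
print(shannon_entropy(message), shannon_entropy(encoded))
```

The symbol-confusion caveat is outside what this models: if two glyphs look alike to a reader, the mapping is still one-to-one on paper, but the human decoding step is no longer reliable.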


It's definitely lossless (it encodes letter-for-letter). However, it has much higher entropy than the Latin alphabet.


> higher entropy

How so? The entropy looks (dangerously) low to me.


High entropy = very low margin for error.

"aaaaaaaaaaaaaaaaaaaaaaaa" is low entropy, "JkjgpUBn74AREExy" is high.
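A rough way to put numbers on those two strings, as a sketch only: this measures the empirical per-character entropy of each string from its own character counts. Strictly speaking, entropy is a property of the source that generates the text, so treat the figures as illustrative.

```python
from collections import Counter
from math import log2


def shannon_entropy(text):
    """Empirical entropy in bits per character, from character counts."""
    counts = Counter(text)
    n = len(text)
    return sum(c / n * log2(n / c) for c in counts.values())


low = "aaaaaaaaaaaaaaaaaaaaaaaa"
high = "JkjgpUBn74AREExy"

# One distinct character: every position is fully predictable.
print(shannon_entropy(low))   # 0.0

# 15 distinct characters in 16 positions (only "E" repeats).
print(shannon_entropy(high))  # 3.875
```

The maximum for a 16-character string is 4 bits per character (all characters distinct), so the second string is close to the ceiling, while the first carries no information per character at all.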


Well I guess I am more used to the physicist’s definition of entropy, which can be described as “the higher the entropy, the less we care about the details.”


Very possibly, but, as far as I know, in the information-theoretic sense (which is what we mean when we talk about alphabets and mistaking one letter for another) "high entropy" means "low redundancy".



