
Here's a frequency chart of the first 3 images:

(23, 'comma') (10, 'capital-i') (8, 'slash-slash') (7, 'divided-by') (7, 'capital-l') (6, 'ex') (5, 'slash-slash-backslash-backslash') (5, 'slash-slash-backslash') (5, 'equals') (3, 'plus') (3, 'minus-dot') (3, 'equivalent') (3, '11-over-1') (2, 'zee') (2, 'vertical-line') (2, 'three-peaks') (2, 'slash-backslash-backslash') (2, 'slash') (2, 's-tac-toe') (2, 'minus') (2, 'lower-j') (2, 'leaning-heart') (2, 'c-slash-slash') (1, 'y-slash-slash') (1, 'upsidedown-t') (1, 'u-bar') (1, 'three-horizontal-two-vertical') (1, 'squared-capital-n') (1, 'square-c') (1, 'slash-i') (1, 'seven') (1, 'script-s') (1, 'script-j') (1, 'plus-dot') (1, 'parallel-lines') (1, 'minus-lower-dot') (1, 'lower-d') (1, 'l-on-l') (1, 'l-in-l') (1, 'j') (1, 'gamma') (1, 'four') (1, 'equals-slash') (1, 'crap') (1, 'close-bracket') (1, 'capital-z') (1, 'capital-t') (1, 'capital-m') (1, 'capital-f') (1, 'capital-b') (1, 'capital-a') (1, 'c-omega') (1, 'backslash') (1, '1-slash-1')



I did some very rough frequency analysis using this last night, but didn't get very much from it.

The comma symbol is more frequent than any letter usually is in English, but given the small corpus that's not too telling. It could stand for an 'e', or the coded text could be lists and they're just commas.
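To put a rough number on "not too telling", here is a quick sketch using only the counts from the chart above. The 12.7% figure for 'e' is an approximate, commonly quoted value for English, and treating each symbol as an independent draw is a simplifying assumption of mine, not something the text guarantees.

```python
import math

# Counts transcribed from the frequency chart above, largest first.
# Only the counts matter here, so the symbol names are dropped.
counts = [23, 10, 8, 7, 7, 6, 5, 5, 5, 3, 3, 3, 3] + [2] * 10 + [1] * 31

total = sum(counts)
comma_share = counts[0] / total  # the comma is the most frequent symbol

# Approximate share of 'e' in typical English text (assumed value).
ENGLISH_E_SHARE = 0.127

# Standard error of a proportion if each symbol were an independent draw
# with probability ENGLISH_E_SHARE (a simplifying assumption).
std_err = math.sqrt(ENGLISH_E_SHARE * (1 - ENGLISH_E_SHARE) / total)
z = (comma_share - ENGLISH_E_SHARE) / std_err

print(f"symbols counted: {total}")
print(f"comma share: {comma_share:.1%}, English 'e': about {ENGLISH_E_SHARE:.1%}")
print(f"gap in standard errors: {z:.2f}")
```

If the gap comes out at only a standard error or two, the comma being "too frequent" is well within what a sample this small could produce by chance.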

Someone commented on the article that he suspects the 'divided by' symbol might stand for 'i' due to its placement, which agrees roughly with the position it gets in the frequency table. Someone else has suggested that the language being masked might not be English, which is an intriguing possibility.

The frequencies aren't flat, which suggests it's either not a very good homophonic cipher (he just threw some odd replacements and codeword symbols in there, so it's basically still a substitution cipher) or a very good one (he consciously aimed at misleading symbol frequencies).
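One way to measure "not flat" is the index of coincidence, the chance that two symbols picked at random from the text are the same. This is a sketch over the chart's counts only. The comparison values are assumptions: a perfectly flat distribution over k symbols gives about 1/k, and English letters are usually quoted at around 0.067.

```python
# Counts transcribed from the frequency chart above, largest first.
counts = [23, 10, 8, 7, 7, 6, 5, 5, 5, 3, 3, 3, 3] + [2] * 10 + [1] * 31


def index_of_coincidence(counts):
    """Probability that two symbols drawn without replacement are identical."""
    n = sum(counts)
    return sum(c * (c - 1) for c in counts) / (n * (n - 1))


ic = index_of_coincidence(counts)
flat_ic = 1 / len(counts)   # what a perfectly flat homophonic cipher would aim for
ENGLISH_IC = 0.067          # approximate value for English letters (assumed)

print(f"observed: {ic:.4f}  flat over {len(counts)} symbols: {flat_ic:.4f}  English: {ENGLISH_IC}")
```

A value well above the flat baseline would fit the "not a very good homophonic cipher" reading, though with this little text I wouldn't lean on it too hard.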

The rough nature of the writing (also discussed on the article) suggests that the code was probably memorised, and thus not the result of a very laborious method.


There are definitely more symbols than there are letters in the alphabet. So that could mean he has multiple ciphers or that some symbols are substitutes for words, possibly common phrases.


Though I was advocating this as a possibility, I should point out that he might include some actual punctuation in the cipher as well as the standard 26 characters, so it's still possible that it's just a substitution cipher.
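For a sense of scale, here is a small sketch that counts distinct symbols in the chart and splits off the ones seen only once. The punctuation allowance is a made-up number purely for illustration, and some of the one-off symbols could just be transcription variants of others.

```python
# Counts transcribed from the frequency chart above, largest first.
counts = [23, 10, 8, 7, 7, 6, 5, 5, 5, 3, 3, 3, 3] + [2] * 10 + [1] * 31

LETTERS = 26
PUNCTUATION_ALLOWANCE = 10  # hypothetical: room for a handful of punctuation marks

distinct = len(counts)
singletons = sum(1 for c in counts if c == 1)
repeated = distinct - singletons

print(f"distinct symbols: {distinct}")
print(f"seen only once:   {singletons}")
print(f"seen 2+ times:    {repeated}")
print(f"fits letters plus punctuation? {distinct <= LETTERS + PUNCTUATION_ALLOWANCE}")
```

If the symbols that repeat number fewer than 26 on their own, a plain substitution cipher plus some noise, rare codewords, or misread glyphs stays on the table.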

Additionally, it's possible that some of these characters aren't characters at all, but common repetitions. I'm thinking particularly of those slashes and backslashes.
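To see what the slash idea would do to the counts, here is a sketch that takes the chart entries whose labels consist only of "slash" and "backslash" parts and tallies the individual strokes. The labels are just my reading of the chart's naming scheme, so splitting on "-" is an assumption about how those symbols are composed.

```python
from collections import Counter

# Subset of the chart above: labels built only from "slash"/"backslash" parts.
stroke_symbols = {
    "slash-slash": 8,
    "slash-slash-backslash-backslash": 5,
    "slash-slash-backslash": 5,
    "slash-backslash-backslash": 2,
    "slash": 2,
    "backslash": 1,
}

# Tally individual strokes, weighting each label by how often it occurs.
strokes = Counter()
for label, count in stroke_symbols.items():
    for part in label.split("-"):
        strokes[part] += count

occurrences = sum(stroke_symbols.values())
print(f"{len(stroke_symbols)} stroke-only symbols, {occurrences} occurrences")
print(dict(strokes))
```

If these are really runs of one or two basic strokes, six chart entries collapse into two, and the group as a whole becomes one of the most common things in the text.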



