
You’re right! I’m surprised I didn’t know that. It looks like it can also be UCS-2, going by the spec:

> A conforming implementation of this International standard shall interpret characters in conformance with the Unicode Standard, Version 3.0 or later and ISO/IEC 10646-1 with either UCS-2 or UTF-16 as the adopted encoding form, implementation level 3. If the adopted ISO/IEC 10646-1 subset is not otherwise specified, it is presumed to be the BMP subset, collection 300. If the adopted encoding form is not otherwise specified, it is presumed to be the UTF-16 encoding form.



UCS-2 is the older, fixed-width predecessor of UTF-16. It has no surrogate pairs, so it can't represent anything outside the BMP (code points above U+FFFF), which means that rarer symbols and emoji don't work.
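To make the surrogate pair point concrete, here's a rough Python sketch. The helper names `utf16_code_units` and `fits_in_ucs2` are made up for illustration and aren't from the spec or any library. It just shows that a BMP character takes one 16-bit code unit in UTF-16, while an emoji takes two, which is exactly the part UCS-2 has no way to express.

```python
def utf16_code_units(text):
    """Return the 16-bit UTF-16 code units that encode `text`."""
    raw = text.encode("utf-16-be")  # big-endian, no byte order mark
    return [int.from_bytes(raw[i:i + 2], "big") for i in range(0, len(raw), 2)]


def fits_in_ucs2(text):
    """True if every character is in the BMP, so no surrogate pair is needed."""
    return all(ord(ch) <= 0xFFFF for ch in text)


# "A" (U+0041) is in the BMP: a single code unit.
print([hex(u) for u in utf16_code_units("A")])           # ['0x41']

# U+1F600 (an emoji) is outside the BMP: a high and a low surrogate.
print([hex(u) for u in utf16_code_units("\U0001F600")])  # ['0xd83d', '0xde00']

print(fits_in_ucs2("A"))            # True
print(fits_in_ucs2("\U0001F600"))   # False
```

An implementation that treats strings as UCS-2 would see those two surrogates as two separate (and individually meaningless) characters, which is why the length of a single emoji comes out as 2 in such environments.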



