
Conversion between different character codes became much easier. (Å (U+00C5) is canonically the same as Å (U+212B), but the latter exists only for round-trip compatibility.)

Unicode defines normalization algorithms (is é its own character, or e with a modifier character?).
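For anyone who wants to poke at this, here is a minimal sketch using Python's standard `unicodedata` module. The variable names are mine, chosen for illustration; it only shows that the two spellings of é (and the two Å code points mentioned above) compare unequal as raw strings but agree after normalization.

```python
import unicodedata

# Two spellings of "é": one precomposed code point, or "e" plus a combining accent.
composed = "\u00e9"      # é as a single code point
decomposed = "e\u0301"   # e followed by U+0301 COMBINING ACUTE ACCENT

# As raw code point sequences the two strings differ.
raw_equal = composed == decomposed

# NFC composes where possible; NFD decomposes where possible.
nfc = unicodedata.normalize("NFC", decomposed)
nfd = unicodedata.normalize("NFD", composed)

# The Angstrom sign (U+212B) normalizes to the ordinary letter Å (U+00C5).
angstrom_sign = "\u212b"
letter_a_ring = "\u00c5"
angstrom_nfc = unicodedata.normalize("NFC", angstrom_sign)

print(raw_equal, nfc == composed, nfd == decomposed, angstrom_nfc == letter_a_ring)
```

So the answer to "is é its own character, or e with a modifier?" is "either, and you pick a normalization form before comparing".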

I can have a document which combines English, Russian, Arabic, and Chinese, and expect it to be readable and editable by many different tools.




> I can have a document which combines English, Russian, Arabic, and Chinese

English combines with top-down Chinese, and English combines with right-to-left Arabic, but top-down Chinese and right-to-left Arabic don't combine properly in the same document using Unicode -- the Arabic will be written bottom-up instead of top-down when embedded in the top-down Chinese.


Layout is indeed hard.

I meant something simpler, like: The word 'computer' in English is 计算者 [jì suàn zhě] in Chinese, Компьютер in Russian, and حاسوب in Arabic.

Try that without Unicode.
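To make that concrete, here is a rough sketch in Python. The helper `can_encode` and the particular list of legacy codecs are my own picks for illustration, not anything from a spec; the point is only that each pre-Unicode encoding covers some of the scripts in that sentence but not all of them, while UTF-8 round-trips the whole thing.

```python
# The example sentence's four scripts in a single string.
text = "computer 计算者 Компьютер حاسوب"

def can_encode(s, codec):
    """Return True if every character of s is representable in codec."""
    try:
        s.encode(codec)
        return True
    except UnicodeEncodeError:
        return False

# A few single-script legacy encodings (Western, Cyrillic, Arabic, Chinese).
legacy_codecs = ["latin-1", "koi8_r", "cp1256", "gb2312"]
results = {codec: can_encode(text, codec) for codec in legacy_codecs}

# UTF-8 can carry the whole string and give it back unchanged.
round_trip = text.encode("utf-8").decode("utf-8")

print(results)
print(round_trip == text)
```

Every legacy codec in the list fails on at least one of the four words, which is the situation multilingual documents were in before Unicode.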

It's of course possible with TeX, and no doubt other solutions. Which is why I added "and expect it to be readable and editable by many different tools".

(As a real-world use case, look at Knuth's "The Art of Computer Programming" and see how he credits people using their full names, in their own written language.)


Do you know when you would use 计算着 over 电脑[dian nao]? 计算着 I guess more literally translates to "one who computes", whereas 电脑 translates to "electric brain" which is a way more fun image, but I have no idea how the usage varies.


I am not a native speaker and am terrible at googling this, but I once heard that 電腦 was originally the Taiwanese way of saying it and 計算機 the Chinese way, in the same way that 'program' is still called 程式 on one side of the strait and 程序 on the other, or 'internet' is either 網絡 or 網路. The names of movies and video games are also generally different (PRC, HK, TW).

計算機 is a calculator, not a computer, in Taiwan: https://tw.images.search.yahoo.com/search/images?p=計算機

Compare with Baidu in the PRC: http://image.baidu.com/search/index?tn=baiduimage&ie=utf-8&w...

Edit: Oops, I didn't even realise that the parent poster asked about 計算著. That one I've never seen.


计算机 [jisuanji] is the older usage that harks back to the days when computers were mainframes and terminals and not in every household. Generally everyone says 电脑 [diannao] these days.


计算机 (jisuanji) would be a hardware computer, 计算着 (jisuanzhe) computing as an ongoing action, and 计算者 (jisuanzhe) a computer in the sense of one who computes.


No clue. I did a copy and paste.


Isn't that a problem with the text rendering engine rather than with Unicode? Or is this indeed a spec bug?


Right-to-left script rendering is part of the Unicode spec, whereas top-down layout for Chinese is a font issue.
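You can see the horizontal half of this from Python's standard `unicodedata` module, which exposes each character's bidirectional class. This is just a small sketch with sample characters I picked myself; it shows that the direction property only distinguishes left-to-right from right-to-left, so a Chinese character is plain `L` and nothing in this property says "top-down".

```python
import unicodedata

# One sample character per script (illustrative picks, not from the thread).
samples = {
    "English": "c",
    "Chinese": "计",
    "Russian": "К",
    "Arabic": "ح",
}

# "L" is strong left-to-right; "AL" is an Arabic letter (strong right-to-left).
bidi_classes = {name: unicodedata.bidirectional(ch) for name, ch in samples.items()}

for name, cls in bidi_classes.items():
    print(name, cls)
```

Only the Arabic letter comes back with a right-to-left class; the other three are all `L`.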



