BINGO. But because of Han unification I all of a sudden DO need to know the lang...

astrange · on March 17, 2015

In what situations do you need to do this, but don't need to show any other data (dates and times, localized UI, user timezone, culturally appropriate fonts, RTLness) that involves knowing the user's languages and locale?

This can happen if the user is intentionally reading mixed-language text or text not in their computer's UI language, of course. In that case different CJK languages also have different preferred fonts, so having language tagging or just guessing is pretty important.

com2kid · on March 17, 2015

> In what situations do you need to do this, but don't need to show any other data (dates and times, localized UI, user timezone, culturally appropriate fonts, RTLness) that involves knowing the user's languages and locale?

For drawing a given glyph, there is normally a lookup into a font table that involves solely the string of Unicode code points coming in.

Except if any of the characters are in the CJK Unified Ideographs range. Then my function call suddenly has to jump out to read environment variables, which are hopefully set up correctly.

My code to do a lookup into a font file should not depend upon the user's environment variables because of a space-saving optimization made two decades ago.
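A minimal sketch of the special case being described, assuming a renderer that wants to know up front whether a string can be drawn from code points alone or needs a language hint as well. The function names are hypothetical, and the range list covers only the main block plus Extensions A and B; it is not exhaustive.

```python
# Hypothetical helper: these names are illustrative, not any library's API.
# Ranges are Unicode block boundaries; later extensions are omitted for brevity.
CJK_UNIFIED_RANGES = [
    (0x4E00, 0x9FFF),    # CJK Unified Ideographs
    (0x3400, 0x4DBF),    # CJK Unified Ideographs Extension A
    (0x20000, 0x2A6DF),  # CJK Unified Ideographs Extension B
]


def is_unified_han(ch: str) -> bool:
    """Return True if the single character ch falls in a listed unified Han range."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in CJK_UNIFIED_RANGES)


def needs_language_hint(text: str) -> bool:
    """Return True if any character in text is a unified Han ideograph,
    i.e. the code points alone do not say which regional form is expected."""
    return any(is_unified_han(ch) for ch in text)
```

With something like this, the lookup for Latin or kana text stays a pure function of the code points, and only strings for which `needs_language_hint` returns True require external language information.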

astrange · on March 17, 2015

> For drawing a given glyph, there is normally a lookup into a font table that involves solely the string of Unicode code points coming in.

Why are you implementing OpenType? It's got working libraries already.

But if you are getting into that, glyphs in a font are stored by "glyph name", not necessarily by code point. There's a bunch more steps than that.

- Font substitution: Find fonts that cover every character in the text. The order of your search list depends on the language.

- Text layout and line breaking: for best results, you don't want to line break in the middle of a word, and you need to place punctuation on the correct side of right-to-left sentences. I think both of these need dictionaries.

- Choosing individual glyphs: it's complicated! http://ilovetypography.com/OpenType/opentype-features.html

You have to read the GSUB tables and implement a bunch of expected features, like ligatures, automatic fractions, beginning-of-word special forms (see Zapfino), &c. This includes language-specific glyphs, but fonts can also just choose glyphs with a random number generator.

- Drawing the glyph. Remember not to draw each one individually, or a translucent line of overlapping characters (like in Indian languages) will look bad.

Each glyph actually comes with a custom program to do the hinting! It's even more complicated: https://developer.apple.com/fonts/TrueType-Reference-Manual/...

Luckily I don't think it depends on much external state.
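The font substitution step in the list above can be sketched roughly as follows. This assumes a per-language fallback order and a precomputed coverage map from font name to supported code points; the font names, the `FALLBACK_ORDER` table, and both functions are hypothetical, not how any particular text stack does it.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical font names and search orders, for illustration only.
FALLBACK_ORDER: Dict[str, List[str]] = {
    "ja": ["SansJP", "SansSC", "SansTC"],
    "zh-Hans": ["SansSC", "SansTC", "SansJP"],
    "zh-Hant": ["SansTC", "SansSC", "SansJP"],
}
DEFAULT_ORDER: List[str] = ["SansSC", "SansJP", "SansTC"]


def pick_font(ch: str, lang: Optional[str],
              coverage: Dict[str, Set[int]]) -> Optional[str]:
    """Return the first font in the language's search order that covers ch,
    or None if no font in the list covers it."""
    order = FALLBACK_ORDER.get(lang, DEFAULT_ORDER)
    for font in order:
        if ord(ch) in coverage.get(font, set()):
            return font
    return None


def split_into_runs(text: str, lang: Optional[str],
                    coverage: Dict[str, Set[int]]) -> List[Tuple[Optional[str], str]]:
    """Group consecutive characters that resolve to the same font into runs."""
    runs: List[Tuple[Optional[str], str]] = []
    for ch in text:
        font = pick_font(ch, lang, coverage)
        if runs and runs[-1][0] == font:
            runs[-1] = (font, runs[-1][1] + ch)
        else:
            runs.append((font, ch))
    return runs
```

The same code point can resolve to a different font depending on `lang`, which is the language dependence being discussed. Shaping, line breaking, and drawing would all happen after this step.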

cplease · on March 17, 2015

Sorry, Han glyphs render the same in Chinese and Japanese.

Regarding simplified versus traditional, no one is seriously unifying those.

There are some minor disagreements as to when a minor stylistic or historical variant deserves a separate glyph, but this isn't about rendering different glyphs in Chinese or Japanese. If Unicode is doing its job, no one should have difficulty reading unified Han characters in one font regardless of language.