
I know absolutely no Japanese letters. And FWIW, I also know absolutely no Chinese/Mandarin (btw are they the same?), Korean etc.

I speak two languages — English and another language you probably haven't heard of that does not use English-looking letters (such as "a", "b" etc).

Anyway, provided I'm told that a given text is Japanese, I can tell if a letter is Kanji, Katakana, or Hiragana.

Which I think is commendable — if I may say so myself — given I didn't even know about the existence of "Kanji", "Katakana", or "Hiragana" until 3 years ago.



Mandarin is a spoken form of Chinese. So Mandarin and Cantonese, for example, share the same character set (Chinese). There's also traditional vs simplified Chinese, technically; they're similar with some differences in how they're written. Japanese also has some differences which can be confusing because a lot of application developers just use the Chinese character sets for kanji: https://heistak.github.io/your-code-displays-japanese-wrong/

One other thing about Chinese is that TV shows and movies usually have hard-coded subtitles while airing because the spoken versions of the language can be very different from one another. Including the text makes it more accessible to a wider audience regardless of what "dialect" is spoken in the show.


> So Mandarin and Cantonese, for example, share the same character set (Chinese).

This isn't really true; there are characters that are exclusive to one or the other, like 冇. The stronger political position of Mandarin means that its idiosyncratic characters are viewed as "real" while the idiosyncratic characters required by other languages aren't, but it's a fundamentally symmetric situation.

> One other thing about Chinese is that TV shows and movies usually have hard-coded subtitles while airing because the spoken versions of the language can be very different from one another.

The written versions of the language are also that different. The subtitles are in Mandarin, which everyone must learn to read.

(How common are hard-coded subtitles in modern Chinese media? They're on the older stuff, but it seems like a lot of modern shows don't bother.)


It's hard to acknowledge the massive cultural imperialism the PRC has engaged in with regard to Mandarin without getting "corrected" by people with an agenda. It's amazing how "fun facts" like "Everyone in China speaks a dialect of Chinese!" (they're vastly different languages, like how German and Italian aren't dialects of European) and "Everyone in China, no matter what dialect they speak, can pass notes!" (because everyone is forced to learn to read Mandarin no matter what language they actually speak) implicitly support that imperialism.


> they're vastly different languages, like how German and Italian aren't dialects of European

They're different, but not that different. If we believe Wikipedia ( https://en.wikipedia.org/wiki/Varieties_of_Chinese ):

> [Varieties of Min] form the only branch of Chinese that cannot be directly derived from Middle Chinese.

So the divergence between Mandarin and Cantonese [neither belonging to the Min branch] could be dated back maybe 1500 years. The divergence between German and Italian is much, much older than that.

English and Swedish would make a more apt comparison than German and Italian.


This might be true if the speakers were equally distant from each other and had an equal amount of contact with each other.

I've spoken Mandarin for over 20 years, studied French and Spanish for a few years each and also learned a bit of Cantonese and a bit of Taiwanese. In my subjective experience, French and Spanish are by far the closest of any two of those languages. Cantonese and Taiwanese would be the next closest and Mandarin is considerably further from either than they are from each other.


> In my subjective experience, French and Spanish are by far the closest of any two of those languages.

This is precisely what you would expect from the divergence times.

> Cantonese and Taiwanese would be the next closest and Mandarin is considerably further from either than they are from each other.

This isn't; Taiwanese is the outgroup to the more closely related pair of Mandarin/Cantonese.

It's always possible that learning "a bit of Cantonese and a bit of Taiwanese" doesn't give you a good grasp of what's going on in Cantonese and Taiwanese.


> I also know absolutely no Chinese/Mandarin (btw are they the same?)

If you see a bare reference to "Chinese", it will be referring to Mandarin.

There are several families of Chinese languages, with the most prominent being:

- 官话 (literally "mandarin speech"; Mandarin belongs to this family)

- 粤语 ("Yue language"; Cantonese belongs to this one)

- 吴语 ("Wu language"; Shanghainese belongs here)

- 闽南语 ("Southern Min language"; Hokkien belongs here)

It's also possible (and routine) to just label a variety of speech with the name of the place where it is spoken. This gives the names of specific dialects rather than broad families:

- 普通话 ("ordinary speech", this is Mandarin. It is conceived of as being an artificial standard rather than being specifically associated with any particular place.)

- 广东话 ("Cantonese", the speech of 广东, which is a large province)

- 上海话 ("Shanghainese", the speech of 上海, which, while admittedly a large city, is not as large as a province)

- 福建话 ("Hokkien", the speech of 福建, also a large province)

- 北京话 ("Beijingnese", again just a city. This construction applies at pretty much every scope.)

普通话 is the official name of Mandarin, but there are some important synonyms:

- 汉语 ("Han language" - Han is the ethnic term for the Chinese people. As noted, synonymous with Mandarin.)

- 中文 ("Chinese" - here the term is taken from the name of China, 中国. Also synonymous with Mandarin.)


> Anyway, provided I'm told that a given text is Japanese, I can tell if a letter is Kanji, Katakana, or Hiragana.

Sometimes harder than you might think.

e.g. consider:

*ロ口

*タ夕

*かカ力

*りリ (edit: these are more similar in the monospace font in the HN input box than in the rendered comment)


There are some lookalikes, though. 夕べ is easily confused with katakana タベ (tabe), but is actually kanji 夕 (yuu) + hiragana べ (be). Hence it is more commonly written 昨夜 using kanji only.
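
For anyone who wants to check the lookalikes above by machine rather than by eye, here is a rough Python sketch. It assumes that the prefix of a character's Unicode name is a good enough signal for its script, which holds for the ordinary kana and CJK Unified Ideographs but not for every edge case (compatibility ideographs, for instance, fall through to "other"). The helper names `classify` and `scripts` are made up for this example.

```python
import unicodedata


def classify(ch: str) -> str:
    """Guess the script of one character from its Unicode name (heuristic)."""
    name = unicodedata.name(ch, "")  # "" for unnamed code points
    if name.startswith("HIRAGANA"):
        return "hiragana"
    if name.startswith("KATAKANA"):
        return "katakana"
    if name.startswith("CJK UNIFIED IDEOGRAPH"):
        return "kanji"
    return "other"


def scripts(text: str) -> list[str]:
    """Classify every character of a string, in order."""
    return [classify(ch) for ch in text]


# The lookalike groups from the thread, written as escapes so the
# code points are unambiguous regardless of font.
LOOKALIKES = [
    "\u30ed\u53e3",        # katakana ro, kanji "mouth"
    "\u30bf\u5915",        # katakana ta, kanji "evening"
    "\u304b\u30ab\u529b",  # hiragana ka, katakana ka, kanji "power"
    "\u308a\u30ea",        # hiragana ri, katakana ri
]

if __name__ == "__main__":
    for group in LOOKALIKES:
        for ch in group:
            print(f"U+{ord(ch):04X}  {classify(ch):8}  {unicodedata.name(ch)}")

    # kanji + hiragana ("yuube") versus the all-katakana lookalike ("tabe")
    print(scripts("\u5915\u3079"))
    print(scripts("\u30bf\u30d9"))
```

The characters that look alike on screen sit in entirely different Unicode blocks, so a program has no trouble telling them apart even where a reader might.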


In practice, though, you'll never confuse them, because katakana is only used for non-Japanese words. 昨夜 sounds a bit too formal if you're chatting with friends.


I mean, it depends. Fiction at least will sometimes use katakana for emphasis or to point out a word that sounds different, a bit like using all caps in English, except a little bit more common. I've certainly seen タベナ as an instruction to eat in a manga recently.


> another language you probably haven't heard of that does not use English-looking letters

Out of curiosity, which language is it? And which writing system are you talking about?



