I learned hiragana and katakana by drilling RealKana [0], together with the Flash-based Drag-n-Drops [1] back when Flash was still a thing. It looks like the original author of [1] has non-Flash versions now [2].
RealKana is great for drilling recognition (kana -> pronunciation), and I learned the individual kana by loading one column at a time (e.g. ka/ki/ku/ke/ko). Drag-n-Drop is good for speed-testing recognition, but also for recall (pronunciation -> kana), since it's invertible: you can either choose a pronunciation and go find its kana, or choose a kana and find which pronunciation it goes to.
I'm not sure you can really "learn" these characters without drilling like this -- the symbols are arbitrary, so you have to memorize the association between glyph and reading by brute force.
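The two drill directions described above (recognition and recall) can be sketched with a single lookup table that is simply inverted for the second mode. This is only an illustrative sketch, not how RealKana or the Drag-n-Drop pages are actually implemented; the names `K_COLUMN`, `invert`, `make_question`, and `check` are made up for the example.

```python
import random

# One column of the chart (the k-column), matching the
# "one column at a time" approach: kana -> pronunciation.
K_COLUMN = {"か": "ka", "き": "ki", "く": "ku", "け": "ke", "こ": "ko"}


def invert(table):
    """Swap keys and values: pronunciation -> kana (the recall direction).

    This is only safe because every kana in the table has a unique reading.
    """
    return {reading: kana for kana, reading in table.items()}


def make_question(table, rng):
    """Pick a random prompt from the table and return (prompt, expected answer)."""
    prompt = rng.choice(sorted(table))
    return prompt, table[prompt]


def check(table, prompt, answer):
    """Return True if the answer matches the table entry for the prompt."""
    return table[prompt] == answer.strip()


if __name__ == "__main__":
    rng = random.Random()
    recognition = K_COLUMN        # see a kana, type its reading
    recall = invert(K_COLUMN)     # see a reading, type its kana
    for mode in (recognition, recall):
        prompt, expected = make_question(mode, rng)
        print(prompt, "->", expected)
```

Loading another column is just a matter of swapping in a different table, so the same two functions cover both drill modes.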
Hiragana and katakana are just the very easy parts of written Japanese - it's merely like learning Greek or Cyrillic alphabets.
Personally I picked up a Hiragana/Katakana chart image online and put it as my wallpaper, so I could easily decode a word when given the chance (best with 2 screens). You can learn them over a few months like that without trying hard.
However, unlike Western writing systems, there's also a third thing which is not an alphabet: Kanji. And this is a world of troubles, because Kanji is literally FUBAR. So much so that people talk about dropping it every now and then. Korea used to use Chinese characters too; they eventually (mostly) dropped them in favor of something simpler.
> So much so that people talk about dropping it every now and then.
They did consider it in the 20th century when Japan was modernizing and catching up with the Western world. Nowadays nobody takes you seriously if you propose abolishing it. The debate was settled in 1962-1966 [1] when the government officially gave up any attempt to remove kanji from the Japanese writing system. It’s very much a fringe idea now.
[1]: See 国語国字問題 on Japanese Wikipedia. The full quote is 漢字仮名交じり文が審議の前提。漢字全廃は考えられない。
How would you read Japanese without kanji? The lack of a space character makes all hiragana sentences far less legible than sentences with kanji which act as word boundaries.
Japanese words are lengthy compared to English as well. Kanji packs more content into less space.
I agree that the existence of “rare kanji” is a bit strange, but 8,000 or so characters seems reasonable, especially with so many compound kanji.
>How would you read Japanese without kanji? The lack of a space character makes all hiragana sentences far less legible than sentences with kanji which act as word boundaries.
They could just start using spaces. They already use the space bar anyway to cycle through the candidate kanji for what they've typed in kana or Latin letters.
In Latin-script text we recognize words as "symbols" of the language. Words might be composed of characters, but you don't read them one character at a time. You see the start, the shape, and the end of a word. Based on that you know what word it is and what it means. You can raed scrmabled text withuot isseus.
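For anyone who wants to try the scrambled-text effect on their own sentences, here is a small sketch that keeps the first and last letter of each word and shuffles only the interior. The function names are made up for the example, and it only handles plain ASCII letters.

```python
import random
import re


def scramble_word(word, rng):
    """Shuffle the interior letters; the first and last letter stay in place."""
    if len(word) <= 3:
        return word  # nothing to shuffle in words this short
    middle = list(word[1:-1])
    rng.shuffle(middle)
    return word[0] + "".join(middle) + word[-1]


def scramble_text(text, seed=0):
    """Scramble every run of ASCII letters, leaving spaces and punctuation alone."""
    rng = random.Random(seed)
    return re.sub(r"[A-Za-z]+", lambda m: scramble_word(m.group(0), rng), text)


if __name__ == "__main__":
    print(scramble_text("You can read scrambled text without issues."))
```

The longer the interior of a word, the harder the result tends to be to read, which fits the point that it is the start, the overall shape, and the end that carry most of the recognition.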
Japanese would gain the same kind of recognition if it were only written in hiragana and katakana. It already has that anyway. You don't scrutinize whether it is the actual correct kanji. If it looks close enough it's good enough.
I think a bigger potential problem would be homophones. Japanese words seem to like using the same sounds for a lot of different meanings. (I wonder if that's why their language has so many English loan words.)
I think it's definitely possible, but it's up to them. Having a harder language isn't necessarily a bad thing.
> I think it's definitely possible, but it's up to them.
I’ve been following public discussions about the Japanese language in Japan for nearly forty years, and I think it’s safe to say that there is currently no significant advocacy for dropping kanji or drastically curtailing their use. Kanji are just too deeply embedded in the culture, economy, and educational system and in people’s linguistic identity.
The Japanese Wikipedia has an article about the history of the movement to switch to romaji [1]; the movement had some momentum up until the 1930s but was suppressed by the government during the Second World War. It was taken up again by the postwar Allied occupation [2], but nothing came of it. One of the last Japanese groups still advocating for a switch to romaji dissolved earlier this year after more than a century in existence, as the remaining members were too old to continue [3].
I suspect younger people have it easier than ever. Once you get a computer or cell phone, you only need to know how to read kanji and slam that autocomplete button.
AFAIK on electronic devices, they type in Hiragana and the autocomplete "translates" it into Kanji. People here point out that they cannot drop Kanji because of homophones, so I guess that autocomplete turns really fast into "autocorrupt" if you don't know your Kanji.
But I think electronic dictionaries are a great relief.
I'm sure what I'm going to write are not new insights but, to add to your point, generally we expect to consume information much quicker through reading than through hearing. When reading, say English, we don't really sound through each word but recognize the shape as you say. More concretely, it's probably our brains' pattern matching of frequently occurring clusters/patterns of letters: "psy" in "psychology" for example, if you spelt that "phonetically" as "saikolojee" I doubt people would recognize that.
I don't know Korean, but I've been told that it has plenty of Chinese loanwords (or Chinese-derived character compound "words") like Japanese, yet because it is less homophonous than Japanese, Korean people nowadays are very able to recognize such words by the shape of them when written in Hangul.
As a native Chinese speaker, Cantonese to be precise, my opinion is probably biased but I find Japanese to be so homophonous (for Sino-Japanese words 漢語 but even for native words 和語 to a degree) that even if Japanese adopted a more "efficient" syllabary like Hangul, words would still be difficult to decipher, spaces included. Pitch accent only helps so much (and of course that isn't written down in kana), the typical examples being 紙 ("kamì": paper), 髪 ("kamì": hair) and 神 ("kàmi": god). I'm trying to indicate the pitch accent in an ad hoc manner with the accented letters; note that according to my (native) Japanese dictionary, 紙 and 髪 have the same pitch accent. So I think a more phonetic writing system for Japanese that remains efficient for reading would end up annotating words with semantic or etymological hints, like the idiosyncratic spellings of Latin and Greek root words in English.
Also, the texts we read are allowed to use more sophisticated words, more literary words, with more complicated sentence structures. So while phonetic spelling ought to work to represent ordinary speech, that's not necessarily the case for general written expression. In practice, I find 漢語 in Japanese speech to be surprisingly difficult to comprehend, in the sense that, words are often too indistinct for me to pick them up by ear if I have not heard them in speech before, even if I have seen them in text before and would know the characters already (including their readings 音読み in Japanese).
That's different for me as a Cantonese speaker where I can pick up new literary compound words if their constituent characters are ones I know from other words/compounds, even rather infrequent ones. I would say it's because the sound system of Cantonese has 6 tones, mapping nearly one-to-one with the tones + voiced/voiceless distinction from Middle Chinese, which in turn is a much more monosyllabic language where characters have quite distinct sound values compared to any of the modern Chinese languages.
Incidentally I have a harder time picking up new terms in Mandarin by ear where many characters that are distinct in Cantonese sound the same; and it's widely agreed that Mandarin Chinese has evolved more disyllabic words to compensate. Again, yes, even when I know the compounds already through writing, learning to pattern-match for them in hearing in real time, has been, unfortunately, for me at least, a different story.
(Edit: added below)
I believe also that both Chinese and Japanese got more homophonous because they could get away with it (i.e. people got lazy in pronouncing more intricate sounds) when there was 漢字 to distinguish characters/words when needed. So there's certainly a feedback loop in there. If for the 2 to 3 thousand years of history there was no writing in ideographic characters, there would have been evolutionary linguistic pressure against too many homophones in the languages.
> When reading, say English, we don't really sound through each word but recognize the shape as you say
Apparently, some people do and some people don't; and each category is surprised the other exists. I'd be curious to know if the "visual" category is more prevalent for ideogram-based writing systems.
> my opinion is probably biased but I find Japanese to be so homophonous
It's difficult to evaluate what "so" means here, but it seems to me that homophony is made a bigger problem in this thread than it really is.
For instance, in French, we have words that have many homophones. There is "vert" (green), "ver" (worm), "verre" (glass), "vair" (a type of squirrel fur), "vers" (the preposition "to"), "vers" (verse), and "verrent"/"verre" (conjugations of a very rare verb that means "to pounce" - probably most French speakers don't even know it; if you use it in a sentence their speech recognition module will probably segfault).
There's also common homophones "père" (father) and "paire" (pair) and "pair" (peer as in P2P), "mère" (mother) and "mer" (sea), "serre" (greenhouse) and "serre" (talon) and "sert" (conjugation of serve), "je suis" ("I am") and "je suis" ("I follow").
Some of these homophones double as homographs, as you can see. They are all "strict" homophones BTW, as we don't have distinctions between short and long vowels like English has, for instance.
But both in written and spoken language, grammar and context usually disambiguate the meaning - if any. Based on that observation, it seems to me it wouldn't be more difficult to figure out which "kami" it is than which "vers" it is (worms, the preposition or verse), unless the sentence is specifically designed for that purpose.
Amusingly, the homophones "vair" and "verre" led to a small quarrel in the 19th century about what Cinderella's shoes were made of: fur or glass? People who want to show off sometimes bring that up, because "vair" is a rarely used word [1].
Interesting examples from that wicked language, thank you! (Studied French before but unfortunately lost interest.)
You're right about the context and grammar being usually sufficient to disambiguate. I thought about the examples 話す ("hanâsu"; to talk) and 離す ("hanâsu"; to separate) I gave earlier, and I don't remember ever confusing the two in speech dialogue. But it's probably that there are enough non-homophonous near-synonyms for these words in Japanese that would get used in practice, if we imagine contexts where a word could conceivably be confused with another homophone, e.g. 言う【いう】, 喋る【しゃべる】, 語る【かたる】 in the case of 話す【はなす】.
The above words are "native" Japanese words 和語. Definitely the problem of homophones is way less serious for those words than for Sino-Japanese vocabulary, 漢語.
I think Japanese is untypical in that it's a language with a limited repertoire of syllables adopting words from a language with a much richer system of sounds, Chinese, and trying to map the (compound) words character by character. By the way, that probably relates to why most learners find pronunciation of any variety of Chinese to be difficult; the language needs to make the necessary aural distinctions, including tone (famously), which are apparently subtle to non-native speakers.
There are countless two-syllable compound words in Chinese, a good portion being used in common speech, but of the ones adopted into Japanese, many of them turn into essentially two "syllables" also. (Actually, some modern terms are back-borrowings from Japanese coinages through writing, just so that I'm being fair and historically accurate in this comment.)
Of these terms, at least a few would be confused in speech in Japanese, so in practice the pronunciation gets mangled to disambiguate. Here are some examples:
私立 ("privately established (institution, organization, etc.)") versus 市立 ("municipally established") has the following disambiguation in common speech:
私立【しりつ】→【わたくしりつ】
市立【しりつ】→【いちりつ】
科学 ("science") versus 化学 ("chemistry") has the disambiguation:
化学【かがく】→【ばけがく】 (as if the word were 化け学 but it's never written that way)
Basically the native Japanese reading of a character 訓読み is substituted for the Sino-Japanese reading 音読み in speech, even though the latter is the proper, original reading for that character in the word in question.
For the reader's curiosity, there's also one case of pronunciation mangling that I know in Mandarin Chinese:
炎 ("inflammation") versus 癌 ("cancer"):
In Mandarin, we have 炎 yán, 癌 ái even though the sound value of 癌 ought to have been also yán, homophonous with 岩 according to the rules of sound change over the centuries. Contrast with the following:
Cantonese 炎 jim4, 癌 ngaam4, 岩 ngaam4
Japanese 炎 en エン, 癌 gan ガン, 岩 gan ガン
Basically it's the consequence of packing too much information onto individual syllables/characters while Western languages would simply devise longer words (with more syllables) for more sophisticated/technical concepts.
You have a very good point (so I'm not sure why you were downvoted).
I currently read Japanese at an upper-intermediate level. For example, I can read most of Harry Potter in Japanese and only look up a word every page or two.
The biggest problem with switching Japanese to "all kana" is the homophones. Many similar words sound the same but have different nuanced meanings, such as hear and listen -- 聞く and 聴く -- both pronounced "kiku". This happens so often that I think dropping kanji would be a disaster.
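To make the homophone point concrete, here is a small sketch that groups words by their kana reading and keeps only the readings shared by more than one written form. The word list is just a handful of pairs mentioned in this thread plus one non-homophone for contrast, not a real dictionary, and `homophone_groups` is a made-up helper name.

```python
from collections import defaultdict

# (written form, kana reading) pairs taken from examples in this thread.
WORDS = [
    ("聞く", "きく"), ("聴く", "きく"),
    ("話す", "はなす"), ("離す", "はなす"),
    ("紙", "かみ"), ("髪", "かみ"), ("神", "かみ"),
    ("私立", "しりつ"), ("市立", "しりつ"),
    ("科学", "かがく"), ("化学", "かがく"),
    ("言う", "いう"),  # no homophone in this tiny list
]


def homophone_groups(words):
    """Map each reading to its written forms, keeping only ambiguous readings."""
    groups = defaultdict(list)
    for written, reading in words:
        groups[reading].append(written)
    return {reading: forms for reading, forms in groups.items() if len(forms) > 1}


if __name__ == "__main__":
    for reading, forms in homophone_groups(WORDS).items():
        print(reading, "->", " / ".join(forms))
```

Written in kana only, every word in a group collapses to the same string, so the reader has to fall back on context, which is the trade-off being argued about here.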
An additional benefit to the kanji is that when you get to know enough of them, you can begin to guess the meaning of new words, similar to learning Latin roots and morphemes like "un-" or "re-".
Just wanted to add some color here. The various written forms for native words (和語) like 聞く and 聴く aren't really coming from different words; they were most likely the same word originally. It is only because Japanese adopted Chinese-character writing that it started developing these stylistic distinctions.
Native Japanese people writing these words also don't always distinguish them properly, e.g. using 聞く generally even if the more specialized meaning of 聴く would apply.
This particular kind of distinction is called 異字同訓 (の漢字の使い分け) in Japanese.
I think a better example is 話す versus 離す which I just discovered yesterday also seem to have the same pitch accent!
From my understanding, there are some words that are usually written in Hiragana anyhow (which always confuses me when that word contains a は). I don't think it would hurt to replace usage of Kanji by introduction of visible word boundaries, where necessary.
Dropping kanji will probably make the language even harder, due to the huge number of homophones. It won’t be impossible because clearly Japanese people can talk with each other despite the homophones (pitch accent might help a bit here), but I doubt it would be easier than just learning kanji.
In most cultures, spoken language is usually much simpler and down-to-earth than written language. Misunderstandings can be immediately clarified by asking the other person to repeat or rephrase. Complex structures in spoken language are usually reserved for formal speeches or presentations. It is not surprising that people easily tire of listening to those.
In the case of Japanese, the pitch accent is not recorded in writing, but in spoken language it is probably very helpful for recognizing word structure and boundaries.
> Personally I picked up a Hiragana/Katakana chart image online and put it as my wallpaper, so I could easily decode a word when given the chance (best with 2 screens). You can learn them over a few months like that without trying hard
The first time I visited Japan I set my phone wallpaper to a Katakana chart image and whenever I was waiting around (riding a train/subway, etc) I'd sit and read words on advertising. By the end of the week I no longer needed the chart.
This was back when data was extremely expensive and pocket wifis weren't a thing yet so there were far fewer distractions, I guess that helped...
I'll give a couple of these a shot, but I also really have to recommend the spaced repetition Anki provides for drills. There are a lot of community created decks [0] for different aspects of Japanese. And for some reason, creating your own deck for topics or vocabulary really seems to help with memorization.
I can say that after just over three weeks I'm sounding out words. I think hours practiced are a better measure than weeks, but I don't know how many hours I've spent. (Even then sleep seems vital to processing the material, so you probably can't just count the straight hours you've spent cramming.)
My study has involved using this deck [1]; writing out the hiragana and katakana charts several times; and using Tofugu [2][3] or YouTube for mnemonics, extra explanation, or stroke order.
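For anyone wondering what spaced repetition actually does with a card, here is a very rough sketch using simple Leitner-style boxes. To be clear, this is not Anki's actual scheduler, which as far as I know is more elaborate; the interval list and the `Card`, `review`, and `due_cards` names are assumptions made up for illustration.

```python
from dataclasses import dataclass

# Hypothetical review intervals in days, one per box.
INTERVALS_DAYS = [1, 2, 4, 8, 16]


@dataclass
class Card:
    front: str
    back: str
    box: int = 0      # index into INTERVALS_DAYS
    due_day: int = 0  # day number on which the card is next shown


def review(card, correct, today):
    """Promote the card one box on a correct answer, reset it to box 0 on a miss."""
    if correct:
        card.box = min(card.box + 1, len(INTERVALS_DAYS) - 1)
    else:
        card.box = 0
    card.due_day = today + INTERVALS_DAYS[card.box]
    return card


def due_cards(cards, today):
    """Return the cards whose due day has arrived."""
    return [c for c in cards if c.due_day <= today]


if __name__ == "__main__":
    deck = [Card("あ", "a"), Card("か", "ka")]
    review(deck[0], correct=True, today=0)   # pushed further out
    review(deck[1], correct=False, today=0)  # comes back tomorrow
    print([c.front for c in due_cards(deck, today=1)])
```

The point is just that cards you keep getting right show up less and less often, while the ones you miss come back quickly.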
For anybody that stumbles on this, I'll note the Anki deck I'm using didn't have audio for the hiragana and katakana combination character cards. Audio is extremely useful for these, so I generated it using the Hyper TTS Add-On[0].
Yes -- the dakuten and handakuten marks are very systematic modifiers of the readings of the underlying glyphs. The dakuten more or less voices the leading consonant (ka -> ga, ta -> da), and the handakuten only applies to H*-column kana and changes the sound to a P*. From my perspective, you only really need to learn the base kana and remember the rule for the modifying marks -- you're not learning entire new kana after that.
(There's also combinations like kyu = ki + small-yu, but that also just builds on the kana you already know, and behaves pretty logically.)
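Unicode happens to encode the same regularity: the marked kana decompose into a base kana plus a combining mark (U+3099 for dakuten, U+309A for handakuten), so NFC normalization can rebuild them. A minimal sketch using Python's standard `unicodedata` module; `add_mark`, `combine`, and `SMALL_Y` are names made up for the example.

```python
import unicodedata

DAKUTEN = "\u3099"     # combining voiced sound mark
HANDAKUTEN = "\u309a"  # combining semi-voiced sound mark

# Small y-kana used for combinations such as kyu = ki + small yu.
SMALL_Y = {"や": "ゃ", "ゆ": "ゅ", "よ": "ょ"}


def add_mark(kana, mark):
    """Compose a base kana with a mark, e.g. ka + dakuten -> ga.

    Raises ValueError when no single precomposed character exists,
    i.e. when the base kana does not take that mark.
    """
    composed = unicodedata.normalize("NFC", kana + mark)
    if len(composed) != 1:
        raise ValueError(f"{kana!r} does not take this mark")
    return composed


def combine(base, glide):
    """Build a combination like kyu from ki and yu by shrinking the second kana."""
    return base + SMALL_Y[glide]


if __name__ == "__main__":
    print(add_mark("か", DAKUTEN), add_mark("は", HANDAKUTEN), combine("き", "ゆ"))
```

So the table of "extra" kana really is just the base table plus two rules, both on paper and in the character encoding.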
> Hiragana being far more useful to know starting out, if you had to pick one.
Before visiting Japan, I learned to read in both Hiragana and Katakana, but I didn't really know more than a dozen or so words in Japanese. While visiting Japan, I found Katakana to be a lot more useful, because it's commonly used and often is just English words converted to Japanese letters. I think all my Hiragana reading abilities were completely useless as I couldn't tell what I was reading.
> I think all my Hiragana reading abilities were completely useless as I couldn't tell what I was reading.
This is what many people don't realize when they wish they didn't have to learn Kanji or Hanzi. They make a lot of sense for languages with lots of homophones.
This sounds easy in theory, but so far only one language has succeeded in completely getting rid of the Chinese characters: Vietnamese. This transition was imposed by the French colonial administration however, to more easily spread European-style civilization by breaking their connection to their native culture. A Vietnamese-speaker would have to tell whether there are any issues with homophones nowadays.
In Korea, Hanja are still actively used for disambiguation of homophones in complex texts. Public debate was divided for a long time, and even though Hanja are slowly being phased out, it is a slow process. It's hard to tell for sure, but even in North Korea the process seems incomplete.
I can only tell you why I upvoted it: It looks like a useful learning tool, as different sense experiences are associated with each other. I don't find it poorly built, and I don't know enough Japanese to know what's incomplete.
Even F-Droid has half a dozen apps to learn the writing system.
The problem with learning this is that when you know katakana and hiragana perfectly, you are still in the same spot as you would be today starting to learn Finnish or Euskera, and much worse off than with German or Danish (they have things in common with English).
Agreed - but probably the tight coupling between the hiragana, the romaji and the audio may give people a bit of a casual revelation about Japanese symbols.
(They are not all anki, wanikani, flavor of the month SRS spammers like some of us)
As an N4 student of Japanese, the origin story of Hiragana listed here was quite interesting. However, the overall layout/design of the characters and page leaves much to be desired.
So many of these sites exist. I even made one as a hobby project for learning React. Mine is a bit more involved (learn mode with mnemonics, quiz mode with a timer and different fonts, character filtering and modified characters). Thinking about expanding on it if anyone would like to suggest ideas or fork it. I mostly made it to force me to learn Katakana and Hiragana.
I know absolutely no Japanese letters. And FWIW, I also know absolutely no Chinese/Mandarin (btw are they the same?), Korean etc.
I speak two languages — English and another language you probably haven't heard of that does not use English-looking letters (such as "a", "b" etc).
Anyway, provided I'm told that a given text is Japanese, I can tell if a letter is Kanji, Katakana, or Hiragana.
Which I think is commendable — if I may say so myself — given I didn't even know about the existence of "Kanji", "Katakana", or "Hiragana" until 3 years ago.
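A program can make the same three-way call from code points alone, since the three scripts sit in separate Unicode blocks. A rough sketch, assuming only the main blocks matter (Hiragana U+3040 to U+309F, Katakana U+30A0 to U+30FF, CJK Unified Ideographs U+4E00 to U+9FFF) and ignoring extension blocks and half-width forms; `classify` and `scripts` are made-up names.

```python
def classify(ch):
    """Label a single character by the Unicode block it falls in."""
    cp = ord(ch)
    if 0x3040 <= cp <= 0x309F:
        return "hiragana"
    if 0x30A0 <= cp <= 0x30FF:
        return "katakana"
    if 0x4E00 <= cp <= 0x9FFF:
        return "kanji"
    return "other"


def scripts(text):
    """Classify every character of a string, in order."""
    return [classify(ch) for ch in text]


if __name__ == "__main__":
    for word in ("夕べ", "タベ", "漢字", "ひらがな"):
        print(word, scripts(word))
```

Note that the ranges alone cannot tell Japanese kanji from Chinese characters, which matches the "provided I'm told that a given text is Japanese" caveat.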
Mandarin is a spoken form of Chinese. So Mandarin and Cantonese, for example, share the same character set (Chinese). There's also traditional vs simplified Chinese, technically; they're similar with some differences in how they're written. Japanese also has some differences which can be confusing because a lot of application developers just use the Chinese character sets for kanji: https://heistak.github.io/your-code-displays-japanese-wrong/
One other thing about Chinese is that TV shows and movies usually have hard-coded subtitles while airing because the spoken versions of the language can be very different from one another. Including the text makes it more accessible to a wider audience regardless of what "dialect" is spoken in the show.
> So Mandarin and Cantonese, for example, share the same character set (Chinese).
This isn't really true; there are characters that are exclusive to one or the other, like 冇. The stronger political position of Mandarin means that its idiosyncratic characters are viewed as "real" while the idiosyncratic characters required by other languages aren't, but it's a fundamentally symmetric situation.
> One other thing about Chinese is that TV shows and movies usually have hard-coded subtitles while airing because the spoken versions of the language can be very different from one another.
The written versions of the language are also that different. The subtitles are in Mandarin, which everyone must learn to read.
(How common are hard-coded subtitles in modern Chinese media? They're on the older stuff, but it seems like a lot of modern shows don't bother.)
It's hard to acknowledge the massive cultural imperialism the PRC has engaged in with regards to Mandarin without getting "corrected" by people with an agenda. It's amazing how "fun facts" like "Everyone in China speaks a dialect of Chinese!" (they're vastly different languages, like how German and Italian aren't dialects of European) and "Everyone in China, no matter what dialect they speak, can pass notes!" (because everyone is forced to learn to read Mandarin no matter what language they actually speak) implicitly support that imperialism.
> [Varieties of Min] form the only branch of Chinese that cannot be directly derived from Middle Chinese.
So the divergence between Mandarin and Cantonese [neither belonging to the Min branch] could be dated back maybe 1500 years. The divergence between German and Italian is much, much older than that.
English and Swedish would make a more apt comparison than German and Italian.
This might be true if the speakers were equally distant from each other and had an equal amount of contact with each other.
I've spoken Mandarin for over 20 years, studied French and Spanish for a few years each and also learned a bit of Cantonese and a bit of Taiwanese. In my subjective experience, French and Spanish are by far the closest of any two of those languages. Cantonese and Taiwanese would be the next closest and Mandarin is considerably further from either than they are from each other.
> In my subjective experience, French and Spanish are by far the closest of any two of those languages.
This is precisely what you would expect from the divergence times.
> Cantonese and Taiwanese would be the next closest and Mandarin is considerably further from either than they are from each other.
This isn't; Taiwanese is the outgroup to the more closely related pair of Mandarin/Cantonese.
It's always possible that learning "a bit of Cantonese and a bit of Taiwanese" doesn't give you a good grasp of what's going on in Cantonese and Taiwanese.
> I also know absolutely no Chinese/Mandarin (btw are they the same?)
If you see a bare reference to "Chinese", it will be referring to Mandarin.
There are several families of Chinese languages, with the most prominent being:
- 官话 (literally "mandarin speech"; Mandarin belongs to this family)
- 粤语 ("Yue language"; Cantonese belongs to this one)
- 吴语 ("Wu language"; Shanghainese belongs here)
- 闽南语 ("Southern Min language"; Hokkien belongs here)
It's also possible (and routine) to just label a variety of speech with the name of the place where it is spoken. This gives the names of specific dialects rather than broad families:
- 普通话 ("ordinary speech", this is Mandarin. It is conceived of as being an artificial standard rather than being specifically associated with any particular place.)
- 广东话 ("Cantonese", the speech of 广东, which is a large province)
- 上海话 ("Shanghainese", the speech of 上海, which while admittedly a large city is not as large as a province)
- 福建话 ("Hokkien", the speech of 福建, also a large province)
- 北京话 ("Beijingnese", again just a city. This construction applies at pretty much every scope.)
普通话 is the official name of Mandarin, but there are some important synonyms:
- 汉语 ("Han language" - Han is the ethnic term for the Chinese people. As noted, synonymous with Mandarin.)
- 中文 ("Chinese" - here the term is taken from the name of China, 中国. Also synonymous with Mandarin.)
There are some lookalikes, though. 夕べ is easily confused for katakana タベ (tabe), but is actually kanji 夕 (yuu) + hiragana べ (be). Hence it is more commonly written 昨夜 using kanji only.
In practice though, you'll never confuse them because katakana is only used for non-Japanese words. 昨夜 sounds a bit too formal if you're chatting with friends.
I mean, it depends. Fiction at least will sometimes use katakana for emphasis or to point out a word that sounds different, a bit like using all caps in English, except a little bit more common. I've certainly seen タベナ as an instruction to eat in a manga recently.
Layout breaks for "ko" (and some other sounds) on a desktop browser. I was thinking of building a similar tool, but using hover and probably doing some smart-ish preloading, or adding all sounds to one huge audio file and seeking within it, to make it easier to cache and work offline. A quick switch to Katakana shouldn't be hard to implement either.
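On the quick switch to Katakana: in Unicode the katakana block mirrors the hiragana block at a fixed offset of 0x60 (ぁ is U+3041 and ァ is U+30A1), so the character swap itself is a one-liner. A sketch in Python, not the site's own code; the table and function names are made up, and it only covers U+3041 through U+3096.

```python
# Hiragana U+3041..U+3096 map onto katakana U+30A1..U+30F6 at a fixed offset.
OFFSET = 0x60
HIRA_TO_KATA = {cp: cp + OFFSET for cp in range(0x3041, 0x3097)}
KATA_TO_HIRA = {kata: hira for hira, kata in HIRA_TO_KATA.items()}


def to_katakana(text):
    """Replace every hiragana character with its katakana counterpart."""
    return text.translate(HIRA_TO_KATA)


def to_hiragana(text):
    """Replace every katakana character with its hiragana counterpart."""
    return text.translate(KATA_TO_HIRA)


if __name__ == "__main__":
    print(to_katakana("ひらがな"), to_hiragana("カタカナ"))
```

Anything outside those ranges (kanji, Latin letters, the long-vowel mark) passes through unchanged, so the same audio and layout could be reused for both scripts.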
[0] https://realkana.com/
[1] https://www.csus.edu/indiv/s/sheaa/projects/genki/hira_main....
[2] https://ohelo.github.io/usagi-chan/katakana/