Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Of course most indo-european languages have declensions or at least conjugation. That includes English even if it is overly simplistic there.

CJK languages do not really have that, they don't even have conjugation. They have simple suffixes at best to mark a verb as being interrogative.

Words do exist in all CJK languages and are meaningful. Tokenizing by Hangul radicals doesn't really make sense if you're not going to tokenize by decomposed letter in romance languages too (e.g. hôtel would be h, o, ^, t, e, l).



Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: