Hacker News

There is a utf8mb4_unicode_ci collation, but we had big problems with it. I'm sorry, but this happened two years ago and I can't remember what the exact issues were. Whatever they were, they were bad enough that we ended up using utf8mb4_bin and sanitizing everything.

Edit: Apologies, I wrote utf8mb4_general_ci when I meant to say utf8mb4_bin. I think the issue we had was conflicting rows on unique constraints due to utf8mb4_unicode_ci having different comparison rules than utf8_unicode_ci.
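To make the unique-constraint point concrete, here is a minimal Python sketch. It does not reproduce MySQL's actual collation tables. `bin_key` and `folded_key` are hypothetical stand-ins for a `_bin` collation and a case- and accent-insensitive collation, and `find_conflicts` is an illustrative helper that mimics a unique index rejecting values that compare equal.

```python
import unicodedata


def bin_key(s: str) -> bytes:
    # Assumption: a _bin collation compares the encoded bytes as-is.
    return s.encode("utf-8")


def folded_key(s: str) -> str:
    # Assumption: a rough case- and accent-insensitive comparison.
    # Decompose, drop combining marks, then case-fold.
    decomposed = unicodedata.normalize("NFKD", s)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def find_conflicts(values, key):
    # Return (existing, incoming) pairs that a unique index would reject
    # when equality is decided by the given key function.
    seen = {}
    conflicts = []
    for value in values:
        k = key(value)
        if k in seen:
            conflicts.append((seen[k], value))
        else:
            seen[k] = value
    return conflicts


rows = ["resume", "Resume", "r\u00e9sum\u00e9", "stra\u00dfe"]
bin_conflicts = find_conflicts(rows, bin_key)
folded_conflicts = find_conflicts(rows, folded_key)
```

Under the byte-wise key all four rows are distinct. Under the folded key, "Resume" and "résumé" both collide with "resume". Changing a column's collation on existing data can surface the same kind of duplicates.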

Don't want to be confused by utf8 vs utf8mb4 charsets or deal with the subtle differences between utf8_general_ci, utf8_unicode_ci, utf8mb4_general_ci, utf8mb4_unicode_ci, and utf8mb4_bin? Switch to PostgreSQL. You will not regret it.



That sounds very much like FUD. There is nothing inherently magical about utf8mb4_unicode_ci.


A vague problem report like this is really not cool. At a minimum, your readers deserve an updated article.

Say that there is a case-insensitive version. If you want, add that you had some problem but don't know whether the collation had anything to do with it.

The problem you had was probably because of the difference between utf8mb4_general_ci and utf8mb4_unicode_ci.
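One documented difference between the two is that the `general_ci` collations compare character by character and do not support expansions, while the `unicode_ci` collations do. The MySQL manual's example is that `ß` equals `s` under `general_ci` but `ss` under `unicode_ci`. A toy Python sketch of that single rule follows. The tables `GENERAL_FOLD` and `UNICODE_EXPAND` and the two key functions are hypothetical, not MySQL's real weight tables.

```python
# Hypothetical one-to-one fold, in the spirit of a general_ci collation.
GENERAL_FOLD = {"\u00df": "s"}
# Hypothetical expansion, in the spirit of a unicode_ci collation.
UNICODE_EXPAND = {"\u00df": "ss"}


def general_key(s: str) -> str:
    # Lowercase, then map each character to exactly one character.
    return "".join(GENERAL_FOLD.get(ch, ch) for ch in s.lower())


def unicode_key(s: str) -> str:
    # Lowercase, then allow one character to expand to several.
    return "".join(UNICODE_EXPAND.get(ch, ch) for ch in s.lower())


word = "Ma\u00dfe"
general_matches_mase = general_key(word) == general_key("Mase")
general_matches_masse = general_key(word) == general_key("Masse")
unicode_matches_mase = unicode_key(word) == unicode_key("Mase")
unicode_matches_masse = unicode_key(word) == unicode_key("Masse")
```

In this sketch "Maße" collides with "Mase" under the general-style key and with "Masse" under the unicode-style key, so the set of rows that violate a unique index depends on which collation is in effect.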



