With Unicode being a moving target, I'm not sure any language will truly "fix it all the way": building grapheme-cluster breaking/counting into the language just means the language drifts in and out of "correctness" as the rules, or the definitions of new or existing characters, change. The article covers this, but when you "clean up" everything so that the language hides the complexity away, people can still get bitten, say by not realizing that a system/library/language update might suddenly change the "length" of a stored string somewhere. Or you can simply run into issues because developers aren't totally familiar with what the language considers a "character," since there's essentially no agreement across languages on that front (Perl 6 itself lists grapheme-cluster-based counting as a potential "trap" and notes that the behavior differs when running on the JVM). I don't think a "get out of jail free card" for Unicode handling is really possible.
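To make the "no agreement on length" point concrete, here's a minimal sketch using only the Python standard library (the specific strings are just illustrative examples, not taken from the article):

```python
import unicodedata

# "é" spelled as a base letter plus a combining acute accent
s = "e\u0301"
print(len(s))                                # 2 code points
print(len(unicodedata.normalize("NFC", s))) # 1 code point after composition
print(len(s.encode("utf-8")))               # 3 bytes in UTF-8

# Family emoji: four emoji code points joined by zero-width joiners
family = "\U0001F469\u200D\U0001F469\u200D\U0001F467\u200D\U0001F466"
print(len(family))                           # 7 code points, but a single grapheme cluster
```

So the "length" of the same text is 1, 2, or 3 for the accented letter depending on what you count, and a grapheme-aware language like Perl 6 would call the family emoji one character while Python's len() reports seven.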
The codepoint-based string representation used by Python 3 may be "the worst" (I'm not totally sure I agree), but it's fine in practice. The article's main beef is with the somewhat complex internal storage and the way it obscures the underlying lengths.
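You can actually see that internal complexity from Python itself; a quick sketch, assuming CPython's PEP 393 flexible storage (the example strings are arbitrary):

```python
import sys

# CPython (PEP 393) picks 1, 2, or 4 bytes per code point for the internal
# buffer based on the widest code point present, so strings with the same
# len() can have very different memory footprints.
for s in ["abcd",            # all ASCII        -> 1 byte per code point
          "abc\u20ac",       # contains U+20AC  -> 2 bytes per code point
          "abc\U0001F600"]:  # contains U+1F600 -> 4 bytes per code point
    print(repr(s), len(s), sys.getsizeof(s))
```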