If everyone chooses whatever encoding they like, then the charset being used has to be encoded somewhere. The problem is, there are lots of places where the charset isn't encoded (such as your filesystem). It's easy to miss that this is a problem, because almost all charsets are a strict superset of ASCII (UTF-7 and UTF-16 are the only exceptions among the top 99.99% of charsets by usage), so it's only when you hit your first non-ASCII characters that problems emerge.
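To make that concrete, here's a minimal Python sketch (the string is just an example): the bytes themselves don't record which charset produced them, and an all-ASCII string would decode identically under both guesses, so the wrong guess only surfaces at the first non-ASCII character.

```
# The same bytes under two charset guesses. Nothing in the byte
# stream says which guess is right.
data = "café".encode("utf-8")    # b'caf\xc3\xa9'

print(data.decode("utf-8"))      # café   (right guess)
print(data.decode("latin-1"))    # cafÃ©  (wrong guess: mojibake, no error raised)
```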
Unicode has its share of issues, but at this point, Unicode is the standard for dealing with text, and all i18n-aware code is going to be built on Unicode internally. The only safe way to handle text that has even the remotest chance of needing i18n is to work with charsets that support all of Unicode, and given its compatibility with ASCII, UTF-8 is the most reasonable one to pick.
If you want to insist on using KOI-8, or ISO-2022-JP, or ISO-8859-1, you're implicitly saying "fuck you" to 2/3 of the world's population since you can't support tasks as basic as "let me write my name" for them.
> If everyone chooses whatever encoding they like, then the charset being used has to be encoded somewhere.
This is gonna be the case for the foreseeable future, as you point out. Settling on one encoding only fixes this, like, 100 years from now. I'd prefer to build encoding-aware software that solves this problem now.
> given its compatibility with ASCII, UTF-8 is the most reasonable one to pick
This only makes sense if your system is ASCII in the first place, and if you can't build encoding-aware software. I think we can both agree that's essentially legacy ASCII software, so you don't get to choose anything anyway. And any system that interacts with it should be encoding-aware and still validate the encoding anyway, as though it might be Big5 or whatever. Assuming ASCII/UTF-8 is a bad idea, always and forever.
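For what it's worth, "validate the encoding anyway" can be cheap. A hypothetical sketch in Python (the read_text helper and its signature are mine, not from any particular library): decoding strictly turns a bad charset assumption into a loud error instead of silent garbage.

```
def read_text(raw: bytes, encoding: str = "utf-8") -> str:
    # Strict decoding (Python's default) raises UnicodeDecodeError on
    # malformed input rather than passing garbage downstream.
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError as err:
        raise ValueError(f"input is not valid {encoding}: {err}") from None
```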
> If you want to insist on using KOI-8, or ISO-2022-JP, or ISO-8859-1, you're implicitly saying "fuck you" to 2/3 of the world's population since you can't support tasks as basic as "let me write my name" for them.
I'm not obligated to write software for every possible user at every point in time. It's perfectly acceptable for me to say, "I'm writing this program for my 1 friend who speaks Spanish" and have that be my requirements. But if I were to write software that had a hope of being broadly useful, UTF-8 everywhere doesn't get me there. I'd have to build it to be encoding-aware, and let my users configure the encoding(s) it uses.
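And "let my users configure the encoding(s)" can be as simple as a flag passed through to the decoder. A hypothetical Python sketch (the --encoding flag and its default are illustrative, not from any real tool):

```
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("file")
parser.add_argument("--encoding", default="utf-8",
                    help="charset of the input file (e.g. utf-8, cp1252, big5)")
args = parser.parse_args()

# open() decodes with whatever charset the user declared, and raises
# UnicodeDecodeError if the file doesn't match that declaration.
with open(args.file, encoding=args.encoding) as f:
    text = f.read()
```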
> But if I were to write software that had a hope of being broadly useful, UTF-8 everywhere doesn't get me there.
Actually, it does.
Right now, in 2020, if you're writing a new programming language, you can insist that the input files must be valid UTF-8 or it's a compiler error. If you're writing a localization tool, you can insist that the localization files be valid UTF-8 or it's an error. Even if you're writing a compiler for an existing language (e.g., C), it would not be unreasonable to say that the source file must be valid UTF-8 or it's an error--and let those not using UTF-8 right now handle it by converting their source code to use UTF-8. And this has been the case for a decade or so.
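As a sketch of that stance, assuming a hypothetical compiler front end (load_source is my name, not any real tool's): strict UTF-8 decoding makes a non-UTF-8 source file a hard error and pushes the one-time conversion onto whoever still has legacy-encoded files.

```
from pathlib import Path

def load_source(path: str) -> str:
    # "Valid UTF-8 or it's a compiler error": strict decoding fails
    # hard on any byte sequence that isn't well-formed UTF-8.
    try:
        return Path(path).read_text(encoding="utf-8")
    except UnicodeDecodeError as err:
        raise SystemExit(f"{path}: source is not valid UTF-8 ({err})")
```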
That's the point of UTF-8 everywhere: if you don't have legacy concerns [someone actively using a non-ASCII, non-UTF-8 charset that you have to support], force UTF-8 and be done with it. And if you do have legacy concerns, try to push people to using UTF-8 anyways (e.g., default to UTF-8).
I can't insist that other systems send my program UTF-8, or that my users' OSes use UTF-8 for filenames and file contents, or that data in databases uses UTF-8, or that the UTF-8 I might get is always valid. The end result of all these things you're raising is "you can't assume, you always have to check, so UTF-8 everywhere buys you nothing". Even if we did somehow get there, you'd still have to validate it.