I don't think that's a pain. It makes explicit what should be explicit. And unlike in Ruby, the decoded string doesn't carry an encoding of its own, so it can't turn up in an unexpected format: it's always UTF-16. One can argue about whether UTF-16 is the best choice, but at least it's always that, and always Unicode. No surprises.
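
To make that concrete, here is a minimal sketch, assuming Java as the platform (the comment above doesn't name one, and the class and method names are made up for the example). The encoding is named once, at the point of decoding, and what comes out is a plain String of UTF-16 code units regardless of how the bytes were encoded:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical example class; not taken from the discussion above.
class DecodeExample {
    // The caller has to say which encoding the bytes are in.
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        byte[] utf8Bytes = {(byte) 0xC3, (byte) 0xA9}; // U+00E9 encoded as UTF-8
        byte[] latin1Bytes = {(byte) 0xE9};            // U+00E9 encoded as ISO-8859-1

        String fromUtf8 = decode(utf8Bytes, StandardCharsets.UTF_8);
        String fromLatin1 = decode(latin1Bytes, StandardCharsets.ISO_8859_1);

        // Both are the same single UTF-16 code unit; the source encoding is gone.
        System.out.println(fromUtf8.length());           // 1
        System.out.println((int) fromUtf8.charAt(0));    // 233
        System.out.println(fromUtf8.equals(fromLatin1)); // true
    }
}
```

Once decoded, the two strings are indistinguishable, so nothing downstream has to ask which encoding a string "is in".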