I did once have a file that had UTF-8, Windows-1252, MARC-8, and VT100 escape codes (really) all mixed up in it. I think the data had gone through multiple migrations between software systems in its past.
I had to write my own "clean this as well as possible" thing, and it did a good enough job.
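For anyone facing something similar, here is a rough sketch of the general idea in Python. It is not the tool I wrote, just an illustration under some assumptions: it strips VT100-style CSI escape sequences, decodes valid UTF-8 as UTF-8, and treats any byte that is not valid UTF-8 as Windows-1252. The names `clean_mixed` and `cp1252_fallback` are made up for this example, and MARC-8 is not handled at all, since the Python standard library has no codec for it.

```python
import codecs
import re

# VT100/ANSI CSI sequences: ESC [ parameter bytes, intermediate bytes, final byte.
VT100_ESCAPE = re.compile(rb"\x1b\[[0-?]*[ -/]*[@-~]")


def _cp1252_fallback(exc):
    """Error handler: reinterpret bytes that are not valid UTF-8 as Windows-1252."""
    if not isinstance(exc, UnicodeDecodeError):
        raise exc
    bad = exc.object[exc.start:exc.end]
    # A few byte values are undefined in Windows-1252; those become U+FFFD.
    return bad.decode("cp1252", errors="replace"), exc.end


# Hypothetical handler name, registered once at import time.
codecs.register_error("cp1252_fallback", _cp1252_fallback)


def clean_mixed(raw: bytes) -> str:
    """Best-effort cleanup of bytes mixing UTF-8, Windows-1252, and VT100 codes."""
    without_escapes = VT100_ESCAPE.sub(b"", raw)
    return without_escapes.decode("utf-8", errors="cp1252_fallback")


if __name__ == "__main__":
    sample = b"caf\xc3\xa9 \x1b[1mna\xefve\x1b[0m \x93hi\x94"
    print(clean_mixed(sample))
```

This is a heuristic, so it can guess wrong: a run of Windows-1252 bytes that happens to form valid UTF-8 will be decoded as UTF-8. For real data you would want to spot-check the output.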