
Someone must be paying your invoices, right? Charge them extra for the extra work needed to sort things out.



??? The work of running one of the many encoding-guessing tools (e.g. https://github.com/Joungkyun/libchardet) and then getting it right for almost every document?

You just look bad if you can't do what every other piece of software is able to do. Charging for it takes that to another level.


Then you don't have to worry about it since you won't get the work in the future? Someone else, with this presumably correct software, will always be able to do it for less, faster, and at a higher quality.

That's how business works...

If such a business competitor doesn't exist, then yes charge extra, and actually do the work correctly.


Am I missing something here? The work is ingesting documents from uncontrolled sources that might not all be UTF-8 and handling them correctly. Using an encoding-guessing tool is the means to do that. In practice, since there are only a few widely used encodings and they're not terribly ambiguous, everything just works and users happily use the software.
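To make the "only a few widely used encodings" point concrete, here is a minimal stdlib-only sketch of a fallback chain. It is not what libchardet does (that library uses statistical detection); the function name `decode_guess` and the candidate order (BOM, then strict UTF-8, then cp1252, then latin-1) are assumptions chosen for illustration.

```python
import codecs
from typing import Tuple

# Byte-order marks, longest first so a UTF-32 LE BOM is not read as UTF-16 LE.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def decode_guess(data: bytes) -> Tuple[str, str]:
    """Return (text, encoding_name) using a simple fallback chain.

    Hypothetical helper: the order of candidates is an assumption,
    not a standard.
    """
    # 1. Trust a BOM if one is present and the payload decodes cleanly.
    for bom, name in _BOMS:
        if data.startswith(bom):
            try:
                return data.decode(name), name
            except UnicodeDecodeError:
                break  # BOM-like prefix but invalid payload; keep guessing

    # 2. Strict UTF-8 rarely accepts text written in a legacy 8-bit encoding.
    try:
        return data.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        pass

    # 3. Windows-1252 covers most Western European legacy files.
    try:
        return data.decode("cp1252"), "cp1252"
    except UnicodeDecodeError:
        # 4. latin-1 maps every byte to a code point, so it never fails.
        return data.decode("latin-1"), "latin-1"
```

A chain like this only handles the easy cases; for non-Western legacy encodings a real detector is still the better choice.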

This isn't some theoretical thing. We do this at $dayjob right now, guessing not only the encoding but the file type as well, so that we can make sense of whatever garbage our users upload: everything from malformed CSV exports from Excel to PDFs that are just JPEGs of scanned documents. It works, and it works well.

And of course it does: the files our users hand us work on their machines. They can open them and look at them in whatever local software they produced them with, so there's no excuse for us to be unable to do the same.
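For the file-type half, a rough sketch of the usual approach is to look at the leading "magic" bytes instead of trusting the file extension. The helper name `sniff_type` and the short signature table are assumptions for illustration, not what any particular product uses.

```python
# Well-known leading byte signatures; a real system would carry many more.
_MAGIC = [
    (b"%PDF-", "pdf"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"PK\x03\x04", "zip"),  # also the container for .xlsx and .docx
]


def sniff_type(data: bytes) -> str:
    """Guess a coarse file type from the first bytes (hypothetical helper)."""
    for magic, name in _MAGIC:
        if data.startswith(magic):
            return name
    return "unknown"
```

Anything that comes back "unknown" can then go through text decoding and, say, a CSV parser as the next guess.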


Then you don't need to worry about it either way?



