
> The core algorithm behind modern generative AI was developed specifically for translation

Indeed! And yet, generative AI systems wire it up as a lossy compression / predictive text model, which discreetly confabulates what it doesn't understand. Why not use a transformer-based model architecture actually designed for translation? I'd much rather the model make a best guess (which might be useful, or might be nonsense, but will at least be conspicuous nonsense) than substitute a different (less obviously nonsense) meaning entirely.

Bonus: purpose-built translation models are much smaller, can tractably be run on a CPU, and (since they require less data) can be built from corpora whose authors consented to this use. There's no compelling reason to throw an LLM at the problem, introducing multiple ethical issues and generally pissing off your audience, for a worse result.





> Why not use a transformer-based model architecture actually designed for translation?

Because translation requires a thorough understanding of the source material, essentially at or near the level of AGI. Long-range context matters, short-range context matters; idioms, shorthand, speaker identity, and more all matter.

Current LLMs do great at this; the older translation algorithms based on "mere" deep learning and/or fancy heuristics fail spectacularly in even the most trivial scenarios, except when translating between closely related languages, such as most (but not all) European ones. Dutch to English: great! Chinese to English: unusable!

I've been testing modern LLMs on various translation tasks, and they're amazing at it.[1] I've never had any issues with hallucinations or whatever. If anything, I've seen LLMs outperform human translators in several common scenarios!

Don't assume humans don't make mistakes, or that "organic mistakes" are somehow superior or preferred.

[1] If you can't read both the source and destination languages, you can gain some confidence by doing multiple runs with multiple frontier models and then having them cross-check each other. Similarly, you can round-trip from a language you do understand, or round-trip back to the source language and have an LLM (not necessarily the same one!) do the checking for you.
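The round-trip check above can be sketched in a few lines. This is a toy illustration, not anyone's actual pipeline: `translate` here is a hypothetical stand-in (a tiny dictionary stub) for whatever model or API call you'd really use, and the word-overlap score is a crude placeholder for the "have an LLM do the checking" step.

```python
# Round-trip translation check: src -> dst -> src, then compare.
# `translate` is a hypothetical stub; swap in a real model call.

TOY_DICTIONARY = {
    ("en", "nl"): {"hello": "hallo", "world": "wereld"},
    ("nl", "en"): {"hallo": "hello", "wereld": "world"},
}

def translate(text: str, src: str, dst: str) -> str:
    """Stand-in for a real translation call (LLM or purpose-built model)."""
    table = TOY_DICTIONARY[(src, dst)]
    return " ".join(table.get(word, word) for word in text.split())

def round_trip(text: str, src: str, dst: str) -> str:
    """Translate src -> dst, then back to src."""
    forward = translate(text, src, dst)
    return translate(forward, dst, src)

def word_overlap(a: str, b: str) -> float:
    """Crude similarity proxy; a real check would use another model."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / max(len(wa | wb), 1)

original = "hello world"
restored = round_trip(original, "en", "nl")
score = word_overlap(original, restored)  # 1.0 means a lossless round trip
```

A low score doesn't prove the forward translation is wrong (legitimate paraphrase also lowers overlap), which is why the comment suggests using a second model as the judge rather than a mechanical comparison.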



