
Standard large transformers trained on multilingual corpora will generally perform next-word prediction in language A using information that appeared only in language-B training data, which shows they have implicitly learned some capability for translation and/or cross-lingual representation.
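You can probe this cheaply with a masked-language-model fill-in across languages. A minimal sketch, assuming the HuggingFace `transformers` library and the multilingual `xlm-roberta-base` checkpoint (both illustrative choices, not named in the comment above):

```python
# Hypothetical cross-lingual probe: ask a multilingual model to fill the
# same factual blank in two languages. Model choice (xlm-roberta-base)
# is an assumption for illustration.
from transformers import pipeline

fill = pipeline("fill-mask", model="xlm-roberta-base")

for prompt in [
    "The capital of France is <mask>.",       # English
    "La capitale de la France est <mask>.",   # French
]:
    predictions = fill(prompt, top_k=3)
    print(prompt, "->", [p["token_str"] for p in predictions])
```

If the top completions are sensible in both languages even when the underlying fact is rare in one of them, that is the kind of cross-lingual transfer the comment describes.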

