
> For instance, why not use whole words as tokens?

Word-only tokenizers are what people did in the RNN/LSTM days. They offer no functional improvement over subword schemes like BPE or even WordPiece/SentencePiece, and they result in worse quality since you can't use meaningful semantic hints such as punctuation.
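
To make the failure mode concrete, here's a minimal sketch (not from the thread; the function name and vocabulary are illustrative) of a naive whole-word tokenizer, where punctuation fuses with words and anything outside the fixed vocabulary collapses to an unknown token:

    def word_tokenize(text, vocab):
        # Split on whitespace only; any word not in the fixed
        # vocabulary collapses to a single <unk> token (the OOV problem).
        return [w if w in vocab else "<unk>" for w in text.split()]

    vocab = {"the", "cat", "sat"}
    print(word_tokenize("the cat sat, purring.", vocab))
    # ['the', 'cat', '<unk>', '<unk>']  -- "sat," and "purring." are lost

A subword scheme like BPE would instead split "sat," into known pieces ("sat" + ","), so the punctuation survives as its own token.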



You can encode semantic hints in the layers instead. Admittedly, this is more expensive, which kind of runs counter to the words-as-tokens idea.
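
One way to read "encode the hints in the layers" (my interpretation, not the commenter's code; all names here are hypothetical) is to attach punctuation as an extra learned embedding summed into each word embedding, along these lines in PyTorch:

    import torch
    import torch.nn as nn

    class WordWithHints(nn.Module):
        # Hypothetical: word embeddings plus a small learned embedding
        # for a punctuation "hint" (e.g. 0=none, 1=comma, 2=period).
        def __init__(self, vocab_size, dim, num_hints=4):
            super().__init__()
            self.words = nn.Embedding(vocab_size, dim)
            self.hints = nn.Embedding(num_hints, dim)

        def forward(self, word_ids, hint_ids):
            # Summing injects the punctuation signal without
            # spending sequence positions (tokens) on it.
            return self.words(word_ids) + self.hints(hint_ids)

    model = WordWithHints(vocab_size=10_000, dim=64)
    words = torch.tensor([[1, 2, 3]])
    hints = torch.tensor([[0, 0, 2]])   # last word ends a sentence
    out = model(words, hints)           # shape: (1, 3, 64)

The extra embedding table and the preprocessing needed to produce hint_ids are part of the added expense mentioned above.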



