
> For instance, why not use whole words as tokens?

Word-only tokenizers are what people did in the RNN/LSTM days. They offer no functional improvement over subword schemes like BPE or even WordPiece/SentencePiece, and they result in worse quality since you can't use meaningful semantic hints such as punctuation.
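
To make the failure mode concrete, here's a minimal sketch (not from the thread; the function name and vocabulary are illustrative) of a naive whole-word tokenizer, where punctuation fuses with words and anything outside the fixed vocabulary collapses to an unknown token:

    def word_tokenize(text, vocab):
        # Split on whitespace only; any word not in the fixed
        # vocabulary collapses to a single <unk> token (the OOV problem).
        return [w if w in vocab else "<unk>" for w in text.split()]

    vocab = {"the", "cat", "sat"}
    print(word_tokenize("the cat sat, purring.", vocab))
    # ['the', 'cat', '<unk>', '<unk>']  -- "sat," and "purring." are lost

A subword scheme like BPE would instead split "sat," into known pieces ("sat" + ","), so the punctuation survives as its own token.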



You can encode semantic hints in the layers instead. Admittedly, this is more expensive, which kind of runs counter to the words-as-tokens idea.
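
One way to read "encode the hints in the layers" (my interpretation, not the commenter's code; all names here are hypothetical) is to attach punctuation as an extra learned embedding summed into each word embedding, along these lines in PyTorch:

    import torch
    import torch.nn as nn

    class WordWithHints(nn.Module):
        # Hypothetical: word embeddings plus a small learned embedding
        # for a punctuation "hint" (e.g. 0=none, 1=comma, 2=period).
        def __init__(self, vocab_size, dim, num_hints=4):
            super().__init__()
            self.words = nn.Embedding(vocab_size, dim)
            self.hints = nn.Embedding(num_hints, dim)

        def forward(self, word_ids, hint_ids):
            # Summing injects the punctuation signal without
            # spending sequence positions (tokens) on it.
            return self.words(word_ids) + self.hints(hint_ids)

    model = WordWithHints(vocab_size=10_000, dim=64)
    words = torch.tensor([[1, 2, 3]])
    hints = torch.tensor([[0, 0, 2]])   # last word ends a sentence
    out = model(words, hints)           # shape: (1, 3, 64)

The extra embedding table and the preprocessing needed to produce hint_ids are part of the added expense mentioned above.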



