
LLMs can count characters, but they need to dedicate a lot of tokens to the task. That is, they need a lot of tokens describing the task of counting, and in my experience that allows them to accurately count.


Source? LLMs have no “hidden tokens” they dedicate.

Or you mean — if the tokenizer was trained differently…


Not hidden tokens, actual tokens. Ask an LLM to guess the letter count, say, 20 times, and often it will converge on the correct count. I suppose all those guesses provide enough "resolution" (for lack of a better term) that it can count the letters.
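The "guess 20 times and converge" idea is basically majority voting over repeated samples. A minimal sketch, assuming a hypothetical `ask_llm` function that returns one integer guess per call (the function and the guess values below are made up for illustration, not real model output):

```python
import random
from collections import Counter

def ask_llm(prompt):
    # Hypothetical stand-in for a sampled LLM call: returns a noisy
    # integer guess. A real model would be queried with temperature > 0.
    return random.choice([3, 3, 3, 3, 2])  # mostly right, sometimes off by one

def majority_count(prompt, samples=20):
    # Sample the model repeatedly and keep the most common answer.
    guesses = [ask_llm(prompt) for _ in range(samples)]
    return Counter(guesses).most_common(1)[0][0]

print(majority_count("How many r's are in 'strawberry'?"))
```

With enough samples, the mode of the guesses is far more stable than any single answer, which matches the "convergence" being described.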


> often it will converge on the correct count

That's a pretty low bar for something like counting words.


That reminds me of something I've wondered about for months: can you improve an LLM's performance by including a large number of spaces at the end of your prompt?

Would the LLM "recognize" that these spaces are essentially a blank slate and use them to "store" extra semantic information and stuff?


but then it will either overfit or you need to train it on 20 times the amount of data ...


I'm talking about using an LLM at inference time, which doesn't involve training and thus no overfitting.


For an LLM to exhibit a verbal relationship between counting and tokens, you have to train it on that. Maybe you mean something like a plugin or extension, but that's something else and has nothing to do with LLMs specifically.



