So people that can't read or write have no language? If you don't know an alphabet and its rules, you won't know how many letters are in words. Does that make you unable to model language accurately?
So first off, people who _can't_ read or write have some kind of disability (blindness, a developmental condition, etc.). That's not a reasonable comparison for LLMs/AI (especially since text is the main modality of an LLM).
I'm assuming you meant to ask about people who haven't _learned_ to read or write, but would otherwise be capable.
Is your argument, then, that a person who hasn't learned to read or write is able to model language as accurately as one who has?
Wouldn't you say that someone who has read a whole ton of books would maybe be a bit better at language modelling?
Also, perhaps most importantly: GPT (and pretty much any LLM I've talked to) does know the alphabet and its rules. It knows. Ask it to recite the alphabet. Ask it about any kind of grammatical or lexical rules. It knows all of it. It can also chop a word up from tokens into letters to spell it correctly; it knows those rules too. Now ask it about Chinese and Japanese characters; ask it about any of the rules of those writing systems and languages. It knows all the rules.
This, to me, shows that the problem is mainly an inability to reason and put things together logically, not so much that it's trained on something that doesn't _quite_ look like letters as we know them. Sure, it might be slightly harder to do, but it's not actually hard, especially not compared to the other things we expect LLMs to be good at. And certainly not compared to the other things we expect people to be good at if they are considered "language experts".
If (smart/dedicated) humans can readily learn the Chinese, Japanese, Latin and Russian writing systems, then why can't LLMs learn how tokens relate to the Latin alphabet?
Remember that tokens were specifically designed to be easier and more regular to parse (encode/decode) than the encodings used in human languages ...
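To make that concrete, here's a minimal sketch (assuming the `tiktoken` package and its `cl100k_base` encoding are available; any BPE tokenizer would illustrate the same point) showing that the mapping from tokens back to bytes/letters is a fixed, mechanical lookup:

```python
# Minimal sketch: token ids map to byte sequences deterministically.
# Assumes `tiktoken` is installed (pip install tiktoken).
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

word = "strawberry"
tokens = enc.encode(word)
print(tokens)  # a short list of token ids

# Each token id corresponds to one fixed byte sequence, so
# "spelling out" a word from tokens is just a table lookup.
for t in tokens:
    print(t, enc.decode_single_token_bytes(t))

# Round-tripping is lossless: decoding returns the exact letters.
assert enc.decode(tokens) == word
```

The exact token ids depend on the tokenizer, but the point stands: the token-to-letter relationship is small and regular, far more regular than the spelling rules of any human language.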
But actually, you can see an intense enough source of (monochromatic) near-UV light; our lenses only filter out the majority of it.
And if you did, your brain would hallucinate it as purplish-bluish white, because that's the closest color to those inputs based on what your neural network (your brain) was trained on. It encounters something uncommon, so it guesses and presents the guess as fact.
From this, we can conclude either that you (and indeed all humans) are not actually intelligent, or, alternatively, that intelligence and cognition are complicated, and that you can't conclude intelligence is absent just because someone behaves, one time, in a way your experience of intelligence hasn't trained you to expect.