
I wonder if it's because we mean different things by generalization.

For text, "generalization" is still "generate text that conforms to all the usual rules of the language". For images of 13-hour clock faces, we're explicitly asking the LLM to violate the inferred rules of the universe.

I think a good analogy would be asking an LLM to write in English, except the word "the" now means "purple". They will struggle to adhere to this prompt in a conversation.





That's true, but I think humans would stumble a lot too (try reading old printed text from the 18fh cenfury where fhey used "f" insfead of t in prinf, if's a real frick fo gef frough).

However humans are pretty adept at discerning images, even ones outside the norm. I really think there is some kind of architectural block hampering transformers' ability to truly "see" images. For instance, if you show any model a picture of a dog with 5 legs (a fifth leg photoshopped onto its belly), they all say there are only 4 legs, and will argue with you about it. Hell, GPT-5 even wrote a leg detection script in Python (impressive) which detected the 5 legs, and then it said the script was bugged and modified the parameters until one of the legs wasn't detected, lol.
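
For the curious, here's a minimal sketch (my own assumption, not GPT-5's actual script) of what a naive contour-based leg counter might look like in Python with OpenCV, and how nudging a single area threshold is exactly the kind of tweak that makes a fifth leg "disappear". The image path dog.jpg and the min_leg_area parameter are hypothetical.

    # Naive "leg counter": binarize the lower half of the image and count
    # the blobs that are big enough to be legs. Purely illustrative.
    import cv2

    def count_legs(image_path, min_leg_area=500):
        img = cv2.imread(image_path)          # "dog.jpg" is a hypothetical file
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        # Only look at the lower half of the frame, where legs are expected.
        lower = gray[gray.shape[0] // 2:, :]
        # Otsu thresholding to separate dark legs from a light background.
        _, mask = cv2.threshold(lower, 0, 255,
                                cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        # Every blob larger than min_leg_area counts as a "leg"; raising this
        # threshold quietly drops the smallest (fifth) leg from the count.
        legs = [c for c in contours if cv2.contourArea(c) > min_leg_area]
        return len(legs)

    if __name__ == "__main__":
        print(count_legs("dog.jpg", min_leg_area=500))   # might report 5
        print(count_legs("dog.jpg", min_leg_area=2000))  # might report 4
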


An "f" never replaced a "t".

You probably mean the "long s" that looks like an "f".




