It is evident that it is not recalling the sum, because all combinations of integer addition were almost certainly not in the training data. Storing the answers to every sum of integers up to the size GPT-4 can handle would take more parameters than the model has.
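A back-of-envelope sketch of the argument (the parameter count below is an assumed, commonly cited estimate, not an official figure):

```python
# Rough illustration of why memorizing sums is infeasible.
# ASSUMPTION: ~1.8 trillion parameters for GPT-4 (a rumored estimate).
assumed_params = 1.8e12

def addition_pairs(n_digits: int) -> int:
    """Count ordered pairs (a, b) of non-negative integers below 10**n_digits."""
    n = 10 ** n_digits
    return n * n

# Even restricting to 10-digit operands, a lookup table of sums would
# need 10**20 entries -- many orders of magnitude more than the
# assumed parameter count.
pairs = addition_pairs(10)
print(pairs)                   # 100000000000000000000
print(pairs > assumed_params)  # True
```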
That addition is a small capability but you only need a single counterexample to disprove a theory.
> That addition is a small capability but you only need a single counterexample to disprove a theory
No, that's not how this works :)
You can hardcode an exception to pattern recognition for specific cases; it doesn't cease to be a pattern recognizer just because exceptions are sprinkled in.
The 'theory' here is that a pattern recognizer can lead to AGI. That is the theory. Someone saying 'show me proof or else I say a pattern recognizer is just a pattern recognizer' is not stating a theory, and so there is nothing to disprove, or prove.
It's not hardcoded, reissbaker has addressed this point.
I think you are misinterpreting what the argument is.
The argument being made is that LLMs are mere 'stochastic parrots' and therefore cannot lead to AGI. The analogy to Russell's teapot is that someone is claiming Russell's teapot is not there because china cannot exist in the vacuum of space. You can disprove that with a single counterexample. That does not mean the teapot is there, but it also doesn't mean it isn't.
It is hard to prove that something is thinking, and it is also very difficult to prove that something is not thinking. Almost all arguments against AGI take the form "X cannot produce AGI because Y." Those are disprovable, because you can disprove Y.
I don't think anyone is claiming to have a proof that an LLM will produce AGI, just that it might. If they actually build one, that too counts as a counterexample to anybody saying they can't do it.
GPT-4o doesn't have hardcoded math exceptions. If you would like something verifiable, since we don't have the source code to GPT-4o, consider that Qwen 2.5 72b can also add large integers, and we do have the source code and weights to run it... And it's just a neural net. There's no secret "hardcoded exception to pattern recognition" in there that parses out numbers and adds them. The neural net simply learned to do it.
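To see why learning addition doesn't require a giant table, note that long addition is a constant-size procedure. The sketch below is a toy illustration of that point, not a claim about Qwen 2.5 72b's internals: a digit-wise rule with a carry handles arbitrarily large integers.

```python
# Toy sketch: long addition as a tiny digit-wise procedure with a carry.
# The only "knowledge" required is the 10x10 single-digit table plus a
# carry bit -- a constant-size rule, not a lookup over all possible sums.
def add_digitwise(a: str, b: str) -> str:
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)  # pad to equal length
    carry, out = 0, []
    # Walk from least-significant digit to most-significant.
    for da, db in zip(reversed(a), reversed(b)):
        s = int(da) + int(db) + carry
        out.append(str(s % 10))
        carry = s // 10
    if carry:
        out.append(str(carry))
    return "".join(reversed(out))

print(add_digitwise("987654321987654321", "123456789123456789"))
# 1111111111111111110
```

Whether the network internally learned something analogous to this carry-propagation rule is an open question, but it shows a compact algorithm suffices, so "memorize every sum" is not the only way a learner could succeed.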