To some extent, I think the axiom comparison is enlightening. In principle, axioms determine the entire space of all mathematical truths in that system. However, in practice, not all truths in a mathematical system are as easy to discover or verify. Knowing the axioms of arithmetic doesn't mean you don't get anything out of computing how much 271819 × 637281 actually is.
The claims about LLM reasoning are precisely related to this point. Do LLMs follow some internal deductive process when they generate output that resembles such a logical process to humans? Or are they just producing text that looks like reasoning, much like an absurdist play might do, and then simply picking a conclusion that resembles other problems they've seen in the past?
I don't think any arguments about the base nature of the model are particularly helpful here. In principle, deductive reasoning can be expressed as a mathematical function, and for any function, there is a neural net that can approximate it with arbitrary precision. So it's not impossible that the model actually does this, but it's also not a given - this first principles approach is just not helpful. We need more applied study of how the model actually works to probe this deeper.
The claims about LLM reasoning are precisely related to this point. Do LLMs follow some internal deductive process when they generate output that resembles such a logical process to humans? Or are they just producing text that looks like reasoning, much like an absurdist play might do, and then simply picking a conclusion that resembles other problems they've seen in the past?
I don't think any arguments about the base nature of the model are particularly helpful here. In principle, deductive reasoning can be expressed as a mathematical function, and for any function, there is a neural net that can approximate it with arbitrary precision. So it's not impossible that the model actually does this, but it's also not a given - this first principles approach is just not helpful. We need more applied study of how the model actually works to probe this deeper.