Every time I see people online reduce the human thinking process to just the production of a perceptible output, I start wondering whether I am somehow the only human on this planet capable of thinking and everyone else is just pretending. That can't be right. It doesn't add up.
The answer is that both humans and the model are capable of reasoning, but the model is more restricted in the reasoning it can perform, since it must conform to the dataset. This means the model is not allowed to spend tokens that do not immediately represent the answer but would have to be derived on the way to it. Since these thinking tokens are not part of the dataset, the reasoning the LLM can perform is confined to the parts of the model that are not subject to the straitjacket of the training loss. Therefore most of the reasoning occurs between the first and last layers and ends at the last layer, at which point the produced token must cross the training-loss barrier. Tokens that invest in the future but are not in the dataset get rejected, and this limits the LLM's ability to reason.
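To make the "training-loss barrier" concrete, here is a minimal sketch (my own illustration, not from the original argument) of the standard next-token cross-entropy objective. Every emitted token is scored against the dataset token at that position, so an intermediate "thinking" token that is not in the dataset would simply raise the loss and get trained away:

```python
# Minimal sketch of teacher-forced next-token training (assumed setup,
# not the author's code): any token that differs from the dataset token
# at its position is penalized, regardless of how useful it might be
# as an intermediate reasoning step.
import torch
import torch.nn.functional as F

vocab_size = 8
seq_len = 4

# Pretend logits the model produced for each position (batch of 1).
logits = torch.randn(1, seq_len, vocab_size)

# The dataset continuation the model must match during training.
targets = torch.tensor([[3, 1, 7, 2]])

# Cross-entropy over every position: a "thinking" token that invests in
# the future but is absent from the dataset increases this loss.
loss = F.cross_entropy(logits.view(-1, vocab_size), targets.view(-1))
print(loss.item())
```

Nothing in this objective rewards tokens that only pay off later, which is the point: the only reasoning that survives training is the kind that fits inside the forward pass, before the output has to match the data.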