It's presented as a feature when GPT provides a correct answer.
It's presented as a limitation when GPT provides an incorrect answer.
Both of these behaviors are literally the same. We are sorting them into the subjective categories of "right" and "wrong" after the fact.
GPT is fundamentally incapable of modeling that difference. A "right answer" is every bit as valid as a "wrong answer". The two are equivalent in what GPT is modeling.
Lies are a valid feature of language. They are shaped the same as truths.
The only way to resolve this problem is brute force: provide every unique construction of a question, and the corresponding correct answer to that construction.
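To make that concrete, here's a minimal sketch of what "brute force" would actually mean (the phrasings and answers are invented for illustration): an explicit table from every unique construction of the question to a verified answer, which is exactly what doesn't scale.

```python
# Minimal sketch of the brute-force approach: an explicit table mapping
# every unique construction of a question to its verified answer.
# Phrasings and answers here are invented for illustration.
verified_answers = {
    "what is the boiling point of water at sea level?": "100 °C (212 °F)",
    "at sea level, what temperature does water boil at?": "100 °C (212 °F)",
    # ...and so on, one entry per unique construction
}

def answer(question: str) -> str:
    # Exact-match lookup only: any phrasing not in the table is a miss,
    # which is why this approach can't cover natural language.
    return verified_answers.get(question.strip().lower(), "unknown")

print(answer("What is the boiling point of water at sea level?"))  # 100 °C (212 °F)
```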
Not entirely. It's modeling a completion in a given context. The language model "understands" that if one party stops speaking, the other party generally starts, etc. It also "understands" that if someone says something "wrong", the other party often mentions it, which makes the first party respond accordingly, and so forth.
If you ask it what effect a lie has on a conversation, it can generally answer. If you ask it for a sample conversation where someone is factually incorrect, or lying, and gets caught out, it can generate it.
If you give it a fact and ask it to lie about that fact, it will.
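For what it's worth, those three prompts are easy to try directly. A rough sketch, assuming the OpenAI Python client (openai >= 1.0); the model name and the wording of the prompts are my own, not anything canonical:

```python
# Rough sketch of the three prompts described above, assuming the OpenAI
# Python client (openai >= 1.0). Model name and prompt wording are
# illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

prompts = [
    "What effect does a lie usually have on the rest of a conversation?",
    "Write a short dialogue where one speaker says something factually "
    "incorrect and the other speaker catches them out.",
    "Fact: water boils at 100 °C at sea level. Write one sentence that "
    "lies about that fact.",
]

for prompt in prompts:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model name
        messages=[{"role": "user", "content": prompt}],
    )
    print(response.choices[0].message.content, "\n---")
```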
I'd agree it doesn't understand anything, but I think it does "understand" things. And yes, it's a language model so semantic distance and other textual details are all it has to go by.
> not by logical decision
Almost entirely, yes, but you can have it textually model a logical analysis and then check that model itself. It's not "doing logic", but it almost never fails simple exercises either.
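Something like this two-pass pattern is what I mean by "model the logic as text, then check that text". Same assumed client and model as the sketch above; the question and prompts are illustrative.

```python
# Sketch of "textually model the logic, then check that model itself":
# pass 1 asks for numbered reasoning, pass 2 feeds that reasoning back
# and asks the model to check it. Assumes the OpenAI Python client
# (openai >= 1.0); model name and prompts are illustrative.
from openai import OpenAI

client = OpenAI()

def complete(prompt: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model name
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

question = ("If all widgets are gadgets and no gadgets are gizmos, "
            "can a widget be a gizmo?")

# Pass 1: have it write the logical analysis out as text.
analysis = complete(f"Reason step by step, numbering each step:\n{question}")

# Pass 2: have it check the text it just produced.
verdict = complete(
    "Here is a numbered argument. For each step, say whether it follows "
    f"from the previous steps, then give a final verdict:\n\n{analysis}"
)
print(verdict)
```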
> The meaning of semantic distance usually leads to the correct path, but sometimes that pattern is ambiguous.
Of course. But "a little knowledge is a dangerous thing" as well. Often even real knowledge and analysis leads to the wrong place. In both cases (with a junior human or an LLM as an assistant) you can model their basic processes and stack the information in such a way that their simple model will lead them to the correct place.
It may not know what a lie is, in the sense of having felt the need to hide the truth to avoid personal punishment, but it certainly "knows" what one is and how it shapes the conversation for the purposes of writing a lie, writing a response to a lie, detecting potential lies, etc.