> This generates Answers which are sequenced according to "frequency-guided heuristic searching" (I guess a kind of "stochastic A* with inlined historical data")
This sounds like far too simplistic an understanding. Transformers aren't just heuristically pulling token cards out of a randomly shuffled deck; they sit upon a knowledge graph of embeddings that creates a consistent structure representing the underlying truths and relationships.
The unreliability comes from the fact that, within the response tokens, "the correct thing" may be replaced by "a thing like that" without completely breaking these structures and relationships. For example: in the nightmare scenario of a STRAWBERRY, the letters themselves carried very little distinguishing signal in relation to the concept of strawberries, so they got miscounted (I assume this has been fixed in every pro model). BUT I don't remember any 2023 models such as claude-3-haiku making fatal logical errors such as saying "P" and "!P" while assuming ceteris paribus, unless you jumped through hoops trying to confuse it and find weaknesses in the embeddings.
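A minimal sketch of the letter-counting point, assuming a BPE-style split that is made up for illustration rather than taken from any real tokenizer:

```python
# Toy illustration of why letter-counting is hard for a subword model.
# The split and IDs below are assumed, not the output of a real tokenizer.
subword_split = ["str", "aw", "berry"]               # hypothetical pieces for "strawberry"
vocab_ids = {"str": 4821, "aw": 902, "berry": 7760}  # made-up vocabulary IDs

# What the model actually conditions on: integer IDs, not characters.
token_ids = [vocab_ids[piece] for piece in subword_split]
print(token_ids)                  # [4821, 902, 7760] -- no letters in sight

# Counting letters needs information the IDs don't carry directly;
# a plain string has it trivially:
print("strawberry".count("r"))    # 3
```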
You've just given me the heuristic, and told me the graph -- you haven't said A* is a bad model, you've said it's exactly the correct one.
However, transformers do not sit on a "knowledge graph", since the space is not composed of discrete propositions set in discrete relationships. If it were, then P(PrevState|NextState) = 0 would obtain for many pairs of states -- and this would destroy the transformer's ability to make progress.
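A minimal sketch of that probability point, with made-up logits: a softmax over logits assigns strictly positive probability to every next token, so no transition is ever categorically ruled out the way a missing edge in a discrete graph would be.

```python
import numpy as np

def softmax(logits):
    # Shift for numerical stability, then normalize.
    z = np.exp(logits - logits.max())
    return z / z.sum()

# Made-up logits for a tiny 5-token vocabulary.
logits = np.array([8.0, 2.5, -1.0, -6.0, -12.0])
probs = softmax(logits)

print(probs)               # every entry is > 0, however small
print((probs > 0).all())   # True: no "forbidden" transition, unlike a
                           # discrete graph where a missing edge means P = 0
```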
So rather than 'deviation from the truth' being an accidental symptom, it is essential to its operation: there is no true/false distinction between propositions available for the model to operate on in the first place.
> making fatal logical errors such as saying "P" and "!P"
Since it doesn't employ propositions directly, how you interpret its output in propositional terms will determine whether you think it's saying P&!P. This "interpreting-away" effect is common in religious readings of texts, where the text is divorced from its meaning and a new one is substituted to achieve apparent coherence.
Nevertheless, if you're asking (Question, Answer)-style prompts where there is a canonical answer to a common question, then you're not really asking it to "search very far away" from its inlined historical data (the ersatz knowledge-graph that it does not possess).
These errors become more common when the questions require posing several counterfactual scenarios derived from the prompt, or otherwise have non-canonical answers which require integrating disparate propositions given in the prompt.
The prompt's propositions each compete to drag the search in various directions, and there is no constraint on where it can be dragged.
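A minimal sketch of that "dragging" picture, using toy single-head attention with random stand-in vectors: each prompt token gets a strictly positive softmax weight, so every one of them pulls the next-token representation somewhere, and nothing hard-excludes any direction.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16          # toy embedding width
n_prompt = 6    # prompt tokens, standing in for the prompt's "propositions"

# Made-up query for the position being predicted, keys/values for the prompt.
q = rng.normal(size=d)
K = rng.normal(size=(n_prompt, d))
V = rng.normal(size=(n_prompt, d))

scores = K @ q / np.sqrt(d)               # scaled dot-product attention scores
weights = np.exp(scores - scores.max())
weights /= weights.sum()                  # softmax: all weights strictly positive

pulled = weights @ V                      # convex combination of the prompt's values
print(weights.round(3))  # every prompt token contributes; none is hard-excluded
```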
I am not going to engage with your A* proposition. I believe it to be irrelevant.
> However, transformers do not sit on a "knowledge graph", since the space is not composed of discrete propositions set in discrete relationships.
This is the main point of contention. By all means, embeddings are a graph: you can represent their structure as a graph, though not as a tree. Sure, they are essentially points in space, but a graph emerges as the architecture starts selecting tokens for use according to the learned parameters during inference. It will always be the same graph for the same set of tokens for a given data set which provides "ground truth". I know it sounds metaphoric, but bear with me.
The above process doesn't result in discrete propositions like we have in Prolog, but the point is, it is "relatively" meaningful, and you seed a traversal by bringing tokens to the attention grid. What I mean by relatively meaningful is that inverse relationships are far enough apart that they won't usually be confused, so there is less chance of meaningless gibberish emerging, which matches what we observe.
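A minimal sketch of that "emergent graph" claim, with a made-up embedding matrix standing in for a learned embedding table: a fixed set of embeddings induces the same nearest-neighbour graph every time you build it, even though the embeddings themselves are just points in space.

```python
import numpy as np

# Made-up embedding matrix: 6 tokens, 8 dimensions. For a real model this
# would be the learned embedding table; the point is only that it is fixed.
rng = np.random.default_rng(42)
E = rng.normal(size=(6, 8))
tokens = ["t0", "t1", "t2", "t3", "t4", "t5"]   # placeholder token names

def knn_graph(E, k=2):
    # Cosine similarity between all embedding pairs.
    unit = E / np.linalg.norm(E, axis=1, keepdims=True)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)              # exclude self-edges
    # Each token connects to its k most similar neighbours.
    return {tokens[i]: [tokens[j] for j in np.argsort(sim[i])[-k:]]
            for i in range(len(tokens))}

# The same E always yields the same graph -- the emergent structure the
# comment is pointing at, without any discrete propositions involved.
print(knn_graph(E))
```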