You would need to extract logical patterns and concepts somehow, not just word relationships. I know what you mean: this introduces another level of abstraction above the word relationships. If there is no way to extract these patterns, or if there are no real logical patterns present, only statistical relationships between words (larger model = more relationships = better prompt following, etc.) without any real 'emergent abilities', then Transformers are essentially a dead end in the context of AGI.