> the “behavior” breaks down in ways that would be surprising from a human who’s actually paying as much attention as they’d need to be to have been interacting the way they had been until things went wrong.
That's an interesting angle. Though of course we're not surprised by human behavior because that's where our expectations of understanding come from. If we were used to dealing with perfectly-correctly-understanding super-intelligences, then normal humans would look like we don't understand much and our deliberate thinking might be no more accurate than the super-intelligence's absent-minded automatic responses. Thus we would conclude that humans are never really thinking or understanding anything.
I agree that default LLM output makes them look like they're thinking like a human more than they really are. I think mistakes are shocking more because our expectation of someone who talks confidently is that they're not constantly revealing themselves to be an obvious liar. But if you take away the social cues and just look at the factual claims they provide, they're not obviously not-understanding vs humans are-understanding.
That's an interesting angle. Though of course we're not surprised by human behavior because that's where our expectations of understanding come from. If we were used to dealing with perfectly-correctly-understanding super-intelligences, then normal humans would look like we don't understand much and our deliberate thinking might be no more accurate than the super-intelligence's absent-minded automatic responses. Thus we would conclude that humans are never really thinking or understanding anything.
I agree that default LLM output makes them look like they're thinking like a human more than they really are. I think mistakes are shocking more because our expectation of someone who talks confidently is that they're not constantly revealing themselves to be an obvious liar. But if you take away the social cues and just look at the factual claims they provide, they're not obviously not-understanding vs humans are-understanding.