That's why I said "I think there is a good chance" - I think what you describe here (anticipatory obedience) is possible too, but I honestly wouldn't be surprised to hear that the from:elonmusk searches genuinely were unintended behavior.
I find this almost more interesting as accidental behavior than as a deliberate choice.
I side with Occam's razor here, and with another commenter in this thread. People are constructing entire conspiracy theories to explain fake replies when asked for the system prompt, lying in GitHub repos, etc.