
On topics like history or biology, if a model's answer is surprising I might check Wikipedia and call it out on its bullshit by explaining how Wikipedia contradicts it and pasting the relevant excerpt. But frankly, if the model can't even reliably internalize Wikipedia, I don't have much hope for complex feedback training based on my chats.

While it's possible Wikipedia is wrong, the model agrees with me whenever I correct it, whether I'm right or not, so my corrections aren't going to be a useful training signal either.

Of course for anything high stakes relying on a model probably isn't a great idea.





