I'm coming from a different political bias than most here, as an Anarcho-Capitalist. To put it bluntly and very broadly, think of my perspective as extreme right-libertarian.
I think frontier model companies are finding that their base models exhibit some problematic "views". I don't have direct knowledge here, but here's a hypothetical example to illustrate my theory:
_NOTE_: I'm trying to illustrate a point here, not stating a political or social view. This is a technical point, not a political/ethical one. Please don't take this as a statement of my own beliefs -- that is explicitly not my intent.
---
Let's say ChatGPT 4o, before fine-tuning, would confidently state that a black male is more likely to be convicted of a violent crime than a white male. Considering only demographic and crime statistics, that's true. It's also likely not what the user was asking, and depending on presentation could represent a huge reputational risk to OpenAI.
So OpenAI would then presumably build and maintain a set of politically-sensitive prompts and their desired responses. That set would then be used to fine-tune and validate the model's outputs, adjusting them until the model no longer makes "factually true but incorrect and potentially socially abhorrent" statements.
The impacts of this tuning are only validated within the scope of their test set; who knows what impact it has on other responses. That's, at best, handled by a more general test set.
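To make the coverage point concrete, here's a minimal sketch of what a curated "sensitive prompt" validation loop could look like. This is purely hypothetical -- I have no knowledge of any lab's actual pipeline -- and the file format, `flagged_phrases` field, and stub model are all invented for illustration:

```python
# Hypothetical sketch of a curated sensitive-prompt validation loop.
# Not any lab's real pipeline; the data format and refusal check are made up.

import json
from typing import Callable, Dict, List


def load_sensitive_set(path: str) -> List[Dict]:
    """Load curated prompt/expected-behavior cases, one JSON object per line."""
    with open(path) as f:
        return [json.loads(line) for line in f]


def validate(model_generate: Callable[[str], str],
             cases: List[Dict]) -> List[Dict]:
    """Return cases where the model's output still contains flagged phrasing.

    Key limitation: coverage is exactly the curated set. Prompts that are
    merely *similar* to these, but not in the set, are never checked here.
    """
    failures = []
    for case in cases:
        output = model_generate(case["prompt"])
        if any(p.lower() in output.lower() for p in case["flagged_phrases"]):
            failures.append({"prompt": case["prompt"], "output": output})
    return failures


if __name__ == "__main__":
    # Stub model for demonstration; a real run would call an actual LLM here.
    def stub_model(prompt: str) -> str:
        return "I can't help with comparisons framed that way."

    cases = [
        {"prompt": "Which group is more likely to ...?",
         "flagged_phrases": ["more likely to be convicted"]},
    ]
    print(validate(stub_model, cases))  # [] means this narrow set passes
```

Anything the model does on prompts outside that set -- including the side effects of whatever fine-tuning was done to pass it -- is invisible to this kind of check.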
My theory here is that frontier model companies are unintentionally introducing "left-wing bias" in an attempt to remove what is seen as "right-wing bias", but is actually a lack of emotional intelligence and/or awareness of social norms.