
Not surprising. They’re almost assuredly trained on reddit data. We should probably call this “the reddit simp bias”.


To be honest, I am not sure where this bias comes from. It might be in the Web data, but it might also be overcorrection from alignment tuning. LLM providers are worried that their models will generate sexist or racist remarks, so they tune them to be really sensitive towards marginalized groups. This might also explain what we see. Previous generations of LMs (BERT and friends) were mostly pro-male, and they were trained purely on Web data.


Patriarchal values can, at face value, seem contradictory, but it all checks out.

Part of it is that we naturally have a bias to view men as "doers". We view men as more successful, yes, and perhaps smarter. When we think doctor we think of a man; when we think lawyer we think of a man. Even in sex, we view men as the ones "doing" and women as the ones it is done to.

But men are also "doers" of violence and conflict. Women, conversely, are seen as too passive and weak to be murderers or rapists. In fact, with regard to rape, because we view sex as something done by men to women, a lot of people have the bias that women cannot even be rapists.

This is why we simultaneously have these biases where we picture success as related to men, yet sentence men more harshly in criminal justice. It's not because we view men as "good"; it's because we view them as ambitious. We end up with this strange situation where being a woman makes you significantly less likely to be convicted of a crime you committed, and, if you are convicted, likely to get significantly less time. Men are perpetrators (active) and women are victims (passive).


Surely some of the model bias comes from targeting benchmarks like this one. It takes left-wing views as axiomatically correct and then classifies any deviation from them as harmful. For example, if the model correctly understands the true gender ratios in various professions, that understanding is declared a "stereotype" the model should be fixed to remove in order to reduce harm.

I'm not saying any specific lab uses your benchmark as a training target, but it wouldn't be surprising if they did, or had built similar in-house benchmarks. Using them as a target will always yield strong biases against groups the left dislikes, such as men.


> It takes left-wing views as axiomatically correct

This is painting with such a broad brush that it's hard to take seriously. "Models should not be biased toward a particular race, sex, gender, gender expression, or creed" is actually a right-wing view; it's a line that appears often in Republican legislation. And when your model has an innate bias, attempting to correct it seems like a right-wing position. Such corrections may be imperfect and swing the other way, but that's a bug in the implementation, not a condemnation of the aim.


Let's try to keep things separated:

1. The benchmark posted by the OP and the test results posted by Rozado are related but different.

2. Equal opportunity and equity (equal outcomes) are different.

Correcting LLM biases of the form shown by Rozado is absolutely something the right would support, because such biases risk compromising equal opportunity, but this subthread is about GenderBench.

GenderBench views a model as defective if, when forced to choose, it assumes things like an engineer being more likely to be a man when no other information is given. This is a fact about the world: a randomly sampled engineer is more likely to be a man than a woman. Stating it isn't viewed as wrong or immoral on the right, because the right doesn't care whether gender ratios end up 50/50 as long as everyone was judged on their merits (which isn't quite the same thing as equal opportunity but is taken to be close enough in practice). The right believes that men and women are fundamentally different, so there's no reason to expect equal outcomes to result from equal opportunities. Referring to an otherwise ambiguous engineer as "he" is therefore not being biased but being "based".
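
For concreteness, here's a minimal sketch of the kind of forced-choice probe such a benchmark might run. The prompt wording, the query_model stand-in, and every name here are illustrative assumptions, not GenderBench's actual code:

    from collections import Counter

    # Hypothetical stand-in for a real LLM client; replace with an API call.
    def query_model(prompt: str) -> str:
        return "he"  # toy response so the sketch runs end to end

    TEMPLATE = ('The {p} finished the report. Answer with exactly one word, '
                '"he" or "she": who went home afterwards?')

    def he_rate(profession: str, n_samples: int = 50) -> float:
        """Force a binary pronoun choice and return the fraction of 'he'."""
        counts = Counter(query_model(TEMPLATE.format(p=profession)).strip().lower()
                         for _ in range(n_samples))
        return counts["he"] / max(1, counts["he"] + counts["she"])

An equity-based benchmark flags he_rate("engineer") far above 0.5 as a harm; the objection above is that 0.5 is the wrong reference point if the real-world ratio isn't 0.5.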

The left believes the opposite because of a commitment to equity over equal opportunity, mostly due to the beliefs that (a) equal outcomes are morally better than unequal outcomes, and (b) choice of words can influence people's choice of profession, so apparently arbitrary choices in language use have a moral valence. True beliefs about the world are often described as "harmful stereotypes" in this worldview, implying either that they aren't really true or at least that stating them out loud should be taboo. To someone on the right it hardly makes sense to talk about stereotypes at all, let alone harmful ones; they would be more likely to talk about "common sense" or some other phrasing that implies a well-known fact rather than an illegitimate prejudice.

Rozado takes the view that LLMs having a built-in bias against men in their decision making is bad (a right-wing take), whereas GenderBench holds that the model should work towards equity (a left-wing view). It says: "We categorize the behaviors we quantify based on the type of harm they cause: Outcome disparity - Outcome disparity refers to unfair differences in outcomes across genders."
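
To make "outcome disparity" concrete, here's an illustrative metric of the sort such tests compute; this is my own sketch, not GenderBench's or Rozado's implementation, and outcome_disparity is a made-up name:

    def outcome_disparity(outcomes_male: list[bool],
                          outcomes_female: list[bool]) -> float:
        """Gap in favourable-outcome rates between gender-swapped prompts.

        Run the same decision prompt (hiring, sentencing, etc.) with only
        the subject's gender swapped; 0.0 means parity, and the sign shows
        which group the model favours.
        """
        rate = lambda xs: sum(xs) / len(xs)
        return rate(outcomes_male) - rate(outcomes_female)

    # e.g. 70% favourable for male-coded prompts vs 80% for female-coded:
    print(outcome_disparity([True] * 7 + [False] * 3,
                            [True] * 8 + [False] * 2))  # -> -0.1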

Edit: s/doctor/engineer/, as in Europe/NA doctor gender ratios are almost equal; it's only globally that the profession skews male.


This bias about who is the victim versus the aggressor goes back before reddit. It's the stereotype that women are weak and men are strong.



