The unconditional probability P(male) is around 1/2 or a little less.
Then Prob(male | knows word) = Prob(knows word | male) P(male) / P(knows word).
---
Now, we're not given P(knows word), but assuming answers are from a reasonably balanced sample (P(male) = P(female) = 0.5), we know that
P(knows word) = P(male)P(knows word|male) + P(female)P(knows word|female) = [P(knows|m) + P(knows|f)]/2
Going back:
Prob(m | knows) = 0.5 P(knows|m) / (0.5 [P(knows|m) + P(knows|f)]) = P(knows|m) / [P(knows|m) + P(knows|f)]
Which gives us a formula. E.g. for peplum, Prob(m|knows) is
13%/(13%+64%) = 13/77 ~= 16.9%
For "shemale":
88%/(88%+54%) = 88/142 ~= 62.0%
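A minimal Python sketch of the same calculation, in case anyone wants to plug in other words. The function name and the p_male parameter are my own (hypothetical) choices; the default of 0.5 is the balanced-sample assumption from above, and the recognition rates are the two examples already quoted.

```python
def p_male_given_knows(p_knows_m, p_knows_f, p_male=0.5):
    """Bayes' rule for P(male | knows word).

    p_knows_m: P(knows word | male)
    p_knows_f: P(knows word | female)
    p_male:    prior P(male); 0.5 is an assumption (balanced sample)
    """
    p_female = 1.0 - p_male
    # Law of total probability: P(knows word)
    p_knows = p_male * p_knows_m + p_female * p_knows_f
    return p_male * p_knows_m / p_knows


# The two worked examples: (P(knows | male), P(knows | female))
examples = {"peplum": (0.13, 0.64), "shemale": (0.88, 0.54)}

for word, (m, f) in examples.items():
    print(f"{word}: P(male | knows) = {p_male_given_knows(m, f):.1%}")
```

Passing a p_male slightly below 0.5 shows how much the result moves if the sample isn't perfectly balanced.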
So sometimes the actual "maleness" or "femaleness" of the word is overstated, while sometimes it's understated.
This isn't a critique of the article; it's a literal comment.
--- Edit:
The drive to procrastinate today is strong. Here are the probabilities for all words.
https://colab.research.google.com/drive/1-UP3qTJ3GZ3BpsA0ZNa...