
Agreed, it bothers me that this tactic is so successful. I could not find an instance in this "discussion" where Gebru outlined any concrete steps to solve for, or lessen, bias in ML. It seems she mainly wants to complain rather than fix the problem. I have been googling for ideas to lessen ML bias, and the best I've found is "diverse training datasets", which Gebru herself says "is not enough".

Well what is then? Outrage without a solution is useless.



The discussion mentioned the Tutorial on Fairness Accountability Transparency and Ethics in Computer Vision at CVPR 2020. The tutorial was recorded in three parts. The third part[1] includes solutions. These are either direct quotes or paraphrases of the bullet points contained therein.

1. Disaggregated Evaluations[2][3], Counterfactual Testing[4], Interpretability Methods[5] (a minimal sketch of a disaggregated evaluation follows the references below)

2. Recognize [l]imitations(sic) of technical approaches (this discussion)

3. Model documentation frameworks[2]

4. Standardized framework for transparent dataset documentation [6]

5. Positionality awareness [7]

6. Actively follow the perspectives of people in marginalized groups

7. Make intentional design choices to privilege the perspectives of marginalized stakeholders who are at most risk of being harmed by the technology we develop

8. Value interdisciplinary and 'non-technical' work

Note: These are some of the sources given as examples. Some are omitted for the sake of time.

Also note: Just because there are points doesn't obligate you to agree with them

[1]. https://www.youtube.com/watch?v=vpPpwa7W93I&t=1499s

[2] https://arxiv.org/pdf/1810.03993.pdf

[3] http://proceedings.mlr.press/v81/buolamwini18a/buolamwini18a...

[4] https://research.google/pubs/pub46743/

[5] https://arxiv.org/pdf/1711.11443.pdf

[6] https://arxiv.org/pdf/1803.09010.pdf

[7] https://dl.acm.org/doi/abs/10.1145/3351095.3375666
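
To make point 1 a little more concrete, here is a minimal sketch of what a disaggregated evaluation in the spirit of [2][3] looks like: the same metric, reported per demographic subgroup rather than as a single aggregate. The function and variable names are my own, for illustration only, not anything taken from those papers.

    from collections import defaultdict

    def disaggregated_accuracy(examples, predict):
        """examples: iterable of (features, label, group) tuples;
        predict: a function mapping features to a predicted label."""
        correct = defaultdict(int)
        total = defaultdict(int)
        for features, label, group in examples:
            total[group] += 1
            if predict(features) == label:
                correct[group] += 1
        # Per-group accuracy instead of one aggregate number.
        return {g: correct[g] / total[g] for g in total}

An aggregate accuracy of, say, 0.93 can hide a subgroup sitting at 0.70; that is exactly the kind of gap reported in [3].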


To clarify, the Tutorial on Fairness Accountability Transparency and Ethics in Computer Vision is not a technical presentation; it is very much an intersectionalist take on the topic, more concerned with power, oppression, and social justice than with engineering.

The suggested technical solutions [1 above] amounted to exactly one sentence, the only sentence in the presentation that suggested any sort of technical approach, and it was presented without elaboration; the recommended solutions, like counterfactual testing, are also methods for fixing the training data.

The remaining 99% of the presentation was spent on the idea that if AI engineers and scientists were more diverse and attended more diversity training, the models they produce hopefully wouldn't be biased.


I think this is a fundamental misunderstanding.

Suppose I dismissed cryptanalysis by suggesting that anyone involved in breaking cryptography ought to produce better cryptography themselves.

Suppose I dismissed a piece of medical research showing that some drug is no better than a placebo, and demanded that the researchers produce their own drug that works better than the one they presume to criticise.

Critique is valuable. The entire field of science is built on absorbing critique, and on checks and balances that ensure we reason from robust evidence rather than hasty assertions.

In other words: suppose Gebru identifies that some particular ML model is racially biased. That insight, if true, is in itself a valuable contribution to the field. Suppose Gebru further develops an argument about whether different choices of training data would (or would not) substantially fix that bias. That, too, is a valuable contribution to the field, and this kind of work is not at all the same thing as pointless "outrage".


I think Yann's main point is that this specific model's racial bias was a training data problem; Gebru has not provided any evidence otherwise, as far as I know, apart from repeatedly asserting that better training data won't fix the problem.

It would indeed be a valuable contribution if Gebru had shown that better training data would not have fixed the problem (e.g. by feeding better training data to the same model and still reproducing the issue).
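
For what it's worth, that experiment is easy to sketch. The outline below assumes a hypothetical model_fn that builds a fresh copy of the same architecture with a scikit-learn-style fit interface, two training sets (one skewed, one demographically balanced), and a per-group evaluation function like the one sketched earlier in the thread; none of these names come from the PULSE paper.

    def compare_training_data(model_fn, datasets, evaluate):
        """model_fn: builds a fresh model (same architecture, same hyperparameters);
        datasets: e.g. {"skewed": (X, y), "balanced": (X, y)};
        evaluate: model -> per-group metrics."""
        results = {}
        for name, (X, y) in datasets.items():
            model = model_fn()   # identical model every run
            model.fit(X, y)      # only the training data changes
            results[name] = evaluate(model)
        return results

If the per-group gap persists on the balanced data, that would be evidence the bias is not purely a dataset problem; if it closes, it supports LeCun's claim.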

But as it is, she asserts without proof that better training data won't help, recommends no alternatives, and takes a hostile tone. I would be hard pressed to take her claims as good-faith criticism.


Adding some more sources to show that these scholars are not raising issues without concrete solutions: Gebru and others have proposed many solutions, and they address more concrete problems than this easily solved question of statistical bias. They are mostly not focused on the issue of statistical bias, which, as LeCun and Gebru both pointed out, is more or less beside the point to everyone.

Podcast:

- https://www.wnycstudios.org/podcasts/science-friday/segments...

Articles:

- https://www.propublica.org/article/machine-bias-risk-assessm...

- https://themarkup.org/locked-out/2020/05/28/access-denied-fa...

- https://ainowinstitute.org/AI_Now_2019_Report.pdf

Book:

- https://www.ruhabenjamin.com/race-after-technology


I need to double-check, but from what I can recall, while she didn't mention any solutions or alternatives, she did point him to her previous work on the topic (though she wasn't specific, and I don't think she even included a link).


If you have diverse datasets, but your [prison recidivism, credit risk, or whatever] model doesn't like [giving light sentences to, issuing credit to] things correlated with being black... what then?

Or, for facial recognition, etc. -- if you have done all the research with white faces, and then someone complains that the system doesn't work for black people -- what then? Just saying "oh, use a different training dataset" to handwave things away doesn't mean the resulting system will work well for minorities. The camera or optical system may not handle darker skin well either, and without additional validation you can't find these effects.

IMO, you don't need to know how to fix a problem where injustice is occurring to complain about and protest the injustice.
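
To make the "what then" a bit more concrete: one standard answer is a counterfactual test in the spirit of [4] upthread -- perturb a suspected proxy feature and see whether the decision changes. The toy sketch below assumes records are dicts of features and that the model exposes a predict(record) method; the "zip_code" proxy is invented for illustration.

    def counterfactual_flip_rate(model, records, proxy_key, alt_value):
        """Fraction of records whose prediction changes when only proxy_key is edited."""
        flips = 0
        for record in records:
            original = model.predict(record)
            edited = {**record, proxy_key: alt_value}  # change only the proxy feature
            if model.predict(edited) != original:
                flips += 1
        return flips / len(records)

    # e.g. counterfactual_flip_rate(credit_model, applicants, "zip_code", "some_other_zip")

A high flip rate on something like zip code suggests the model has learned a proxy for race, which "more diverse training data" alone will not remove.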


To expand a bit on your question directly, and to agree with your other points obliquely:

> If you have diverse datasets, but your ... model doesn't like ... things correlated with being black... what then?

How do you release your data/models today? Does your release process include a section detailing what the data implies or the limitations of the model? How would your boss handle being exposed to those findings? If your boss still wants to use the dataset or deploy the model knowing the limitations, do you care enough to do something about it? Maybe you have a different set of answers to these questions. Does that make asking them any less important?
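
To make the release-process question concrete: the model cards framework [2 in the list upthread] essentially asks you to ship something like the structure below with every model. The dict layout and the placeholder entries are my own illustration, not the paper's format.

    # Pared-down, illustrative subset of model-card fields; every value here is a
    # placeholder to be filled in from your own evaluation, not real data.
    model_card = {
        "intended_use": "...",
        "out_of_scope_uses": ["...", "..."],
        "training_data": "source, collection process, demographic composition",
        "evaluation": {
            "aggregate_metric": None,         # the number most releases already report
            "disaggregated_by_group": None,   # the numbers most releases omit
        },
        "known_limitations": ["..."],
        "ethical_considerations": "...",
    }

If a release has no honest answer for the disaggregated row, that is usually the first thing to fix.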


Hire more minorities in ML. One of the few industries where forced diversity is a good thing since your models will be better (wider range of opinions and sources).

See, it's not actually hard. Gebru is just correct in saying you are not listening.


The underlying (racially biased) superresolution algorithm was called PULSE and the code was released as an artifact of this paper:

Paper: https://arxiv.org/pdf/2003.03808v1.pdf

Here are the authors: Sachit Menon, Alexandru Damian, Shijia Hu, Nikhil Ravi, Cynthia Rudin

They conveniently captured an image of them all in one place for us here - https://cdn.telanganatoday.com/wp-content/uploads/2020/06/Au...

Pretty diverse team if you ask me. Sprinkling underrepresented (and therefore 'underpowered') groups into engineering teams will solve a lot of issues, but I think it has yet to be proven that it's going to help with ML bias.


You give too much credit to the people who build those systems. Having more minorities in ML is a good thing, but I don't think it's gonna change anything regarding bias in ML systems.


ML, like most programming fields, has quite a few minorities - many organizations are majority-minority.


Yes, but that doesn't mean representative of all minorities. Very few Black people and Native Americans, for instance, even if Asian people are represented well.


This is immaterial. What about blind folks, or quadriplegics, or folks with chromosomal defects? Adding a black person to a team and expecting them to somehow fix anything beyond what they are trained to work on is unfair and unrealistic.

As I mentioned before, this is the team that created the PULSE algorithm that made Obama look like a white guy from Arizona - https://cdn.telanganatoday.com/wp-content/uploads/2020/06/Au...


When the rationale is that someone with the experience of being disadvantaged might be particularly inclined to help build a less biased system, one hardly refutes it by pointing out that there are a lot of [not very disadvantaged] minorities on the team.

But one might earn some pedantry points with the argument.


What's the chance that a black engineer on an ML team doing work like this isn't also from the 'corporate (or educated) class'?

https://news.ycombinator.com/item?id=23697472

Black cops shoot black men all day long. I don't see why black ML folks are going to be completely immune to all of the pressures that have allowed the domain to get to its current state. While I'm sure many would accept the challenge, it's waaay too much to ask or expect of them.

Yes absolutely this issue needs diverse perspectives and the priorities they bring, but it will fail without objective standards to ensure that technology in the field reliably meets our expectations.


> Yes absolutely this issue needs diverse perspectives and the priorities they bring, but it will fail without objective standards to ensure that technology in the field reliably meets our expectations.

It sounds like we mostly agree. It certainly doesn't hurt to have someone who has been the subject of unfair profiling before, whose spidey sense will go off at the implications of the system, but it isn't a cure-all.


Yep!


Sure, and in some cases that does matter. But it seems unlikely that the PULSE model was biased towards white people because of latent racism, given that 3 of the 5 researchers who built it weren't white.



