
> I don't follow this. If we have a biased objective function, the model won't surface any biases we weren't already cognizant of in the objective function. And they were already quantifiable: we had a function that we were using to evaluate the model. We could use that same function on whatever non-model evaluation we were doing.

We can actually follow the model's logic. For instance, you can in principle de-bias a dataset by building a racial classifier from it. What you need is an objective test for the presence of racial information, and that's easy to obtain: build a classifier that explicitly predicts race from your feature set. Then train an adversarial model to reconstruct your dataset with maximum fidelity, subject to the constraint that race can no longer be predicted from it.
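Concretely, that adversarial setup might look something like the sketch below (PyTorch; the dimensions, the synthetic stand-in data, and the lam trade-off weight are illustrative assumptions, not a reference implementation):

    # Sketch: reconstruct the data faithfully while an adversary tries to
    # predict race from the reconstruction; the autoencoder is penalized
    # whenever the adversary succeeds.
    import torch
    import torch.nn as nn

    d, k, n_races, lam = 32, 16, 4, 1.0      # illustrative sizes and trade-off weight
    X = torch.randn(256, d)                  # stand-in feature matrix
    r = torch.randint(0, n_races, (256,))    # stand-in race labels

    autoenc = nn.Sequential(nn.Linear(d, k), nn.ReLU(), nn.Linear(k, d))
    adversary = nn.Sequential(nn.Linear(d, 32), nn.ReLU(), nn.Linear(32, n_races))
    opt_ae = torch.optim.Adam(autoenc.parameters(), lr=1e-3)
    opt_adv = torch.optim.Adam(adversary.parameters(), lr=1e-3)
    mse, xent = nn.MSELoss(), nn.CrossEntropyLoss()

    for step in range(200):
        # 1) the adversary learns to predict race from the reconstructed data
        opt_adv.zero_grad()
        adv_loss = xent(adversary(autoenc(X).detach()), r)
        adv_loss.backward()
        opt_adv.step()

        # 2) the autoencoder maximizes fidelity, minus a penalty when race is predictable
        opt_ae.zero_grad()
        X_hat = autoenc(X)
        loss = mse(X_hat, X) - lam * xent(adversary(X_hat), r)
        loss.backward()
        opt_ae.step()

The reconstructed X_hat is the "de-biased" dataset, and the success test is exactly the objective one described above: a fresh classifier trained on X_hat should do no better than chance at predicting r.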

> This is basically directly in contradiction to what leading experts on the subject say. ML cannot fix bias in human systems, unless we presuppose that those systems are biased, in which case we can often address the bias in the human systems directly without ML.

These experts are just wrong, then. Naive ML won't fix bias in human systems, but that doesn't mean we can't use ML to fix it, if we do so thoughtfully.

> You can still have decisions be made by objective expert systems without complex ML. If you want to learn someone's IQ, the best way is to debias the IQ test, not to try and infer it from their face bones.

Sure, but there are a lot of things that we don't do in the best possible way because it's too expensive. There are lots of use cases for cheap, scalable, low precision models.

> If we can measure the bias in the output of an ML model, we can equivalently measure the bias in the output of a human system. You're presupposing the existence of some unbiased objective function which we don't have, and that's at the core of the issue.

Right, but we cannot fix the bias in a human. And humans are heterogeneous and inconsistent: the same person may be more or less biased on different days. An ML model is consistent, and we can incrementally reduce its bias in tangible, testable ways. The same is not true of humans.
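To put "tangible and testable" in concrete terms, the model's bias can be reduced to a number and tracked like any other regression. A small sketch, with toy binary decisions and group labels (plain Python, names illustrative):

    # Sketch of a testable bias metric: the demographic parity gap, i.e. the
    # largest difference in positive-decision rate between any two groups.
    from collections import defaultdict

    def demographic_parity_gap(preds, groups):
        totals, positives = defaultdict(int), defaultdict(int)
        for p, g in zip(preds, groups):
            totals[g] += 1
            positives[g] += p
        rates = [positives[g] / totals[g] for g in totals]
        return max(rates) - min(rates)

    preds  = [1, 1, 1, 1, 0, 0]              # toy decisions
    groups = ["a", "a", "a", "b", "b", "b"]  # toy group labels
    print(demographic_parity_gap(preds, groups))  # 1.0 - 0.33... ~= 0.67

    # A release gate could then assert the gap never grows between model versions:
    # assert demographic_parity_gap(preds, groups) <= previous_release_gap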



> de-bias a dataset

And what does this get you? Let's look at a face recognition dataset. What happens when you debias it? Is it still useful? No. Because the faces no longer resemble real faces.

> These experts are just wrong, then

Perhaps, but you aren't making a strong case for that.

> There are lots of use cases for cheap, scalable, low precision models.

That involve facial recognition?

> Right, but we cannot fix the bias in a human

We don't need to. We just need to fix the bias in the system. And we absolutely can incrementally reduce bias in systems that involve humans.


> And what does this get you? Let's look at a face recognition dataset. What happens when you debias it? Is it still useful? No. Because the faces no longer resemble real faces.

Not to you. But you can remove the racial information without destroying all the information that a model can detect.


But when racial information is correlated with the output, decorrelating the data from race destroys the input. This is most obvious with a face dataset, but it is true of anything race-correlated: credit scores, where you live, etc. If you're willing to destroy the training data until it no longer resembles real-world information, you might as well not use it in the first place.

That's what the ethicists say: don't use facial recognition models. Don't work on them. Don't research them. They cannot be both unbiased and useful. And in general, there are few to no uses that are ethical, period.


Well, the ethicists just don't understand the models, then. For instance, there are a bunch of measurements you can take of faces to identify people, if you were doing it manually. Things like pupillary distance, canthal tilt, nose width, etc.

Some of these correlate with race, but only part of the information correlates with race, not all of it. It is, in principle, possible to remove the information that identifies race without destroying the information that identifies the individual. It is true that part of an individual's essential characteristics are their racial characteristics, but it is not true that the only way to identify an individual is through their racial characteristics. For instance, there is no way that I'm aware of to infer race from fingerprints, yet you can absolutely identify a person by their fingerprints.

So the question is: can we extract a facial fingerprint that identifies a person, but not their race? I think the answer is almost certainly yes, and it will come down to clever model design. Essentially it would look like a GAN where the adversarial component is constantly trying to predict race, while the generative component is trying to trick the race classifier without tricking the person-identifier.
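A minimal sketch of that idea, using a gradient-reversal layer as the adversarial coupling rather than a full generator/discriminator pair (PyTorch; the measurement vector, dimensions, and synthetic data are illustrative assumptions):

    # Sketch: learn an embedding from facial measurements such that a
    # person-ID head succeeds while a race head fails. The gradient-reversal
    # layer flips only the gradient flowing back into the encoder, so the
    # race head keeps improving while the encoder learns to hide race.
    import torch
    import torch.nn as nn

    class GradReverse(torch.autograd.Function):
        @staticmethod
        def forward(ctx, x):
            return x.view_as(x)
        @staticmethod
        def backward(ctx, grad_output):
            return -grad_output

    feat_dim, emb_dim, n_people, n_races = 16, 64, 100, 4  # illustrative sizes
    x = torch.randn(512, feat_dim)            # stand-in measurements (pupillary distance, etc.)
    person = torch.randint(0, n_people, (512,))
    race = torch.randint(0, n_races, (512,))

    encoder = nn.Sequential(nn.Linear(feat_dim, emb_dim), nn.ReLU())  # the "facial fingerprint"
    id_head = nn.Linear(emb_dim, n_people)
    race_head = nn.Linear(emb_dim, n_races)
    opt = torch.optim.Adam([*encoder.parameters(), *id_head.parameters(),
                            *race_head.parameters()], lr=1e-3)
    xent = nn.CrossEntropyLoss()

    for step in range(200):
        z = encoder(x)
        loss = xent(id_head(z), person) + xent(race_head(GradReverse.apply(z)), race)
        opt.zero_grad()
        loss.backward()
        opt.step()

Success here is measurable the same way as in the de-biasing example: train a fresh classifier on the frozen embedding and check that it recovers identity but does no better than chance at race.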


> Well, the ethicists just don't understand the models, then. For instance, there are a bunch of measurements you can take of faces to identify people, if you were doing it manually. Things like pupillary distance, canthal tilt, nose width, etc.

Or perhaps they understand that this won't work in practice.


That could be. But as far as I know, it hasn't been attempted yet.



