
Part of the problem is that "it's just the dataset" is being used as an excuse (witness the "just"). It doesn't really matter to possible victims of, say, the use of AI in law enforcement where exactly the problem originates.

I doubt that anybody was accusing the creators of that upsampling model of having intentionally tweaked it to default to white people. In that sense, YLC is attacking a straw man.

Given the publicity such problems have gained in the community, one would expect publishers of any model to verify it doesn't fail with the most obvious examples. Not doing so is negligent at best.
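
For what it's worth, the check being asked for here doesn't need to be elaborate. A minimal sketch, assuming the upsampler is available as a callable and you have a small hand-picked set of diverse test faces; `upsample`, the image paths, the threshold, and the crude mean-RGB "skin tone" proxy are all placeholders, not anything from the actual PULSE code:

    # Rough sketch of a pre-release smoke test: run the model on a small,
    # deliberately diverse set of faces and flag outputs whose overall color
    # drifts far from the input's. Everything named here is a placeholder.
    import numpy as np
    from PIL import Image

    def mean_rgb(img):
        # Crude stand-in for a skin-tone estimate: average RGB over the image.
        return np.asarray(img.convert("RGB"), dtype=float).reshape(-1, 3).mean(axis=0)

    def sanity_check(upsample, test_faces, max_shift=40.0):
        failures = []
        for path in test_faces:
            low_res = Image.open(path)
            output = upsample(low_res)  # hypothetical model call
            shift = np.linalg.norm(mean_rgb(output) - mean_rgb(low_res))
            if shift > max_shift:
                failures.append((path, shift))
        return failures

A check this coarse would miss plenty, but it's the kind of smoke test being suggested above.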

If we lack datasets to train AI that doesn't spectacularly fail any test for racial biases, we lack datasets for anything that could be considered fit for use. At that point it doesn't matter if the model is flawed, or if it's "just" the available data. Such models shouldn't be published, and we should instead invest in either better data, or come up with better methods of training.

(And, as a minor point, his idea that Senegal is representative of "Africa" as a whole is also... let's say "unfortunate")



> Given the publicity such problems have gained in the community, one would expect publishers of any model to verify it doesn't fail with the most obvious examples. Not doing so is negligent at best.

Here's what the authors say about their own work:

> PULSE makes imaginary faces of people who do not exist, which should not be confused for real people. It will not help identify or reconstruct the original image.

Furthermore, this author appears to describe themselves as more of an artist and hobbyist. This isn't someone making some kind of statement about ML research. This is someone playing with computerized art, and the entire social media tech mob dogpiles on his work over what, exactly?

The negligence here is on the part of everyone getting their hackles raised over nothing.


The PULSE paper was done by this team - https://cdn.telanganatoday.com/wp-content/uploads/2020/06/Au...

If they are in here, I hope they don't misconstrue my linking this image. I think they explained themselves very well, and I feel bad that they were thrust into the middle of this controversy.

My point is that while diversity might help, you need a lot more than that to address this problem.


We can say bye-bye to our nice demos and pre-trained models after this debacle. Who's going to risk their ass just to be caught with some unknown bias?


> The negligence here is on the part of everyone getting their hackles raised over nothing.

To be clear no one (or at least no one of note) has their hackles raised over this specific dataset. PULSE is fine for what it is, and no one criticized PULSE for having these results.

It is however a great demonstration for laypeople about how ML models aren't magic and don't always do what you, as a human, would expect. This is true irrespective of the source of that unexpected behavior.

That said, I believe the disclaimer you mention was added only after the recent Twitter discussion.


This all may be true, but there’s an issue when the model produces blue eyes when presented with a blurred African American face. I think much of the controversy would have been defused had the authors addressed this directly and used it as a way to discuss how bias sneaks into our models of the world.

Philosophically I find ML’s tendency to reflect the biases we bring to it very revealing. In some ways it shows us what we’ve built, the underlying biases that we’d rather argue about and ignore. When an algorithm selects longer sentences for black men than white men, we rightly see that as racism. Some say, “use better data, we’ll then be objective!” But I wonder if a better initial reaction is, “wow, look how badly our system has failed that it would produce such a bad dataset.” Never mind that maybe it’s not actually possible to be objective, and that’s the point. Math doesn’t lie; maybe when we make a racist model we’re failing to see the mirror it’s holding up to us.


Yes, they could have picked a classification model with social impact instead of a GAN. GANs are mostly toys for art and image augmentation. The bulk of models are supervised classification.


> (And, as a minor point, his idea that Senegal is representative of "Africa" as a whole is also... let's say "unfortunate")

Senegal was just an example he gave in a tweet. No need to be so petty about every word.


It's not the words, it's the idea that Senegal is representative of darker people and could produce the correct results. It's an indication that he still doesn't understand.


Isn’t “Africa” just as bad, if not worse? Not all black people are Africans or have African heritage. Far from it, in fact.

Also: darker? Darker than what? Are you taking “white” as your baseline?

I mean, if you’re going to throw stones about how “you don’t understand”, you could try having a rational point.


Using Senegal would make people look Senegalese, not "African."

> Not all black people are Africans or have African heritage. Far from it, in fact.

Exactly. That's the "rational" point: the mistake was training the model on data that doesn't match the target population. Nobody is contesting that it could have been trained on something else: it wasn't, and that fact is the problem.

> Also: darker? Darker than what? Are you taking “white” as your baseline?

Darker than the white people the algorithm turns most inputs into. Have you seen the results?


> It doesn't really matter to possible victims of, say, the use of AI in law enforcement where exactly the problem originates.

I cannot believe that so many people fall for this. Journalists, laypeople, even HN users. (OK not that surprised regarding journalists.)

The problem is not racial bias in AI policing. The problem is AI policing! Racial (and any other) bias is trivial to remove - just subtract the mean! But that doesn’t make predictive policing a good idea.
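
For concreteness, a minimal sketch of what "subtract the mean" could look like for a scalar risk score with known group labels; the scores, the group labels, and the premise that this is enough are all made up for illustration:

    # Recenter scores per group so every group ends up with the same average
    # predicted risk. Toy data; this says nothing about whether the system
    # should exist at all.
    import numpy as np

    def recenter(scores, groups):
        scores = np.asarray(scores, dtype=float)
        groups = np.asarray(groups)
        adjusted = scores.copy()
        overall = scores.mean()
        for g in np.unique(groups):
            mask = groups == g
            adjusted[mask] += overall - scores[mask].mean()
        return adjusted

    raw = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
    grp = ["a", "a", "a", "b", "b", "b"]
    print(recenter(raw, grp))  # both groups now average 0.55

Which is exactly the point: the recentered scores are "unbiased" in this narrow statistical sense, and the system is still a terrible idea.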

Imagine this:

> Hello, mister/lady, our 100% unbiased system has automatically determined that you are a potential future criminal. You are under arrest and sentenced to death.

(Edit: and same could be said for almost any situation where you have a bureaucrat making decisions about people’s lives, and you try to automate this decision with AI.)


You say the problem is AI policing, but then give an example of preemptive incarceration.

"Hello, mister/lady, our 100% unbiased system has automatically determined that you have committed XYZ crime" would seem more appropriate.


> (And, as a minor point, his idea that Senegal is representative of "Africa" as a whole is also... let's say "unfortunate")

That is a high bar, expecting people to never voice “unfortunate” ideas on social media.


"It's just the dataset" is a very valid excuse for the model not working. What if they had trained it on cats instead? Would you then expect it to work on humans?

What it isn't an excuse for are the goddamned negligent idiots who tried to use it in law enforcement without thorough testing. It would be akin to a surgeon dipping every sterile sharp instrument in yogurt cultures and forgoing antibiotics before use, to see if good bacteria make infections less likely and could prevent antibiotic-resistant bacteria. Even if the theory is valid and the goal worthwhile, the needless risk-taking shows a callous disregard for human life, especially when done in an utterly half-assed way like that.



