
This is fundamentally a forest/trees type of situation. LeCun sees the issue with this one model, says, "If this had been trained on a different sample it wouldn't have this issue," and stops his train of reasoning there. The problem he is seeing is the mis-trained model, and nothing more.

But the problem is larger than this single model, because this issue (or similar ones) is pervasive in the fields in which AI is being employed. If a neural net is helping a court hand down sentences, it is going to be trained on historical sentencing data, and will in turn reflect the biases present in that data. If you are still only seeing the one tree, you say "well, we must correct for the historical bias," and absolve yourself of thinking about the larger problem. The forest problem is that we will always be feeding these algorithms biased inputs unless we do the work to understand social biases and attempt to rectify them.



It's not the job of the AI researcher to solve "social biases" in every field, it's their job to build the AI. LeCun is right to focus on his task and correct it, not just start talking about structural issues into which he cannot have insight anyway.

PS "Do the work" is a creepy phrase that is popping up everywhere in SocJus. I recommend describing the "work" that needs "doing" instead of just saying "the work" need be "done".


> It's not the job of the AI researcher to solve "social biases" in every field, it's their job to build the AI.

No, this is only technically correct but actually wrong. In the example, they did in fact fail to build the AI, as is their job. "Recognize white faces" is a lame research goal, "recognize human faces" is the real thing. So if somebody builds an AI system that fails on out-of-sample data, then says that had they tried to do it properly they would have succeeded, that's a pretty lame excuse for a poor AI system. They didn't even do their job in the narrow sense that you're using; you don't even need to consider the "social biases" or whatever, it's just a system that didn't work. In fact, many years into this research program, "focussing on his task and correcting it" (making it work on all human faces) is still not done, given the current performance of these systems, yet they are quite sure it could be done if they tried.


> No, this is only technically correct but actually wrong. In the example, they did in fact fail to build the AI, as is their job. "Recognize white faces" is a lame research goal, "recognize human faces" is the real thing.

The researcher's job is to make progress towards the research goal. Progress that doesn't solve the problem is still progress. Nobody's saying that the PULSE authors have solved their research goal of generating white faces. It's why papers have a "discussion and future work" section, which already touched on the issue in the first version of their paper. It's why, in their revised paper, they added a full discussion of bias and the issues in the current model. They did their job but didn't solve the whole problem, which is too high a bar for any researcher. Science is incremental.

That's how I interpreted LeCun's distinction between the researcher's job and the product builder's job.


Imo these are researchers. Their job was to validate their algorithm for generative super-resolution on a dataset; they chose the largest and most well-known one, and it worked reasonably well on it. The model itself is not production ready, for at least the reason that the dataset is not representative. This is ubiquitous in ML papers: they validate their idea on not completely realistic but widely available datasets. The outcome is a piece of knowledge about the behavior on that dataset, not a product.


The point that many are making is that this is a myopic view of research. Why is the generic goal to optimize against some particular dataset? That ends up being a narrow and unhelpful goal.

As someone put it on Twitter: we should be rethinking the meta-learning algorithm of the ML academic sphere, and a leader like Yann is the kind of person who should be spearheading that.


It is true that better datasets are sorely needed, but what would you suggest Yann and other researchers do about it in this situation?


Proactively doing the things that the ethicists are calling for. Use model cards for every model. Actively think about and discuss the limitations of the model, its intended uses, etc. That goes for everyone.
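To make the model card point concrete: it can be as lightweight as a structured description published alongside the weights. A minimal sketch in Python; the field names and example values are illustrative, not any official schema:

    # A minimal, illustrative model card: a plain data structure shipped with a
    # model so downstream users can see its intended use and known limitations.
    # Field names and example values are assumptions for illustration only.
    from dataclasses import dataclass, field

    @dataclass
    class ModelCard:
        name: str
        intended_use: str
        training_data: str
        known_limitations: list = field(default_factory=list)

    card = ModelCard(
        name="face-upsampler-demo",  # hypothetical model name
        intended_use="Research demo for generative super-resolution; "
                     "not for identifying real individuals.",
        training_data="Generator pretrained on a face dataset that "
                      "over-represents light-skinned faces.",
        known_limitations=[
            "Outputs skew toward the training distribution.",
            "Reconstructions are imagined, not recovered.",
        ],
    )
    print(card.known_limitations)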

For someone like Yann specifically? Publicly state that the ML scene is optimizing in a myopic way, and invest in doing so less myopically. For example, I think the translation space has a clear goal and the right goal and is making strides in improving language models in many useful ways.

Ultimately, if there aren't ethical and useful ways to apply facial recognition, leaders should be steering people away from those research topics.


This is like saying it's not a civil engineer's job to solve the geography's faults. If you can't build a safe and balanced bridge, then don't.

It has real world consequences that can drastically damage communities.


Another example...

The current nightmare that is cyber security is caused by developers who do not understand that with great power comes great responsibility.

The culture of software “engineering” and development is fundamentally broken and based entirely on “it’s not my problem if someone else gets hurt, I just build things.”

There’s literally not a single other industry where this level of greed and willful neglect is acceptable.

Of course, this would be reflected in the AI community as well.


That shows a fundamental lack of understanding of the fields of software engineering and cyber security. Both terms are individually overloaded such that many issues aren't even within their domains, let alone actually handled by developers! No matter how good the lock is, it does no good when the key is left on /top of the doormat/ by the user.

Blaming the greed of engineers for security problems? Seriously?! That is like railing against EMTs as being responsible for the coronavirus because they wanted to get rich without working hard. It conflates so many different areas and roles that it isn't even coherent logic; it's nonsensical.


This is exactly the attitude to which I’m referring.

If you don’t understand that engineering is the fundamental bedrock of IT, and that the lack of attention to security during the engineering SDLC is a consistent failure, then I don’t know what to tell you, except maybe to gain more experience in software engineering and read more about data breaches.

Cyber security: https://en.m.wikipedia.org/wiki/Computer_security

But for argument’s sake let’s just stick to network, application, mobile and IoT security.

- Why is MFA not on by default?

- How many devs have prod credentials published to Github.com right now?

- How many unsecured IoT devices are there?

- Why is email security such a dumpster fire?

- How many companies have an SSDLC?

- How many companies require separation of duties and approvals before a dev publishes another AWS bucket or some other unprotected data store to the web?

Go ahead and blame business, but they don’t know jack about software engineering. We determine which corners to cut, as opposed to stating to business that security is just a part of doing engineering. We’re the ones who decide, and thus it’s our responsibility, despite the denials of people like yourself.


More like it's not a concrete engineer's job to find a concrete mix that fits every possible use case.


A civil engineer should be able to do that (or decline to do it if they cannot), and that's why it is their job, and not the geologist's, to build the bridge. Research != engineering.


It's not the job of the nuclear weapon researcher to solve global war, it's their job to build the nuclear weapon.

Abstracting away the ethical issues of your (elite) employment is a personal choice that is anything but objectively neutral.


> Abstracting away the ethical issues of your (elite) employment is a personal choice that is anything but objectively neutral.

Does this apply to doctors taking care of a murderer, rapist, [insert felony here]?

Should this apply to the doctor of Yann LeCun?

Should these "personal choices" be valid, allowing the doctors to judge YOU, negating a cure?

You can extend 'doctor' to other professions too, like journalists.

PS: all those professions are required to be objectively neutral.


Different professions can adopt different standard ethical frameworks. Healthcare is relatively unique in the code of ethics it chooses.


Nitpicking: If they build the weapons, they aren’t the researchers but the engineers.


I agree with everything you say, but it might be worth considering the "social" impact and aspects of AI models that are touted to change the world and are being rapidly adopted into our everyday lives with or without our consent. I can understand that engineers are not trained to consider those aspects, but leaving them entirely to non-creators or external bureaucrats might also not be the best strategy, since they hardly understand the systems as well as engineers do. I am not sure what the best strategy for this co-existence is.


If Bagger 288 tries to drive across a footbridge, that's not really the engineer's fault.

That said, while everyone intuitively understands the load-bearing capacity of a footbridge, not many understand the capabilities of ML models. So perhaps better communicating the capabilities of AI would help inform decisions.

On the other hand, nothing an engineer or scientist can do will stop the Chinese government from using their technology to predict and suppress dissidents and minorities.


> On the other hand, nothing an engineer or scientist can do will stop the Chinese government

OT, but shall we stop throwing in casual references to China as the example of everything bad that can happen?


I'm not sure if we can or can't, but in this case it's not just generic "China bad", but an actually pretty relevant example, because of the CCP using computer vision advancements for the purpose of rounding up Uyghurs.


Here we're talking about AI biases though. Such as the possibility of an AI giving a bad score to a black loan applicant because of implicit biases in the training.

On the other hand, China is in a conflict with the Uighur population of Xinjiang. As far as I understand it, there are elements both of cultural clash (China doesn't like religion in general, and the Uighurs are Muslim) and of the Uighurs' reaction to an influx of ethnically Chinese population in the region. In any case, some Uighurs have engaged in terrorism; this BBC article seems pretty balanced and lists a number of terror attacks by Uighurs as well as the repressive actions by China: https://www.bbc.com/news/world-asia-china-26414014

In this context, I think that China might be using AI not to "round up" Uighurs, but as an intelligence measure to prevent more terror attacks. Similar to how US intelligence is (I have no doubts about it) profiling Muslims and Middle Eastern immigrants: not because it has anything against those groups per se, but because it has reasons to believe terrorists might hide in their ranks.


All possibly true, I guess, or at least I'm not an expert and won't debate you on this. Still, the discussion above was more about the potential uses of invented tech, and in that regard it doesn't matter if it's only rumored to be used wrong and the rumors have something like 10% chance of being true. The ethical dilemma is the same. It's there from the moment you can imagine it realistically happening.


> in that regard it doesn't matter if it's only rumored to be used wrong and the rumors have something like 10% chance of being true

I agree, but then it's telling how, in an abstract argument about the potential misuse of technology, it seems natural to throw in an offhand accusation against a specific country. I am pretty worried by how quickly China has become the new boogeyman: everyone thinks it's perfectly reasonable to display anger towards a huge country that only a few years ago was seen, despite its obvious issues and shortcomings, as successful and dynamic.

And I remember how it started: when a mainstream, trusted news outlet reported on the existence of "spy chips" in hardware sold by Chinese companies to the US, "according to extensive interviews with government and corporate sources", which later turned out to be fake news. It gives pause for thought.

https://www.bloomberg.com/news/features/2018-10-04/the-big-h...


> China might be using AI not to "round up" Uighurs, but as an intelligence measure to prevent more terror attacks.

That is exactly what I claimed they were potentially using it for.

I acknowledge that I am very much biased against China, as I live in a country within its sphere of influence and have friends in HK.

But in any discussion of using AI for unethical purposes, China is the ur-example, as Nazi Germany is to fascism: an authoritarian government with a history of tech surveillance, censorship and media control, and minority oppression, and the tech to back it up.

If they don't like that, maybe they should stop using technology to oppress their own citizens.


Yes, LeCun is right to focus on his tasks as a researcher in research labs, papers, and conferences. However, he is also a leader of the field when speaking to the outside world, which is what happens on Twitter in heated arguments.

I think he has some responsibility to at least acknowledge the complexity of this issue in such cases. Not speaking to the public is also an option if he doesn't like the leader role, so him deleting his Twitter account is a completely OK thing in my eyes.


All sides agree there is an issue around fair implementations of ML algos in the real world. There are two ways this can happen:

1) The model uses biased data when it could have used unbiased data.

2) The model uses biased data when no unbiased data exists and is incapable of correcting for these discrepancies.

The first case is most clearly an engineering/implementation issue; the second is obviously not. Biased data is a known failure mode of ML; it's the responsibility of researchers to design for it.
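For what it's worth, a standard partial mitigation when only skewed data exists is to reweight samples so each group contributes equally to the training loss. A rough Python sketch; the group labels and toy dataset are made up for illustration:

    # Weight each sample inversely to its group's frequency so groups
    # contribute equally to the objective. Labels here are illustrative.
    import numpy as np

    def balanced_sample_weights(group_labels):
        """Per-sample weights inversely proportional to group frequency."""
        groups, counts = np.unique(group_labels, return_counts=True)
        freq = dict(zip(groups, counts / len(group_labels)))
        return np.array([1.0 / (len(groups) * freq[g]) for g in group_labels])

    # Toy dataset: 90 samples from group "A", 10 from group "B".
    labels = np.array(["A"] * 90 + ["B"] * 10)
    weights = balanced_sample_weights(labels)
    print(weights[0], weights[-1])  # group-B samples get 9x the weight of group-A samples

Most training APIs accept per-sample weights like these; it stops the majority group from dominating the objective, but it can't recover what the dataset never captured.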


> It's not the job of the AI researcher to solve "social biases" in every field, it's their job to build the AI.

"'Once the rockets are up, who cares where they come down? That's not my department' says Wernher von Braun"


Tim Berners-Lee should have stopped to consider the potential for dark web criminality before he developed the web.


If one is working on AI dealing with facial recognition and is oblivious to the potential for bias, and the unethical applications of that technology, at this point in the game, I can only assume it's willful.


> larger problem

For a scientist doing an ML system to reconstruct pixelated faces, trained with white faces, why is he/she responsible for "insert larger problem" outside of her/his field?

Does he/she also have to care about the brutal Tantalum Wars because there is some of it in the electronics they use?

As far as I can see, ML is pretty new, and there is a lot of room for improvement. And I think people need to stop thinking the entire world is racist, or doesn't care, or doesn't want to improve things. Changes take time, and won't happen today, or tomorrow, or in the next decade.

There is a constant, furious need to point fingers, followed by fear of saying or not saying just the right words in the right order to the right audience. This is not how we are going to solve anything.


>For a scientist doing an ML system to reconstruct pixelated faces, trained with white faces, why is he/she responsible for "insert larger problem" outside of her/his field?

Because we're all responsible for how the tools we build are used, what they enable, and how they affect society at large. That's what ethics is about, something which seems to be absent from the education of the modern citizen, and in particular of the engineer or scientist, who is supposed to come to work, program things, and not think too much about the impact their products have on the world.


You know, I'm fine with that attitude if it's combined with actual humility. As often as not, however, these high-profile guys are shooting off on Twitter about every issue under the sun. But when it comes to racism, suddenly they're just narrow technicians and it's nothing to do with them.


I don't know if it is about racism. I think it's normal to become defensive when accused. If I say "your ML model is garbage", I am sure they will (maybe, if I am high-profile too) provide an explanation of why or why not.


This long twitter thread about racism isn't about racism?


Sorry, I expressed it wrong. I wanted to say that when someone is accused of something, including of being a racist or something related to racism, it's normal to be defensive about it and take that posture.

I don't think the researcher was being defensive just because it was about racism.


I think the key issues are primarily related to race.

Many see these systems as a way to implement stop-and-frisk (with quite racist outcomes) and worse across the country, using AI/ML as cover. The higher error rate among darker-skinned people gives law enforcement a new excuse to harass innocent, historically disenfranchised people.

I expect this tech will be widely used long before the accuracy problem that affects ~60% of the global population is fixed.


Reconstructing pixelated white faces is a lame research goal. If they think it would work on all faces, then they should show that: it's science, and it's their job to demonstrate that their approach works. Why bring "larger problems" into this when the problem is a "small problem": they built a system that doesn't quite work, but they published it anyway.

By the way, ML is not nearly as new as you seem to think, and given the amount of resources poured into it by FAANG-type companies recently, even five years is a lot of time for ML nowadays.


> Reconstructing pixelated white faces is a lame research goal

I'm sorry but you can't know this. Maybe it was the real objective, or a first step for something bigger, or a drunk Saturday night project.

In any case it is research, and it is very valid. Unless you are the chief at their lab, I think you don't get to tell them what to do or not to do, and/or the scope of their job. Anything else is conjecture.

> By the way, ML is not nearly as new as you seem to think

Well, the results are there, together with the polemic it generates TODAY about white and black faces. You tell me whether ML and its adoption, generally accepted or not, is mature yet.


All input is biased. It is not our job as scientists to remove biases from a data set. It is our job to reduce bias and characterize bias, but it's impossible to have a data set that has zero bias. You cannot sample the whole universe.
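A concrete version of "characterize bias": report how the dataset's composition differs from a reference population instead of claiming the bias is gone. The group names and shares below are made-up numbers for illustration:

    # Sketch of characterizing (not eliminating) dataset bias: compare dataset
    # composition to a reference population. All numbers here are illustrative.
    dataset_shares = {"group_a": 0.83, "group_b": 0.09, "group_c": 0.08}
    reference_shares = {"group_a": 0.60, "group_b": 0.18, "group_c": 0.22}

    # Total variation distance: half the sum of absolute share differences.
    tvd = 0.5 * sum(abs(dataset_shares[g] - reference_shares[g]) for g in reference_shares)
    for g in reference_shares:
        print(f"{g}: dataset {dataset_shares[g]:.0%} vs reference {reference_shares[g]:.0%}")
    print(f"total variation distance: {tvd:.2f}")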


Sure, but using the Duke paper example to illustrate that is clearly dumb. The toy AI network was designed that way. Yann was right to call it out.


This issue reminds me of the IQ testing and measurement problem of regression bias, which likewise arises from statistical artifacts from decades past (when tests were administered to much more narrow and self-selecting groups of subjects than today).

The means calculated using a much less diverse series of historical cohorts are still being fed back into present IQ calculations, which thus retain a persistent corresponding "echo" of the biases in favor of those early test results.

The echo is gradually fading, but some say a clean reboot of the baselines is the right way to resolve historical sample biases which still skew IQ testing away from an accurate modeling of the diversity in present test populations.


Why does diversity matter here? They are just feeding the scores into the model, right?


The whole thing is an illustration of what a mess Twitter is. Sorting out what LeCun's overall message is seems next to impossible given the patchwork of ten-line messages from who knows who.

But I just wanted to say that in the instance of a neural net that grants bail or not, there's an issue beyond either biased data or the neural net reproducing previously biased opinions.

The modern notion of fairness implies that the individual be judged based on their personal merits rather than on things roughly correlated with their surface characteristics. Being black is correlated with being poor, and being poor is correlated with being a criminal and with various other bad behaviors. But that doesn't mean it's fair to punish a given individual, who is only responsible for themselves, for such surface characteristics.

Which is to say that if an AI crunches the numbers in an objective fashion with the aim of making decisions based on various correlations, that can be fundamentally problematic regardless of the bias of the original data or people.


[flagged]


> Which is to say,...

Since there is no point of comparison, this is not a valid statistical conclusion and is just a racist framing.


What do you mean? "If blacks were criminal at the rate predicted by their poverty rate, they would be much, much less criminal than they actually are" is a pure empirical observation.

The logical point of comparison would be the nonblack population, but the conclusion is just as true for the entire population.


This is because it's not just a poverty problem, but also a class problem, with a healthy dose of racism mixed in to keep the status quo.

Lower class people are excluded from many (and sometime most) parts of society through harassment, violence, ostracism, or even through legal means (India's untouchables are a good example, or Japan's burakumin). As a result, they stop identifying with anyone outside of their class, and not only don't care what violence and misfortune befalls the other class, but are much more willing to contribute to it, given a good enough opportunity.

When you blend poverty and classism, you have the perfect mix for despair, because 99% of people born into this state have literally no hope of ever escaping it, which makes the criminal route a MUCH more attractive option, since virtually every legal means is barred to you by your class, and the few that remain are such long shots that you'd probably waste your whole life chasing them (unless you get lucky).

I lived in the USA for many years and saw it quite clearly: the rage, the despair, the hatred. And I really can't say I blame the downtrodden for their reaction to this kind of systemic injustice, especially since they are treated this way the moment someone sees the color of their skin (the Irish and Italians of old could at least LOOK like the upper class with enough effort). How can someone who is identified as low class on-sight be anything but angry and frustrated?


But why is criminality comparatively higher and lower between different racial groups of the same class?

Edit: I know HN guidelines say not to complain about moderation (I've been here for 13 years now), but it's ridiculous a very simple, on-topic question should be downmodded into oblivion on a site whose purpose is discussion. While I have karma to burn and will happily keep participating, I'm concerned that this trend will have a chilling effect on people who do not.


Economic class and social class are different things, although one can affect the other to a degree.

You observe similar criminality trends in African immigrants to Japan compared to the other immigrant groups (including Europeans and Americans), for example. The only exception is the Brazilians, who were brought in as low class workers, have a MUCH higher crime rate, have trouble renting outside of low-income areas, can't get high class jobs, and are generally treated like criminals on-sight.

Germany has a similar problem with their Turkish population.

And it's not restricted to skin color, either. Low class culture takes generations to clean out. I've had a number of dealings with shady descendants of American-Italian immigrants, many in high level financial market jobs. This is fallout from the exclusion of Italian immigrants from the higher classes in the early 20th century, which led to an undercurrent of criminality in their culture (not to mention their good positioning during alcohol prohibition, gambling prohibition, and later drug prohibition).

Cultural currents take multiple generations to steer, which leaves one with many chickens coming home to roost after treating a class of people like shit for so long.


> This is fallout from the exclusion of Italian immigrants from the higher classes in the early 20th century

I am Italian and I can assure you that those who remained in Italy are not different. A large part of the Italian immigration to the US came from the south, where the different types of mafia are only the official and extreme degree of a widespread culture of corruption, shady dealings and reciprocal abuse.


Social status/class and your economic status heavily overlap, but are not the same. The question gets partly answered just by specifying economic status/class vs just class.


Welcome to modern social justice. Calling a non-racist a racist apparently fixes racism.


The court example was good. Trying to make a list of possible domains where social bias is a factor:

- court risk assessments

- loan risk assessment

- job application pre-screening

- law enforcement face recognition

- medical scan interpretation

...

So, in essence: justice, banking, jobs, policing and health care. Anything else? Seems like social bias can only affect a small proportion of the ML application domains.


The classic example (a multi-decade problem repeated again, and again, and again...) is face detection. Face detection models continue to get trained by primarily white developers/researchers who continue to use primarily white people as their training set. And then they're shocked when their model performs terribly for everyone who isn't of European descent.

This case is so pervasive that it gets taught in basically every university that teaches ML - and yet those students go on to repeat the exact problem they were educated against.
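And the first step against that repeated failure is simply measuring it: break the aggregate accuracy down by group instead of reporting one number. A small sketch with hypothetical group names and results:

    # Sketch of per-group evaluation: an aggregate accuracy can look fine while
    # errors concentrate in an under-represented group. Data here is hypothetical.
    from collections import defaultdict

    def per_group_error_rate(records):
        """records: iterable of (group, predicted_correctly) pairs."""
        totals, errors = defaultdict(int), defaultdict(int)
        for group, correct in records:
            totals[group] += 1
            errors[group] += 0 if correct else 1
        return {g: errors[g] / totals[g] for g in totals}

    results = [("lighter_skin", True)] * 95 + [("lighter_skin", False)] * 5 \
            + [("darker_skin", True)] * 14 + [("darker_skin", False)] * 6
    print(per_group_error_rate(results))  # {'lighter_skin': 0.05, 'darker_skin': 0.3}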


> Face detection models continue to get trained by primarily white developers/researchers who continue to use primarily white people as their training set.

I guess you haven't watched Asian developments in face recognition closely over the last few years.


Let’s pause all ML research until we have 50% black representation!


Does the face recognition tech in China have big problems with European faces?



It's even worse than that.

Assume you want to train an AI to recognize shops or buildings, for example for a Google car.

Well, if you do it in the US with skyscrapers, in Europe, in Africa, in the Middle East or in Asia, you will get completely different results and biases.

Also, I don't see how anyone has the resources to compensate for such a social bias, unless they plan to do a shooting trip to hard-to-access locations and try to justify it as "Seriously officer, the reason I take all those photos is to make sure my machine learning model is unbiased, so I don't classify shops in your country as shacks."

And if we forget human activity, even looking at nature, the fauna and flora are different between areas: their color, how sparse they are, etc.

Last example: applying sparsity to human activity, what is actually a town in some country might be classified as a settlement, for example, due to bias.


In this case, ironically, it seems that an AI could incorrectly depixelate faces of suspects caught by surveillance cameras as white instead of black, which would lead to a bias against white people.


Please take a look at the PULSE model card here:

https://thegradient.pub/content/images/2020/06/image.png

PULSE is published as an art project, not suitable for face recognition or upscaling. It can only generate 'imaginary faces'.

They didn't even train the GAN they were using; it was borrowed from another paper. They probably used StyleGAN because it was a nice, high-quality generator; they invented a novel way to use GANs and needed a toy model to showcase their algorithm.
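For context, that "novel way to use GANs" amounts to searching the pretrained generator's latent space for a face that downsamples to the input, rather than training an upscaler. A simplified conceptual sketch; `generator` is a placeholder, and the real method constrains the latent more carefully than this:

    # Conceptual sketch of PULSE-style upscaling: optimize a latent so that the
    # generated high-res face, once downsampled, matches the low-res input.
    # `generator` is a placeholder for a pretrained face GAN (e.g. StyleGAN).
    import torch
    import torch.nn.functional as F

    def pulse_like_search(low_res, generator, latent_dim=512, steps=200, lr=0.1):
        z = torch.randn(1, latent_dim, requires_grad=True)
        opt = torch.optim.Adam([z], lr=lr)
        for _ in range(steps):
            opt.zero_grad()
            high_res = generator(z)                      # an imagined face
            down = F.interpolate(high_res, size=low_res.shape[-2:],
                                 mode="bilinear", align_corners=False)
            loss = F.mse_loss(down, low_res)             # match only the low-res evidence
            loss.backward()
            opt.step()
        return generator(z).detach()

Any latent consistent with the low-res input satisfies that loss, so the output is whichever face the generator's training distribution finds most plausible, which is exactly why the results inherit the dataset's skew.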


Agreed. I was only pointing out that even bias is a matter of bias: if white faces had been depixelated to black faces instead of the other way around, the authors could have still been accused of a racist bias (because of the hypothetical scenario I've described above).


This is such a false binary. Of course we must correct for historical bias. Of course we must also proactively look for biases.





