I'm in the field - though not as prominent as Yann (who has been very nice and h...

dcolkitt · on June 23, 2020

> He works for Facebook. He's paid with Facebook money. So why draw this imaginary line between research and production? He is paid to do research that will go into production.

This is a silly standard to uphold. The sizable bulk of American academic researchers are at least partially funded by grants made from the US federal budget.

If you were to enforce your standards consistently, then all of those researchers would be held responsible for any eventual usage of their research by the US federal government.

I really doubt you apply the same standard. So, the criticism mostly seems to be an isolated demand for rigor. You're holding Facebook Research to a different standard than the average university researcher funded by a federal grant.

ncallaway · on June 23, 2020

This seems almost purposefully disingenuous to me.

Yann LeCun isn't receiving a partial research grant from Facebook. He's literally an employee of Facebook. His job title is "VP & Chief AI Scientist" (at least according to LinkedIn).

There's an obvious and clear distinction between an employee and a research grant, and this feels like it's almost wilfully obtuse.

nlpprof · on June 23, 2020

Did you read what I wrote?

I don't think his argument is true. (That is, I do think researchers should keep bias in mind when developing machine learning projects.) (Regardless of their funding sources.)

Because of his employment, this argument is a particularly silly one for him to make.

tinyhouse · on June 23, 2020

Don't have a lot of time to respond now, but will try to do it later. Just a quick note. I agree his comment about engineers need to worry more about bias than researchers is strange. But in my opinion it wasn't the focus of what he was tying to say.

I used "woman researcher" since it was important for the context as people accused him of mansplaining.

vlthr · on June 23, 2020

I agree with all of your points about the diffusion of responsibility that is common in ML, though I think you may not be sensitive enough to the harmful framing being created by the "anti-bias" side.

The original locus of the debate was how the recent face-depixelation paper turned out to depixelate pictures of black faces into ones with white features. That discovery is an interesting and useful showcase for talking about how ML can demonstrate unexpected racial bias, and it should be talked about.

As often happens, the nuances of what exactly this discovery means and what we can learn from it quickly got simplified away. Just hours later, the paper was being showcased as a prime example of unethical and racist research. When LeCun originally commented on this, I took his point to be pretty simple: that for an algorithm trained to depixelate faces, it's no surprise that it fills in the blank with white features because that's just what the FlickFaceHQ dataset looks like. If you had trained it on a majority-black dataset, we would expect the inverse.

That in no way dismisses all of the real concerns people have (and should have!) about bias in ML. But many critics of this paper seem far too willing to catastrophize about how irresponsible and unethical this paper is. LeCun's original point was (as I understand it) that this criticism goes overboard given that the training dataset is an obvious culprit for the observed behavior.

Following his original comment, he has been met with some extremely uncharitable responses. The most circulated example is this tweet (https://twitter.com/timnitGebru/status/1274809417653866496?s...) where a bias-in-ml researcher calls him out without as much as a mention of why he is wrong, or even what he is wrong about. LeCun responds with a 17-tweet thread clarifying his stance, and her response is to claim that educating him is not worth her time (https://twitter.com/timnitGebru/status/1275191341455048704?s...).

The overwhelming attitude there and elsewhere is in support of the attacker. Not of the attacker's arguments - they were never presented - but of the symbolic identity she takes on as the anti-racist fighting the racist old elite.

I apologize if my frustration with their behavior shines through, but it really pains me to see this identity-driven mob mentality take hold in our community. Fixing problems requires talking about them and understanding them, and this really isn't it.

sinity · on June 24, 2020

I think this is relevant: https://twitter.com/AnimaAnandkumar/status/12711371765294161...

Nvidia AI researcher calling out OpenAI's GPT-2 over how GPT-2 is horrible because it's trained on Reddit (except it includes contents of submissions, and I'm not sure if there's no data except Reddit)

Reddit is supposedly not a good source of data to train NLP models because it's... racist? sexist? Like it's even rightist in general...

Anyway; the table looks horrific - why would they include these results? Oh, turns out paper was on bias: https://arxiv.org/pdf/1909.01326.pdf

Anyway; one can toy with GPT-2 large (paper is on medium, so it might be different) at talktotransformer.com

"The woman worked as a ": 2x receptionist, teacher's aide, waitress. Man: waiter, fitness instructor, spot worker, (construction?) engineer. Black man: farm hand, carpenter, carpet installer(?), technician. White man: assistant architect, [carpenter but became a shoemaker], general in the army, blacksmith.

I didn't read the paper, I admit, maybe I'm missing something here. But these tweets look like... person responsible should be fired.

thebladerunner · on June 30, 2020

Very well articulated, thank you!

smsm42 · on June 23, 2020

So, your argument is that you disagree with data being the root of the problem by arguing that data "shapes research directions in a really fundamental way", research is "empirical" (i.e. based on data) and his research can't be isolated from data it'd be used on in production?

Looks to me that you're argumentatively agreeing with Yann.

joshuamorton · on June 23, 2020

Not really, Yann's original claim (which he sort of kind of partially walked back) was that data is the only source of bias [0][1]. He walked that back somewhat to claim that he was being very particular in this case[2], which is perhaps true, but still harmful. The right thing to do when you make a mistake is apologize. Not double down and badly re-explain what other experts have been telling you back at them.

So then Yann notes that generic models don't have bias[3]. This is, probably, true. I'd be surprised if on the whole, "CNNs" encoded racial bias. But the specific networks we use, say ResNet, which are optimized to perform well on biased datasets, may themselves encode bias in the model architecture[4]. That is, the models that perform best on a biased dataset may themselves be architecturally biased. In fact, we'd sort of expect it.

And that all ignores one of the major issues which Yann entirely skips, but which Timnit covers in some of her work: training on data, even "representative data" encodes the biases that are present in the world today.

You see this come up often with questions about tools like "crime predictors based on faces". In that context it's blatantly obvious that no, what the model learns will not be how criminal someone is, but how they are treated by the justice system today. Those two things might be somewhat correlated, but they're not causally related, and so trying to predict one from the other is a fool's errand and a dangerous fool's errand since the model will serve to encode existing biases behind a facade of legitimacy.

Yann doesn't ever respond to that criticism, seemingly because he hasn't taken the time to actually look at the research in this area.

So insofar as data is the root of the problem, yes. Insofar as the solution is to just use more representative data in the same systems, no. That doesn't fix things. You have to go further and use different systems or even ask different questions (or rule out certain questions as too fraught with problems to be able to ask).

[0]: https://twitter.com/ylecun/status/1203211859366576128

[1]: https://twitter.com/ylecun/status/1274782757907030016

[2]: https://twitter.com/ylecun/status/1275162732166361088

[3]: https://twitter.com/ylecun/status/1275167319157870592

[4]: https://twitter.com/hardmaru/status/1275214381509300224. This actually goes a bit further, suggesting that as a leader in the field one has a responsibility to encourage ethics as part of the decision making process in how/what we research, but let's leave that aside.

zozbot234 · on June 23, 2020

> Yann doesn't ever respond to that criticism, seemingly because he hasn't taken the time to actually look at the research in this area.

No, that's still a problem with data in a broader sense. The issue is that "how X will be treated by the justice system" is not modeled by the data, so there's no possible pathway for a ML model to become aware of it as something separate from "crime". People who ignore this are expecting ML to do things it cannot possibly do - and that's not even a fact about "bias"; it's a fact about the fundamentals of any data-based inquiry whatsoever.

joshuamorton · on June 23, 2020

I hope you read to the end of my post where I address that:

> So insofar as data is the root of the problem, yes. Insofar as the solution is to just use more representative data in the same systems, no. That doesn't fix things.

Ultimately Yann's proposals are still to use "better data" whereas all the ethics people are (and have been) screaming no, you can't use better data because it doesn't exist. He doesn't acknowledge that.

And the hairs Yann is trying to split here are ultimately irrelevant[1] and probably harmful[2]. And as someone with a large platform, addressing those issues in a straightforward way is far, far superior to trying to split those hairs over twitter.

From a meta perspective, his tweetstorm didn't add anything to the conversation that Dr. Gebru and her collaborators aren't already aware of. Nor did Yann's overall take away help to inform the average twitter user on these issues. In fact, they're more likely to take away the opposite conclusion: that with good enough data we can ask these questions in a fair way.

But as you rightly conclude there are flaws in any data based inquiry. Yann doesn't concede that.

[1]: https://twitter.com/isbellHFh/status/1275184863159685121

[2]: https://twitter.com/hardmaru/status/1275088134238162944

zozbot234 · on June 23, 2020

I'm not sure that Yann was trying to split hairs there. He was reasoning about the issue from first principles (e.g. the problem-domain vs. architecture vs. data distinction) and then failing to carry his reasoning thru to the reasonable conclusion that you mention re: the inherent flaws of any data-based modeling. Criticizing his take wrt. these issues is constructive; being careless about what his actual views are is not.

smsm42 · on June 26, 2020

> Those two things might be somewhat correlated, but they're not causally related,

That's kinda bold claim. Are you arguing that current justice system just picks up people at random, and assigns them crimes at random, with no correlation with their actions? I mean, not some bias towards here or there, but no causal relationship between person's actions and justice system's reactions at all? That's... bold.

But if this is the case, then the whole discussion is pointless. If justice system is not related to people's action then there's no possible improvement to it, since if the actions are not present as an input, then no change in the models would change anything - you can change how exactly random it is, but you can't change the basic fact it is random. What's the point of discussing any change at all?

> Insofar as the solution is to just use more representative data in the same systems, no.

If by "same systems" you mean systems pre-trained on biased data, then of course adding representative data won't fix them. And of course if the choice of model is done on the basis of biased data then this choice propagates the bias of the data, so it should be accounted for. But I still don't see where the disagreement is, and yet less basis for claims like "harmful".

joshuamorton · on June 26, 2020

> I mean, not some bias towards here or there, but no causal relationship between person's actions and justice system's reactions at all?

It depends on what you mean by causal. Does criminal behavior cause interactions with the justice system? Yes. But not engaging in criminal behavior doesn't prevent interactions with the justice system (for specific vulnerable subpopulations). So would you say that ReLU shows a causal relationship between criminality on the X axis and how the justice system treats you on the Y? I don't think I would.

In some sense btw this is what Timnit's "Gender Shades" paper looks at, which is that even if a classifier is "good" in general, it can be terrible on specific subpopulations. Similarly, even if there is a causal relationship across the entire population, that relationship may not be causal on specific subpopulations.

And of course, that ignores broader problems around our justice system being constructed to cause recidivism in certain cases. In such situations, interactions with the justice system cause criminal behavior later on. So clearly, in general since Y is causal on X, X can't be causal on Y.

> But if this is the case, then the whole discussion is pointless.

No! Because people trust computers more than they trust people. Computers have a veil of legitimacy and impartiality that people do not. (no really, there's a few studies that show that people will trust machines more than people in similar circumstances). Adding legitimacy through a fake impartiality to a broken system is bad because it raises the activation energy to reform the system.

At it's core, that's probably the biggest issue that Yann is missing. Even in cases where an AI model can perfectly recreate the existing biases we have in society and do no worse, we've still made things worse by further entrenching those biases.

> But I still don't see where the disagreement is, and yet less basis for claims like "harmful".

So I think an important precursor question here is if you believe the pursuit of truth for truth's sake is worthwhile, even when you have reason to believe the pursuit of truth will cause net harm? Imagine you have a magic 8 ball that when given a question about the universe will tell you whether or not your pursuit of the answer to that question will ultimately be good or bad (in your ethical framework, it's a very fancy 8-ball). It doesn't tell you what the answer is, or even if you'll be able to find the answer, only what the impact of your epistemological endeavor will be on the wider world.

If, given a negative outcome, you'd still pursue the question, I don't think we have common ground here. But assuming you don't agree that knowledge is valuable for knowledges sake, and instead that it's only valuable for the good it has on society, we have common ground.

In that case, you have an ethical obligation to consider how your research may be used. If you build a model, even an impossibly fair one, to do something, and it's put in the hands of biased users, that will harm people. This is very similar to the common research ethics question of asking how your research will be used. But applied ML (even research-y applied ML) is in a weird space because applied ML is all about, at a meta level, taking observations about the world, training a box on those observations, and then sticking that box into the world where it will now influence things, so you have effects on both ends, how the box is trained and how the box will influence.

Like, in many contexts "representative" or "fair" is contextual. Or at least the tradeoffs between cost and representativity make it contextual. Yann rightly notes that the same model trained on "representative" datasets in Senegal and the US will behave differently. So how do you define "representative"? How do you, as a researcher, even know that the model architecture you come up with that performs well on a representative US dataset will perform equally well on a representative Senegalese dataset (remember how we agreed that model architecture itself could encode certain biases)? Will it be fair if you use the pretrained US model but tune it on Senegalese data, or will Senegalese users need to retrain from scratch, while European users could tune?

Data engineers will of course need to make the decisions on a per-case basis, but they're less familiar with the model and its peculiarities than the model architects are, so how can the data engineers hope to make the right decisions without guidance? This is where "Model Cards for Model Reporting" comes in. And in some cases this goes further to "well we can't really see ethical uses for this tool, so we'll limit research in this direction" which can be seen in some circles of the CV community at the moment, especially w.r.t. facial recognition and the unavoidable issues of police, state, and discriminatory uses that will continue to embed existing societal biases.

And as a semi aside statements like this[0] read as incredibly condescending, which doesn't help.

[0]: https://twitter.com/ylecun/status/1275162732166361088

smsm42 · on June 26, 2020

> It depends on what you mean by causal.

I mean P(being in justice system|being actual criminal) > P(being in justice system), and substrantially so. Moreover, P(being criminal|being in justice system) > P(being criminal). In plain words, if you sit in jail, you're substantially more like to be an actual criminal than a random person on the street, and if you're a criminal, you're substantially more like to end up in jail than a random person on the street. That's what I see as causal relationship. Of course it's not binary - not every criminal ends up in jail, and innocent people do. But the system is very substantially biased towards punishing criminals, thus establishing causal relationship.

There are some caveats to this, as our justice system defines some things that definitely should not be a crime (like consuming substances the goverment does not approve of for random reasons) as a crime. But I think the above conslusion still holds regardless of this, even though becoming somewhat weaker if you not call such people criminals. It is, of course, dependant on societal norms, but no data models would change those.

> If you build a model, even an impossibly fair one, to do something, and it's put in the hands of biased users, that will harm people.

That is certainly possible. But if you build a shovel, somebody might use it to hit other person over the head. You can't prevent misuse of any technology. According to the Bible, the first murder happened in the first generation of people that were born - and while not many believe in this as literal truth now, there's a valid point here. People are inherently capable of evil, and denying technology won't help it. You can't make the word better by suppressing all research that can be abused (i.e. all research at all). You can mitigate potential abuse, of course, but I don't think "never use models because they could be biased and abused" is a good answer. "Know how models can be biased and explicitly account for that in the decisions" would be better one.

> his[0] read as incredibly condescending, which doesn't help.

Didn't read condescending to me. Maybe I do miss some context but it looks like he's saying he's not making generic claim but only a specific claim about a very specific narrow situation. Mixing these two is all too common nowdays - somebody claims "X can be Y if conditions A and B are true" and people start reading it as "all X are always Y" and make far-reaching conclusions from it and jump into personal shaming campaign.