So, your argument is that you disagree with data being the root of the problem b...

joshuamorton · on June 23, 2020

Not really, Yann's original claim (which he sort of kind of partially walked back) was that data is the only source of bias [0][1]. He walked that back somewhat to claim that he was being very particular in this case[2], which is perhaps true, but still harmful. The right thing to do when you make a mistake is apologize. Not double down and badly re-explain what other experts have been telling you back at them.

So then Yann notes that generic models don't have bias[3]. This is, probably, true. I'd be surprised if on the whole, "CNNs" encoded racial bias. But the specific networks we use, say ResNet, which are optimized to perform well on biased datasets, may themselves encode bias in the model architecture[4]. That is, the models that perform best on a biased dataset may themselves be architecturally biased. In fact, we'd sort of expect it.

And that all ignores one of the major issues which Yann entirely skips, but which Timnit covers in some of her work: training on data, even "representative data" encodes the biases that are present in the world today.

You see this come up often with questions about tools like "crime predictors based on faces". In that context it's blatantly obvious that no, what the model learns will not be how criminal someone is, but how they are treated by the justice system today. Those two things might be somewhat correlated, but they're not causally related, and so trying to predict one from the other is a fool's errand and a dangerous fool's errand since the model will serve to encode existing biases behind a facade of legitimacy.

Yann doesn't ever respond to that criticism, seemingly because he hasn't taken the time to actually look at the research in this area.

So insofar as data is the root of the problem, yes. Insofar as the solution is to just use more representative data in the same systems, no. That doesn't fix things. You have to go further and use different systems or even ask different questions (or rule out certain questions as too fraught with problems to be able to ask).

[0]: https://twitter.com/ylecun/status/1203211859366576128

[1]: https://twitter.com/ylecun/status/1274782757907030016

[2]: https://twitter.com/ylecun/status/1275162732166361088

[3]: https://twitter.com/ylecun/status/1275167319157870592

[4]: https://twitter.com/hardmaru/status/1275214381509300224. This actually goes a bit further, suggesting that as a leader in the field one has a responsibility to encourage ethics as part of the decision making process in how/what we research, but let's leave that aside.

zozbot234 · on June 23, 2020

> Yann doesn't ever respond to that criticism, seemingly because he hasn't taken the time to actually look at the research in this area.

No, that's still a problem with data in a broader sense. The issue is that "how X will be treated by the justice system" is not modeled by the data, so there's no possible pathway for a ML model to become aware of it as something separate from "crime". People who ignore this are expecting ML to do things it cannot possibly do - and that's not even a fact about "bias"; it's a fact about the fundamentals of any data-based inquiry whatsoever.

joshuamorton · on June 23, 2020

I hope you read to the end of my post where I address that:

> So insofar as data is the root of the problem, yes. Insofar as the solution is to just use more representative data in the same systems, no. That doesn't fix things.

Ultimately Yann's proposals are still to use "better data" whereas all the ethics people are (and have been) screaming no, you can't use better data because it doesn't exist. He doesn't acknowledge that.

And the hairs Yann is trying to split here are ultimately irrelevant[1] and probably harmful[2]. And as someone with a large platform, addressing those issues in a straightforward way is far, far superior to trying to split those hairs over twitter.

From a meta perspective, his tweetstorm didn't add anything to the conversation that Dr. Gebru and her collaborators aren't already aware of. Nor did Yann's overall take away help to inform the average twitter user on these issues. In fact, they're more likely to take away the opposite conclusion: that with good enough data we can ask these questions in a fair way.

But as you rightly conclude there are flaws in any data based inquiry. Yann doesn't concede that.

[1]: https://twitter.com/isbellHFh/status/1275184863159685121

[2]: https://twitter.com/hardmaru/status/1275088134238162944

zozbot234 · on June 23, 2020

I'm not sure that Yann was trying to split hairs there. He was reasoning about the issue from first principles (e.g. the problem-domain vs. architecture vs. data distinction) and then failing to carry his reasoning thru to the reasonable conclusion that you mention re: the inherent flaws of any data-based modeling. Criticizing his take wrt. these issues is constructive; being careless about what his actual views are is not.

smsm42 · on June 26, 2020

> Those two things might be somewhat correlated, but they're not causally related,

That's kinda bold claim. Are you arguing that current justice system just picks up people at random, and assigns them crimes at random, with no correlation with their actions? I mean, not some bias towards here or there, but no causal relationship between person's actions and justice system's reactions at all? That's... bold.

But if this is the case, then the whole discussion is pointless. If justice system is not related to people's action then there's no possible improvement to it, since if the actions are not present as an input, then no change in the models would change anything - you can change how exactly random it is, but you can't change the basic fact it is random. What's the point of discussing any change at all?

> Insofar as the solution is to just use more representative data in the same systems, no.

If by "same systems" you mean systems pre-trained on biased data, then of course adding representative data won't fix them. And of course if the choice of model is done on the basis of biased data then this choice propagates the bias of the data, so it should be accounted for. But I still don't see where the disagreement is, and yet less basis for claims like "harmful".

joshuamorton · on June 26, 2020

> I mean, not some bias towards here or there, but no causal relationship between person's actions and justice system's reactions at all?

It depends on what you mean by causal. Does criminal behavior cause interactions with the justice system? Yes. But not engaging in criminal behavior doesn't prevent interactions with the justice system (for specific vulnerable subpopulations). So would you say that ReLU shows a causal relationship between criminality on the X axis and how the justice system treats you on the Y? I don't think I would.

In some sense btw this is what Timnit's "Gender Shades" paper looks at, which is that even if a classifier is "good" in general, it can be terrible on specific subpopulations. Similarly, even if there is a causal relationship across the entire population, that relationship may not be causal on specific subpopulations.

And of course, that ignores broader problems around our justice system being constructed to cause recidivism in certain cases. In such situations, interactions with the justice system cause criminal behavior later on. So clearly, in general since Y is causal on X, X can't be causal on Y.

> But if this is the case, then the whole discussion is pointless.

No! Because people trust computers more than they trust people. Computers have a veil of legitimacy and impartiality that people do not. (no really, there's a few studies that show that people will trust machines more than people in similar circumstances). Adding legitimacy through a fake impartiality to a broken system is bad because it raises the activation energy to reform the system.

At it's core, that's probably the biggest issue that Yann is missing. Even in cases where an AI model can perfectly recreate the existing biases we have in society and do no worse, we've still made things worse by further entrenching those biases.

> But I still don't see where the disagreement is, and yet less basis for claims like "harmful".

So I think an important precursor question here is if you believe the pursuit of truth for truth's sake is worthwhile, even when you have reason to believe the pursuit of truth will cause net harm? Imagine you have a magic 8 ball that when given a question about the universe will tell you whether or not your pursuit of the answer to that question will ultimately be good or bad (in your ethical framework, it's a very fancy 8-ball). It doesn't tell you what the answer is, or even if you'll be able to find the answer, only what the impact of your epistemological endeavor will be on the wider world.

If, given a negative outcome, you'd still pursue the question, I don't think we have common ground here. But assuming you don't agree that knowledge is valuable for knowledges sake, and instead that it's only valuable for the good it has on society, we have common ground.

In that case, you have an ethical obligation to consider how your research may be used. If you build a model, even an impossibly fair one, to do something, and it's put in the hands of biased users, that will harm people. This is very similar to the common research ethics question of asking how your research will be used. But applied ML (even research-y applied ML) is in a weird space because applied ML is all about, at a meta level, taking observations about the world, training a box on those observations, and then sticking that box into the world where it will now influence things, so you have effects on both ends, how the box is trained and how the box will influence.

Like, in many contexts "representative" or "fair" is contextual. Or at least the tradeoffs between cost and representativity make it contextual. Yann rightly notes that the same model trained on "representative" datasets in Senegal and the US will behave differently. So how do you define "representative"? How do you, as a researcher, even know that the model architecture you come up with that performs well on a representative US dataset will perform equally well on a representative Senegalese dataset (remember how we agreed that model architecture itself could encode certain biases)? Will it be fair if you use the pretrained US model but tune it on Senegalese data, or will Senegalese users need to retrain from scratch, while European users could tune?

Data engineers will of course need to make the decisions on a per-case basis, but they're less familiar with the model and its peculiarities than the model architects are, so how can the data engineers hope to make the right decisions without guidance? This is where "Model Cards for Model Reporting" comes in. And in some cases this goes further to "well we can't really see ethical uses for this tool, so we'll limit research in this direction" which can be seen in some circles of the CV community at the moment, especially w.r.t. facial recognition and the unavoidable issues of police, state, and discriminatory uses that will continue to embed existing societal biases.

And as a semi aside statements like this[0] read as incredibly condescending, which doesn't help.

[0]: https://twitter.com/ylecun/status/1275162732166361088

smsm42 · on June 26, 2020

> It depends on what you mean by causal.

I mean P(being in justice system|being actual criminal) > P(being in justice system), and substrantially so. Moreover, P(being criminal|being in justice system) > P(being criminal). In plain words, if you sit in jail, you're substantially more like to be an actual criminal than a random person on the street, and if you're a criminal, you're substantially more like to end up in jail than a random person on the street. That's what I see as causal relationship. Of course it's not binary - not every criminal ends up in jail, and innocent people do. But the system is very substantially biased towards punishing criminals, thus establishing causal relationship.

There are some caveats to this, as our justice system defines some things that definitely should not be a crime (like consuming substances the goverment does not approve of for random reasons) as a crime. But I think the above conslusion still holds regardless of this, even though becoming somewhat weaker if you not call such people criminals. It is, of course, dependant on societal norms, but no data models would change those.

> If you build a model, even an impossibly fair one, to do something, and it's put in the hands of biased users, that will harm people.

That is certainly possible. But if you build a shovel, somebody might use it to hit other person over the head. You can't prevent misuse of any technology. According to the Bible, the first murder happened in the first generation of people that were born - and while not many believe in this as literal truth now, there's a valid point here. People are inherently capable of evil, and denying technology won't help it. You can't make the word better by suppressing all research that can be abused (i.e. all research at all). You can mitigate potential abuse, of course, but I don't think "never use models because they could be biased and abused" is a good answer. "Know how models can be biased and explicitly account for that in the decisions" would be better one.

> his[0] read as incredibly condescending, which doesn't help.

Didn't read condescending to me. Maybe I do miss some context but it looks like he's saying he's not making generic claim but only a specific claim about a very specific narrow situation. Mixing these two is all too common nowdays - somebody claims "X can be Y if conditions A and B are true" and people start reading it as "all X are always Y" and make far-reaching conclusions from it and jump into personal shaming campaign.