I have a dumb question: isn't the quality of machine learning dependent on the quantity and quality of the training data sets? And if we are to use Google or Microsoft for ML, then the whole world ends up uploading its datasets to these giants, which helps these companies develop even better ML systems, like a network effect but for data. Would these companies not have a tremendous advantage over any challengers or competition? Would this system not lead to some sort of oligopoly?
Some people ask ML to do patently implausible things. Can you determine whether someone is a criminal from a photo of their face? It should be INCREDIBLY obvious that the answer to that question must be "no." Even if you do manage to guess correctly, there has to be some confound, either a technical one (e.g., the criminal training photos are lit differently from the non-criminal ones) or a statistical one (e.g., correlation with socioeconomic status).
Best spoken in Clint Eastwood voice: "Just look them in the eye." :-)
This is certainly a problem, made worse by AI hype in the press. People expect superhuman performance where it doesn't exist. A substantial part of the job when consulting in this field is bringing expectations down to something achievable, preferably without ruining the client's willingness to pay.
There is /no/ evidence that physiognomy or its cousin phrenology (the idea that skull shape carries information) "work." I normally appreciate Wikipedia's NPOV stance, but it's absurd that it takes two paragraphs to mention that it is universally (or almost universally, apparently) regarded as pseudoscience. I'm a neuroscience researcher, and I can't think of a single colleague who puts any credence in these ideas; in fact, I know several who use them as insults. As for the data, the hair-whorl things have been pretty aggressively debunked. The "gaydar" results were driven by individual choices in fashion, grooming, etc. I don't know if anyone has followed up on the hockey data, but...it doesn't matter, because of the second problem.
There's a giant leap between detecting actual criminals and detecting people who look like members of groups that are statistically more or less likely to be involved in crime. You just can't jump from group-level priors to individual predictions. This is especially true when some of the factors shouldn't legally or morally be used to make predictions.
Finally, think about how weird the biology would need to be for this to work. You'd need to have an underlying factor (genetic, presumably) that affects both facial structure and behavior. It would need to have a strong enough effect to reliably overcome all of the other factors that also determine someone's appearance and behavior. It's not totally impossible, but it's an extraordinary claim that would require extraordinary evidence, and to date no one has found much of anything.
Okay, "controversial" was much too generous and junk is more accurate. I just wanted to point out that it is not something some ML research came up with on the spot.
'Can you determine whether someone is a criminal from a photo of their face? It should be INCREDIBLY obvious that the answer to that question must be "no."'
Why are you so sure? For one thing, I bet criminals are more likely than the law-abiding to have facial tattoos. Of course, ML analysis of faces will not have 100% accuracy, but that is usually the case in ML and statistics.
Explain the mechanism, please. It seems wildly implausible to me that someone's facial features change (let alone in a predictable way) once they've committed a crime.
However, I can easily imagine this "working" by keying off things like age/race/gender, which will get you a value that's "better than chance" but isn't, really. Ditto for differences in the photos; nobody smiles in a mugshot, after all.
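To make the "better than chance, but not really" point concrete, here is a minimal sketch in Python. The counts are invented purely for illustration (they are not from any real study), and the "classifier" is a hypothetical one that looks at nothing but a demographic group label of the kind that could be read off a face.

```python
# Hypothetical balanced dataset: (group, is_offender) -> count.
# The counts are invented for illustration only.
counts = {
    ("A", True): 80,
    ("B", True): 20,
    ("A", False): 50,
    ("B", False): 50,
}


def predict(group: str) -> bool:
    """A 'classifier' that looks only at the demographic group."""
    return group == "A"


total = sum(counts.values())
correct = sum(n for (group, label), n in counts.items() if predict(group) == label)
accuracy = correct / total

# Within a single group every person gets the same prediction, so the
# model cannot tell any two members of that group apart.
flagged_innocent = counts[("A", False)]
flagged_total = counts[("A", True)] + counts[("A", False)]

print(f"overall accuracy: {accuracy:.2f}")
print(f"flagged people who are not offenders: {flagged_innocent}/{flagged_total}")
```

On these made-up counts the group-only rule scores 65% on a 50/50 dataset, which looks better than chance, yet it says the same thing about every member of group A and wrongly flags 50 of the 130 people in it.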
Your facial features don't change after you commit a crime (except maybe for some new prison tats). They don't need to change for criminal classification from head shots to work.
Age, race, gender, income, attractiveness, testosterone levels, gang membership, attention to grooming, addiction, etc. are all predictive of crime, and all show up in the face picture you use.
Sure, there is some bias in how you label a criminal (convicted of a crime, self-report, etc.), but prediction is possible (without exploiting leakage like smiles or lighting).
Just from a hot-or-not rating I can make an educated guess of your conviction rate and sentencing length.
You made my point for me as you slipped from "criminal classification" to "predictive of crime". They are /NOT/ equivalent.
I have absolutely no doubt that features like age, race, and gender can be associated, at a group level, with crime. I'm also sure all of these can be extracted from face images. At the same time, these data are obviously not enough to make subject-level predictions. At best, this is a prior and even that is contaminated with all kinds of systematic biases.
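A rough way to see why a group-level association is only a prior is to run the numbers through Bayes' rule. The sketch below is Python, and every number in it (the base rate and the classifier's hit rates) is a made-up assumption for illustration, not an estimate of anything real.

```python
# All numbers below are made up for illustration; they are not estimates
# of any real population or classifier.


def posterior(base_rate: float, sensitivity: float, specificity: float) -> float:
    """P(criminal | flagged by the classifier), via Bayes' rule."""
    true_pos = base_rate * sensitivity
    false_pos = (1.0 - base_rate) * (1.0 - specificity)
    return true_pos / (true_pos + false_pos)


BASE_RATE = 0.02    # assumed share of criminals in the population
SENSITIVITY = 0.70  # assumed P(flagged | criminal)
SPECIFICITY = 0.70  # assumed P(not flagged | not criminal)

p = posterior(BASE_RATE, SENSITIVITY, SPECIFICITY)
print(f"P(criminal | flagged) = {p:.3f}")
```

With these assumed inputs the result is about 0.045, so more than 95% of the people flagged would not be criminals, even though the classifier is "70% accurate" on each class.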
Suppose you're hiring. You are certainly allowed (and sometimes required) to not hire criminals. If you systematically avoid hiring people with features "predictive of crime", an employment lawyer is going to slap you into tomorrow with a totally justified slam dunk of an employment discrimination case.
A very smart question, and a legitimate concern as well.
This is precisely why I am generally against ML / AI being used to make important decisions, from what your insurance rate "should" be to how much tax "needs" to be collected.
For what it's worth, I think ML / AI will fall by the wayside when, and only when, companies realize that human learning and natural intelligence, grouped together and run in parallel, is ultimately a more effective pattern-finder that will produce more meaningful data.
Ever notice how fast a group of internet vigilantes can 'doxx' someone? What if that were turned on other problems? That's real human intelligence in action.
Absolutely. This is why the "break up big companies" strategy doesn’t work here at all, and neither does free-market competition as it’s nominally supposed to.
Are these machine learning courses the only courses like this that Google provides? Is it a new thing for Google to offer courses like this?
I admire their attempt to provide translations, but not all of the listed languages seem to be available. For example, it seems to work much better for Spanish than for German.
Google has similar courses for web development and Android development among others (https://developers.google.com/training/). Many of these have made the front page of HN at some point, but ML courses tend to get upvoted more often than web development or general programming.