AI Pioneer Wants to Build the Renaissance Machine of the Future (bloomberg.com)
105 points by andreshb on Jan 16, 2017 | 28 comments



Honestly, I think the New York Times calling him "the father of AI" is a bit presumptuous. I'm not an expert by any means in machine learning or artificial intelligence, but I do know a fair amount about computer history.

Potential "father of modern AI" - even that is stretching it a bit!

The fact is, the history of machine learning and artificial intelligence is a story of fits and starts; of springs and winters; of successes and failures.

If anyone could claim that title, it would be Warren McCulloch and Walter Pitts, who came up with the model for an artificial neuron in 1943:

https://en.wikipedia.org/wiki/Artificial_neuron

From there, it was people standing on each other's shoulders, building ever upward and outward. These steps and systems have ultimately led to today's deep learning networks.

Other machine learning techniques trace back to methods in statistics and probability theory; then you have the whole arena of computer/machine vision research.

Right now, we're in the midst of yet another AI/ML "spring", after a fairly long "winter". Mr. Schmidhuber can certainly claim a bit of status as one of the many who helped institute the thaw leading to today - but he is by no means alone (I'd argue that one of the earliest contributors to today's thaw might be LeCun).


>> If anyone could claim that title, it would be Warren McCulloch and Walter Pitts, who came up with the model for an artificial neuron in 1943

The work of McCulloch and Pitts was seminal in the field of connectionism, itself a sub-field of AI (and the field that encompasses neural networks, though not machine learning in general). The McCulloch-Pitts neuron was only one of their many contributions. Pitts in particular later published a great deal on inductive inference, which is more straightforward Good Old-Fashioned AI.

AI goes way back, with the work of Turing and von Neumann, and even Russell and the Logicians - not a ska band, but a group of mathematicians who wrote about symbolic logic, Hilbert and Gödel being the best-known among them. We have digital computers today because these gentlemen (and some ladies among them) came up with the maths that enabled them. In their time, "artificial intelligence" pretty much meant a machine like the one we're using right now to communicate.

There was a small army of bright personalities that followed in their wake. Just off the top of my head (and completely arbitrarily; most of them are my personal heroes of AI):

  Marvin Minsky
  Roger Schank
  E. Mark Gold
  Dana Angluin
  J. Ross Quinlan
  Peter Norvig
  Eugene Charniak
  David H. D. Warren
  Daniel Jurafsky
  Robert A. Kowalski
  John McCarthy
  Steve Russell
  Christopher D. Manning
  Terry Winograd
  Edward Feigenbaum
  Geoff Hinton
  Carl Eddie Hewitt
  Richard O'Keefe
  Pat Langley
  George F. Luger
  Ryszard S. Michalski
  Seymour Aubrey Papert
  Judea Pearl
  Fernando Pereira
  Steven Pinker
  Frank Rosenblatt
  David Everett Rumelhart
  James Lloyd McClelland
  Stuart Russell
  John Rogers Searle
  Rodney Allen Brooks
  Claude bloody Shannon
  Paul Smolensky
  Gerald Jay Sussman
  Richard S. Sutton
  Katia Sycara
  Leslie Gabriel Valiant
  Vladimir Naumovich Vapnik
  Norbert Wiener
  Joseph Weizenbaum
  Paul J. Werbos
  James H. Martin
  Hinrich Schütze
  Ehud Shapiro
  Stephen Muggleton
  Leon Sterling
  Alain Colmerauer
  ... 
I've taken care to list all those people with as much detail in their names as I could find on Wikipedia. This should help you identify them and read about their contributions to the field, which I strongly suggest anyone do before attempting to discuss the history of AI, in any detail and at any depth.


Claude bloody Shannon ;)

I was going to say that I think Marvin Minsky deserves to be on that list, especially if we are talking about the rise and fall and rise again of neural networks. Especially the fall part.

Then I noticed he was first on the list and I'd just scrolled.


John McCarthy coined the term "artificial intelligence", so regardless of everything else, he at least fathered the term.

Some great names on that list, too!


Schmidhuber has a fair claim to being the father of successful connectionist AI; other contenders would be Hinton and LeCun. Schmidhuber is maybe a bit of a controversial figure: he's quite open about insisting on getting credit for his early research, and upset when the credit for the connectionist resurrection goes mainly to others such as Hinton instead of him. He is certainly the inventor of LSTMs, at least.


If we want to assign credit for the LSTM to one person, his student Hochreiter is perhaps the better pick.


"In 1997, he co-authored a seminal paper that laid the groundwork for modern AI systems."

Any idea what that paper was? I thought the groundwork for modern AI systems was laid already by 1997, but maybe I'm wrong?


Long Short-Term Memory
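
For the curious, the core of that 1997 cell comes down to a few gated updates. Here's a minimal NumPy sketch of one forward step (variable names and shapes are my own; this is the modern formulation, with the later-added forget gate included and peephole connections omitted):

  import numpy as np

  def sigmoid(x):
      return 1.0 / (1.0 + np.exp(-x))

  def lstm_step(x, h_prev, c_prev, W, b):
      # One forward step of an LSTM cell. W maps the concatenated
      # [input, previous hidden] vector to the four gate pre-activations.
      z = W @ np.concatenate([x, h_prev]) + b
      i, f, o, g = np.split(z, 4)
      i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)  # input, forget, output gates
      g = np.tanh(g)                                # candidate cell update
      c = f * c_prev + i * g  # gated cell state: the "constant error carousel"
                              # that lets gradients survive long time lags
      h = o * np.tanh(c)      # hidden state exposed to the next layer
      return h, c

  # Toy usage: 3-dim input, 5-dim state.
  rng = np.random.default_rng(0)
  x, h, c = rng.normal(size=3), np.zeros(5), np.zeros(5)
  W, b = 0.1 * rng.normal(size=(20, 8)), np.zeros(20)
  h, c = lstm_step(x, h, c, W, b)

The additively carried cell state c is the whole point: error signals flowing back through it are not repeatedly squashed, which is exactly what plain RNNs suffer from.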


> "Juergen Schmidhuber, often referred to as the father of AI"

What?

Reading further...

> The New York Times recently referred to him as a would-be father of AI.

Clicking the link to the NYTimes article:

> When A.I. Matures, It May Call Jürgen Schmidhuber ‘Dad’

English-as-a-second-language person here, but is it just me, or does the Bloomberg subtitle not reflect at all what the NYT title is saying?


The phenomenon of "citogenesis":

https://xkcd.com/978/


This is a sloppy piece making grand claims about a man who likes to make grand claims about himself. By most accounts, Schmidhuber is only marginally involved with Nnaisense, lending his name to the endeavor so that some of his graduate students can raise money. They have no use case, no product, and frankly no business sense, which is one reason why the only money backing them comes from an unknown investor based in Madrid.


Good to see. I had the opportunity to see Jürgen speak last year - he was excellent: entertaining, surprisingly deep, and able to field any question with humour and insight. He gets a lot of flak for being obsessed with being credited for his work, but I think he does it for historical accuracy rather than ego.

Research-oriented AI startups are experiencing serious brain drain to large companies because of the money those companies can afford to pay. I hope this secures Nnaisense for a couple of years, and that they make the most of that time.


I attended my first NIPS this year, and found Juergen to be a very engaging speaker, with the RNN symposium organized by him and his colleagues being my favorite part of the conference. A popular phrase being thrown around during the conference was "learning to learn" or "meta-learning", with one of the papers even being titled "Learning to learn by gradient descent by gradient descent" [1]. Juergen seemed very passionate about the subject; he gave a cool talk on his Gödel Machine [2] and sparked interesting conversation during the panel discussion. I wouldn't be surprised if "learning to learn" or "meta-learning" replaces "deep learning" as the AI word of 2017.

[1] https://papers.nips.cc/paper/6461-learning-to-learn-by-gradi...

[2] https://en.wikipedia.org/wiki/G%C3%B6del_machine
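
To make the "learning to learn" idea concrete, here is a deliberately tiny sketch (my own toy, not the paper's LSTM optimizer): the inner loop takes an ordinary gradient step on a 1-D quadratic, and the outer loop does gradient descent on the inner step size itself, using the analytic meta-gradient - i.e. gradient descent (on alpha) by gradient descent (on theta).

  import numpy as np

  # Inner problem: minimize L(theta) = 0.5 * (theta - target)**2.
  # Meta-parameter: the inner learning rate alpha, tuned by the outer loop.
  rng = np.random.default_rng(0)
  target, alpha, beta = 3.0, 0.1, 0.02   # beta is the meta learning rate

  for step in range(500):
      theta = rng.normal()              # a fresh inner task instance
      g = theta - target                # dL/dtheta
      theta_new = theta - alpha * g     # one inner gradient step
      # Post-update loss: L(theta_new) = 0.5 * g**2 * (1 - alpha)**2,
      # so dL(theta_new)/dalpha = -g**2 * (1 - alpha).
      meta_grad = -(g ** 2) * (1.0 - alpha)
      alpha -= beta * meta_grad         # one outer (meta) gradient step

  print(alpha)   # approaches 1.0, the optimal step size for this quadratic

The paper replaces the single scalar alpha with an LSTM that emits the whole parameter update, but the nesting of gradient descent inside gradient descent is the same.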


"Learning to learn by gradient descent by gradient descent" is a DeepMind paper, not Schmidhuber's.


Agreed, but the linked paper was one of the more talked-about ones at the conference, and it has a fairly accessible discussion of the topic, including a history of related work in section 1.2 with references to many of Jurgen's papers.


Interestingly enough, Googling "Father of AI" yields several results including John McCarthy, Marvin Minsky, and Alan Turing.

Google's top results point at McCarthy, while the first mention of Schmidhuber is the ninth result, from the New York Times.


I like his ideas, especially the one about consciousness being reinforcement learning, and the self being emergent from the process of data compression.


Here's some additional context from a Reddit discussion: https://www.reddit.com/r/MachineLearning/comments/5go4sa/n_w...


Not citing prior work is a grave issue in the academic world. Papers can get accepted or rejected based on their novelty. Missing citations to work central to a paper's topic can therefore lead to retractions. I am not an expert on this particular case, but if Schmidhuber is right then, from an academic perspective, he has every right to be pretty pissed off. I would certainly recommend rejecting a paper if the author refused to cite prior art.


This has been twisted by the NYTimes, Bloomberg, and everyone else. The majority of the ML community agrees that Schmidhuber is in the wrong here. And this isn't groupthink, either: he interrupted a tutorial presentation to argue about something that the presenter (Goodfellow) had already discussed with him.


In that case the technique was similar, but the purpose it was used for was different.

> The new approach seems similar in many ways. Both approaches use "adversarial" MLPs to estimate certain probabilities and to learn to encode distributions. A difference is that the new system learns to generate a non-trivial distribution in response to statistically independent, random inputs, while good old PM learns to generate statistically independent, random outputs in response to a non-trivial distribution (by extracting mutually independent, factorial features encoding the distribution). Hence the new system essentially inverts the direction of PM - is this the main difference? Should it perhaps be called "inverse PM"?

http://media.nips.cc/nipsbooks/nipspapers/paper_files/nips27...

And this is from Ian Goodfellow himself, giving three counter-arguments:

> Some previous work has used the general concept of having two neural networks compete. The most relevant work is predictability minimization [Schmidhuber, J. 1992]. In predictability minimization, each hidden unit in a neural network is trained to be different from the output of a second network, which predicts the value of that hidden unit given the value of all of the other hidden units. This work differs from predictability minimization in three important ways: 1) in this work, the competition between the networks is the sole training criterion, and is sufficient on its own to train the network. Predictability minimization is only a regularizer that encourages the hidden units of a neural network to be statistically independent while they accomplish some other task; it is not a primary training criterion. 2) The nature of the competition is different. In predictability minimization, two networks' outputs are compared, with one network trying to make the outputs similar and the other trying to make the outputs different. The output in question is a single scalar. In GANs, one network produces a rich, high dimensional vector that is used as the input to another network, and attempts to choose an input that the other network does not know how to process. 3) The specification of the learning process is different. Predictability minimization is described as an optimization problem with an objective function to be minimized, and learning approaches the minimum of the objective function. GANs are based on a minimax game rather than an optimization problem, and have a value function that one agent seeks to maximize and the other seeks to minimize. The game terminates at a saddle point that is a minimum with respect to one player's strategy and a maximum with respect to the other player's strategy.

http://papers.nips.cc/paper/5423-generative-adversarial-nets...
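
Goodfellow's third point is easy to see on a toy value function. Here's a small sketch (mine, not from either paper) of simultaneous gradient descent/ascent on V(x, y) = x * y, whose saddle point at the origin stands in for the GAN equilibrium:

  # Toy minimax game: min over x, max over y of V(x, y) = x * y.
  # x plays the generator's role (minimizing), y the discriminator's
  # (maximizing); the solution at (0, 0) is a saddle point of V,
  # not the minimum of any single objective.
  x, y, lr = 1.0, 1.0, 0.1

  for step in range(500):
      dx, dy = y, x                    # dV/dx and dV/dy
      x, y = x - lr * dx, y + lr * dy  # simultaneous descent/ascent steps

  print(x, y)  # the iterates slowly spiral outward around (0, 0)

Note that naive simultaneous updates spiral away from the saddle rather than converging to it - a toy version of why optimizing a minimax value function is trickier than descending a single objective. Alternating the two updates keeps this particular game on a bounded orbit instead.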


Thanks for the details, very enlightening! But it's a bit one-sided, still: what are Schmidhuber's arguments on this issue?


For more details, refer to Schmidhuber's own website: http://people.idsia.ch/~juergen/ and his excellent review paper, "Deep Learning in Neural Networks: An Overview": https://arxiv.org/pdf/1404.7828.pdf


From the linked website:

"""Since age 15 or so, the main goal of professor Jürgen Schmidhuber has been to build a self-improving Artificial Intelligence (AI) smarter than himself, then retire."""

I like this approach.


Schmidhuber cracks some hilarious jokes when he is on stage.

"The other day I gave a talk and there was just a single person in the audience, a young lady. I said young lady it's very embarrassing but apparently today I'm going to give this talk just to you. She said okay but please hurry I gotta clean up here."


Hinton. Hands down. Just read his early work on deep nets. Unsupervised, no ensemble, nailed MNIST and self-awareness of features. And he has a great sense of humor.


They sound like a startup version of Watson. It's nice to see an AI startup having some success in the industry.


There's a long line of succession... take your number, Jurgen: McCulloch, Pitts, Fukushima, Kohonen, Hopfield, Jordan, Elman, Werbos, LeCun, Hinton, Bengio... to name only a very few.

You can also go even further... https://www.youtube.com/watch?v=laJX0txJc6M



