Just watched the first video, and it's a very nice intro to the mathematics behind neural networks. Personally I like Jeremy Howard and Rachel Thomas's video course [1] a lot better because it is much more geared towards practical applications, but if you're into the theory end of things, or if you simply want to understand better what is going on behind the scenes, this video is really quite good and well worth your time.
Patrick Winston is a great lecturer -- and it's worth it to do the whole 6.034 course if you have the time. A lot of the topics he goes over are considered "out of style" now, but sometimes old-school AI algorithms find their use in new places [0]…
It's funny because in the lecture itself he discussed how neural networks were once on the chopping block for the course, but they decided to keep the topic in just to avoid people reinventing the wheel.
It's amazing how a lot of the basics were written about in textbooks from the '80s and even earlier, yet recent work by people such as Geoffrey Hinton has reinvigorated the field and actually enabled practical applications.
I think Hinton himself was there in the 80s pioneering neural networks, and he has stuck with them this whole time. He was one of the first researchers to demonstrate the use of backpropagation to train neural networks.
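To make concrete what "training a neural network with backpropagation" means, here's a minimal sketch (my own illustration, not from the lecture or Hinton's papers): a tiny sigmoid network learning XOR by repeatedly running a forward pass and then pushing the error gradient back through the weights. The layer sizes, learning rate, and iteration count are arbitrary choices made just for this example.

    # Toy backpropagation sketch: a 2-3-1 sigmoid network learning XOR.
    # All hyperparameters here are arbitrary illustrative choices.
    import numpy as np

    rng = np.random.default_rng(0)
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([[0], [1], [1], [0]], dtype=float)

    W1, b1 = rng.normal(size=(2, 3)), np.zeros((1, 3))
    W2, b2 = rng.normal(size=(3, 1)), np.zeros((1, 1))
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    lr = 1.0

    for _ in range(5000):
        # Forward pass: hidden activations, then the output.
        h = sigmoid(X @ W1 + b1)
        out = sigmoid(h @ W2 + b2)
        # Backward pass: chain rule pushes the squared-error gradient
        # from the output back through each layer's weights.
        d_out = (out - y) * out * (1 - out)
        d_h = (d_out @ W2.T) * h * (1 - h)
        W2 -= lr * (h.T @ d_out)
        b2 -= lr * d_out.sum(axis=0, keepdims=True)
        W1 -= lr * (X.T @ d_h)
        b1 -= lr * d_h.sum(axis=0, keepdims=True)

    print(out.round(2))  # should head towards [[0], [1], [1], [0]]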
It's amazing how far we have come in the past decade. I took 6.034 with Professor Winston in Fall 2006 (and agree completely that he is an amazing teacher). I remember there being only one lecture covering neural networks, in which it was remarked that they were interesting in theory but disappointing in practice.
Agreed, though I draw the parallel elsewhere. The industrial revolution was about automating labor, though really it just scaled up repeatable processes. Likewise, computer programs don't really automate mental labor; they just scale up repeatable processes once the mental labor of figuring out the specification is done. Deep learning, on the other hand, promises the automation of actual mental labor--creating new information from other, unrelated information. If you look at it that way, its role and future seem pretty obvious.
Why the device-centric view? (Not that there is anything wrong with that, but, you know, it's just an incremental innovation curve). Why not:
- Cybernetics (incl. Szilard, Von Neumann)
- Information theory (incl. Shannon, Turing)
- Whatever you might call LISP
- Overlapping-window-based BLT GUIs
Also, if you can find an ancient copy of Thomas' Calculus from when it was two volumes, buy the first volume. Solve all the odd problems; the solutions are in the back. That's how I learned!
A later edition of Thomas was a single-volume, big, thick, yellowish book. It was very good. We used it as freshmen at MIT. You could learn from it. I don't remember if that version had answers in it, but solving problems and having answers to check against is key.
[1] http://course.fast.ai/