I just showed this link to three classmates who are currently taking the course, and the common reaction was "It's a trap!"
They haven't been very satisfied with it. It's co-taught by two professors, one who teaches like it's an introduction for people who have never heard of Bayes' theorem and one who teaches like it's a graduate seminar for people who've seen it all before.
Prof Yaser S. Abu-Mostafa's Caltech course "Learning from Data" (http://work.caltech.edu/telecourse) is probably the best introductory course for really understanding the physics of how machine learning works.
I've got the book. It's a great book, even though the Machine Learning course here at the Technion is more Bayesian, where AML's book seems focused on PAC and VC theory.
Honestly, this is not a course that I would recommend. Its most problematic aspect is the lack of a clear outline. It jumps between different fields of machine learning, which can have fundamentally different focuses and motivations, without illustrating the connections to the students. It covers the Nadaraya-Watson classifier in the second class, then spends two lectures explaining the most basic naive Bayes algorithm. I just don't get it.
Though it confuses me a lot of the time, the course is useful in that it gives me a lot of keywords to search for and read articles about. Also, the homework can be challenging at times; working through it did improve my understanding of things I might have considered trivial before, like the linear regression stuff.
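(Even that "trivial" linear regression is worth implementing from scratch at least once. A minimal sketch, assuming numpy and made-up data, of the closed-form least-squares fit:)

    import numpy as np

    # Toy data: y = 3x + 2 plus noise (made-up numbers, just for illustration).
    rng = np.random.default_rng(0)
    x = rng.uniform(-1, 1, 100)
    y = 3 * x + 2 + 0.1 * rng.standard_normal(100)

    Xb = np.column_stack([np.ones(100), x])   # bias column + feature
    w = np.linalg.solve(Xb.T @ Xb, Xb.T @ y)  # normal equations: w = (X^T X)^-1 X^T y
    print(w)                                  # roughly [2, 3]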
And if you are really interested, I would recommend a book that covers most of the course material while being much better organized:
Right. Some things would be explained from the basics, and some topics would be covered by referring you to obscure papers on advanced techniques in machine learning published by the professors.
I also recommend the videos and lecture notes by Tom Mitchell for the same course taught in 2011 [1]. His explanations of some non-trivial theoretical concepts in ML are very coherent.
I did the Stanford free online one the first time it was offered a year or so back. Was perfect -- didn't move at a blazing pace and was very lean. Great instructor, highly recommended (though I think it may have been absorbed into Coursera?).
Nope, it's still around on YouTube and on the Stanford Engineering Everywhere site. The Coursera version of the class is much more introductory and skips significant parts of the full Stanford version.
I'm taking this class right now and it is certainly an interesting twist on the topic: it's fun to see how many different ML techniques solve variants of a single base problem that you can analyze with statistical learning theory. Also: how many different regularizations are equivalent, and how some "intuitive", ad-hoc-but-seem-to-work regularizations you might come up with in isolation can actually be theoretically justified. It contrasts with the more traditional, also grad-level 6.867 ML class.
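One concrete example of the kind of equivalence I mean (my own illustration, not from the course): L2-regularized least squares ("ridge") gives the same weights as MAP estimation under a Gaussian prior. A small numpy/scipy sketch with made-up data:

    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(1)
    X = rng.standard_normal((50, 3))
    y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.standard_normal(50)

    lam = 0.7  # ridge penalty strength
    # Ridge: argmin ||Xw - y||^2 + lam * ||w||^2, solved in closed form.
    w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T @ y)

    # Negative log posterior with noise variance s2 and prior variance t2;
    # its minimizer matches ridge exactly when lam = s2 / t2.
    s2 = 0.01
    t2 = s2 / lam
    neg_log_post = lambda w: ((X @ w - y) ** 2).sum() / (2 * s2) \
                             + (w ** 2).sum() / (2 * t2)
    w_map = minimize(neg_log_post, np.zeros(3)).x

    print(np.allclose(w_ridge, w_map, atol=1e-4))  # True: same estimator, two stories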
Gee, guys, looking at the list of topics, a huge fraction, likely over 50%, of the material goes back to programs in operations research, statistics, and the mathematical sciences from about 1970 on. Nearly the only thing new is the collection of sample applications. From what I've seen, the quality of the content of the current ML courses is way below that going back to 1970.
Warning: History shows that the US economy looked at the material in operations research, statistics, and the mathematical sciences and rolled their eyes, did a big upchuck, laughed, turned, and walked away. One might look for alarms from their hype and fad detectors.
A good machine learning course might indeed cover 1940s-1980s operations research (nonlinear optimization, linear/quadratic programming, dynamic programming) and 1970s-1990s statistics (graphical models, Markov chain Monte Carlo methods, measures of model capacity). I'd say the field borrows the most useful bits from those fields and puts them to good, honest use in many real-life problems today. And I agree that there's a lot of unwarranted hype, which leads to a lot of well-deserved skepticism.
How does this course compare to Andrew Ng's Coursera class, his regular Stanford class, and Caltech's Learning from Data course? (And to other ML courses available on the web, in terms of depth?)
I will say that I'm a huge fan of the Caltech Learning from Data course (currently also offered on EdX). I took Andrew Ng's Coursera course 2 years ago, finished it successfully, and liked it. But I feel that the Caltech course gave me a much deeper foundational understanding of the basic issues and tradeoffs, and much deeper insight into what's going on.
Homework is much better in the Caltech course, too. In the Coursera course, they give you programs and environments in Octave that are all prewritten for you, and you just need to plug in a few key lines (often there's essentially one way to do it due to dimensionality). You feel like you understand what's going on, but the understanding is not really grounded. The Caltech course has multiple choice questions, but they look like this: "implement this algorithm, run it through a data set chosen randomly with such and such parameters, calculate learning error, do all this 1000 times and average. What value out of these 5 is your learning error closest to?". You choose the language, you implement the algorithm from scratch, you debug the hell out of it, you visualize your data to understand what's wrong... then the knowledge and the understanding stay with you.
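For flavor, here's the kind of experiment those questions ask for, paraphrased in Python (my own sketch, not an actual homework problem): learn a random linear target with the perceptron, estimate the error on fresh data, and average over 1000 runs.

    import numpy as np

    rng = np.random.default_rng(42)

    def run_once(n=100):
        # Random target: a line through two random points in [-1,1]^2.
        p, q = rng.uniform(-1, 1, (2, 2))
        normal = np.array([q[1] - p[1], p[0] - q[0]])       # perpendicular to q - p
        f = lambda pts: np.sign(pts @ normal - p @ normal)  # +/-1 labels

        X = rng.uniform(-1, 1, (n, 2))
        y = f(X)
        Xb = np.hstack([np.ones((n, 1)), X])                # prepend bias

        # Perceptron learning algorithm: fix one misclassified point at a time.
        w = np.zeros(3)
        while True:
            wrong = np.flatnonzero(np.sign(Xb @ w) != y)
            if wrong.size == 0:
                break
            i = rng.choice(wrong)
            w += y[i] * Xb[i]

        # Estimate out-of-sample error on fresh points.
        T = rng.uniform(-1, 1, (10000, 2))
        Tb = np.hstack([np.ones((10000, 1)), T])
        return np.mean(np.sign(Tb @ w) != f(T))

    print(np.mean([run_once() for _ in range(1000)]))  # averaged error estimate

You pick your own seed, sample sizes, and language; the point is that nothing is prewritten for you.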
I'm currently doing both the Coursera course and the Caltech course concurrently. I really like the level and delivery style of the Caltech course. It covers a lot of material, with good depth and rigour where needed and with a lot of colour. Makes you want to jump and try the techniques out.
In contrast the Coursera course seems a bit easy and dry. I also dislike the dependency on Octave.
I can't say how this specific CMU course compares with Ng's online course, but I can comment on Ng's course as compared to several other similar courses I took in Nanyang Tech. U. and Paul Sabatier (Toulouse 3), and my overall remark is that Ng's course is quite short on the maths, which makes it not sufficiently formal to deeply understand what goes on. However, it gives enough material and code samples to play with data. It can be nicely complemented with some self-study.
It's worth clarifying that Ng's online course is essentially his Stanford course minus most of the math. The online version doesn't have the proofs or theory problem sets, while the Stanford version does. The problem sets, etc... are available at cs229.stanford.edu, if anyone is interested.
I think it's time we got some tutorials/resources/classes on the practical implementation of these ML techniques. Enough of Introduction to ML. How to handle large data (say 6,000,000 rows), how to convert CSV/TSV data into the different formats needed by different machine learning libraries, e.g. Weka, LibSVM, etc.
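To make the second point concrete, a minimal sketch (Python, file names made up) of converting a dense CSV with the label in the first column into LibSVM's sparse text format. Since it streams line by line, row count (6,000,000 or otherwise) doesn't matter for memory:

    import csv

    def csv_to_libsvm(csv_path, out_path):
        # LibSVM format: "<label> <index>:<value> ...", with 1-based
        # indices and zero-valued features omitted.
        with open(csv_path, newline="") as src, open(out_path, "w") as dst:
            for row in csv.reader(src):
                label, feats = row[0], row[1:]
                pairs = [f"{i}:{v}" for i, v in enumerate(feats, start=1)
                         if float(v) != 0.0]
                dst.write(label + " " + " ".join(pairs) + "\n")

    csv_to_libsvm("data.csv", "data.libsvm")  # hypothetical file names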
For an introduction to the broader realm of data input, normalization, modeling, and visualization -- in which ML plays but a part -- you can "preview" Bill Howe's "Introduction to Data Science" class on Coursera[0]; I'm working through the lectures, and I find he gives compelling explanations of what all these parts are, why they're important, and how it all fits together in a larger context.
I took Prof. Howe's course on Coursera and it's a bit of a mixed bag. I can actually see it being better in some respects to just go through the content after the fact than to take the course as it was run, as there were a number of issues with the auto-grading of assignments and with some of the specific tool choices (like Tableau, which only runs on Windows).
That said, the course covered a lot of ground and touched on a number of different interesting/important topics. Some of the lecture material was a bit disorganized or had errors and didn't flow all that well from one topic to another, but there was a lot of good material there, especially if you had enough background to appreciate it. I was comfortable enough, but it was obvious that the expectations set by the prereqs were off.
Hopefully the course will run again with most of the kinks worked out and, perhaps, a better level-setting of what's needed to get the most out of the course.
Yes! I've done quite a few online courses now and the ones I've enjoyed the most have all been ones that were essentially recordings of the normal classes students take on campus.
In the case of Harvard's CS50x, it was essentially the exact same course. (Plus it helped that David Malan is an outstanding teacher.)