The author uses "Model-Based" in place of "Bayesian."
I transitioned from using Bayesian models in academia to using machine learning models in industry. One of the core differences in the two paradigms is the "feel" when constructing models. For a Bayesian model, you feel like you're constructing the model from first principles. You set your conditional probabilities and priors and see if it fits the data. I'm sure probabilistic programming languages facilitated that feeling. For machine learning models, it feels like you're starting from the loss function and working back to get the best configuration.
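A toy sketch of that contrast (my own made-up coin-flip example, not from the article): the Bayesian route states a prior and likelihood and updates them, while the loss-first route picks a loss and minimizes it.

    import numpy as np

    # Made-up data: estimate a coin's bias theta from flips (1 = heads).
    flips = np.array([1, 1, 0, 1, 0, 1, 1, 1])
    heads, tails = flips.sum(), len(flips) - flips.sum()

    # Model-first (Bayesian): Bernoulli likelihood with a Beta(a, b) prior;
    # the prior pseudo-counts below are an arbitrary choice for illustration.
    a, b = 2.0, 2.0
    posterior_mean = (a + heads) / (a + b + len(flips))

    # Loss-first: choose a loss (negative log-likelihood) and minimize it
    # over a grid of candidate thetas; this recovers the empirical frequency.
    thetas = np.linspace(1e-3, 1 - 1e-3, 999)
    nll = -(heads * np.log(thetas) + tails * np.log(1 - thetas))
    theta_hat = thetas[np.argmin(nll)]

    print(posterior_mean, theta_hat)  # ~0.67 vs. ~0.75

Same likelihood underneath, but the two workflows start from opposite ends.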
Much of the underlying machinery behind Bayesian vs. machine learning models is the same. Hidden Markov Models are Hidden Markov Models whether they have a prior or not. But this difference in feel influences how you build models and hence, the results.
Now that optimization algos for Bayesian models are catching up, Bayesian ML might become a thing.
The blog post author (Daniel Emaasit) wasn't the first to use the "model-based machine learning" phrase. He cites "Model-based machine learning" by Christopher M. Bishop.
Think Bayesian vs. frequentist. Obviously, from the perspective of "optimize against a loss function" they're both special cases, but one is mathematically and philosophically parsimonious; the other... less so.
The author's initial feelings about ML are similar to mine. It's such a broad subject that it feels like you could read and study for years and not cover everything. Worse still, it's ever changing.
I don't know about that. It might appear that way if you only read news articles on ML, since these are usually announcements of the state of the art.
However, compare the topics covered in Duda and Hart's text (1st edition, 1973; 2nd edition, 2000) to the more recent text by Hastie et al. (current edition ~2013). There isn't a huge difference in subject matter. The latter is slightly more advanced and has more of a statistics perspective, but the foundations are there: Bayes decision theory, linear methods for classification/regression, naive Bayes, neural networks, decision trees, ensembles, clustering, etc.
There is a range of topics that are foundational to ML, and thus, relatively stable. These topics are built upon the even more solid foundations of probability theory and statistics. The biggest advances in ML in the last decade (I would argue) were not due to advances in theory.
You just described all fields worth doing research in.
If we knew the right answers we wouldn't waste our time implementing the wrong ones. It's not a bad thing: it means there is room for you to discover something genuinely new and understand something, however small, that nobody else ever completely understood.
Try Pedro Domingos' "The Master Algorithm". It's a good high-level overview of the various "schools" of machine learning. I'm not sure it identifies which problems are best solved by which approach, though; it's more the history of how they have taken turns as the most successful paradigm.
What is a good textbook that will take me from zero to practical proficiency with Bayesian Nets, if I have experience with regression models, both hand-built and machine-learned?
Do what the author did: take Koller's course and follow the research that interests you. In this case, propagation through factor graphs is also a good analogy for when you start looking at backpropagation in NNs and the autoencoder.
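To make the factor-graph side of that analogy concrete, here is a tiny sum-product example (my own toy numbers, not from the course) on a two-variable chain:

    import numpy as np

    # Chain factor graph:  f1(x1) -- x1 -- f12(x1, x2) -- x2 -- f2(x2)
    # All factor tables below are made up for illustration.
    f1 = np.array([0.6, 0.4])              # unary factor over x1 (2 states)
    f12 = np.array([[0.9, 0.1],
                    [0.2, 0.8]])           # pairwise factor f12(x1, x2)
    f2 = np.array([0.5, 0.5])              # unary factor over x2

    # Sum-product message from x1 toward x2: sum out x1 from f1(x1) * f12(x1, x2).
    msg_to_x2 = f1 @ f12

    # The marginal of x2 combines the incoming message with its local factor.
    p_x2 = msg_to_x2 * f2
    p_x2 /= p_x2.sum()
    print(p_x2)  # normalized marginal over x2's two states

The local, table-by-table flow of messages here is what gets loosely likened to the layer-by-layer flow of gradients in backprop.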
That is a very fragmented approach, and it's very frustrating to go down that road. Every field is filled with hype and people pushing their own agenda and publicity to further their careers. Speaking from experience with self-learning discriminative modeling, the overwhelming majority of books are either superficial or needlessly complicated. In both cases they are badly written as well.
Finding a book that hit the sweet spot for regressions wasn't easy but was doable. I was hoping there would be something similar with Bayesian Nets/Generative Models.
Regression Modeling Strategies. You need at least some notion of what regressions and probability are all about, but if you have the basics covered, this book will take you through 80% of the journey and the rest is some googling to figure out some concepts that might be murky.
This is a book that emphasizes practical applications without getting too hung up on the math details. If, on the other hand, you are a math whiz, The Elements of Statistical Learning is THE book, but it expects you to be very proficient in math.
Both books are seriously underrated, which is kind of funny to say because you will find only praise for them, but they deserve even more.
The math in The Elements of Statistical Learning is quite basic, if occasionally tedious. Any college junior in STEM should have taken enough calculus, linear algebra, and probability to work through it.
I disagree re: ESL and RMS, but Harrell's book is superb. It's really more aimed at biostatisticians and clinicians, though.
If the math in ESL gives you trouble, you might prefer http://www-bcf.usc.edu/~gareth/ISL/ISLR%20First%20Printing.p... (and I'm not just saying this because the first author was one of my advisers, although I do think that he and Daniela are particularly gifted teachers).
If the math in ESL is too trivial for you, there's always https://web.stanford.edu/~hastie/StatLearnSparsity_files/SLS... , which covers some graphical modeling strategies in later chapters and even kicks the tires of the autoencoder (imho perhaps the greatest recent advance in neural networks for practitioners) along the way.
Koller's course and Ng's course are also good.
Ultimately I feel like you have to get the math right or you'll never acquire the intuition that helps you design your own approaches. But you also have to put in the work.
That reminds me, Tibshirani's Stanford course (it accompanies ISL and ESL) is terrific. Better than those other two, actually. I wish Harrell would offer one.
>Ultimately I feel like you have to get the math right or you'll never acquire the intuition that helps you design your own approaches.
But what do you mean by that? Do I really need it if my applications are not as demanding as Netflix's? I feel like many people consider anything less than PhD-level understanding lol-worthy, which is simply not true. The majority of analysts out there are doing just fine with canned procedures. Is there something like canned procedures for Bayesian Nets?
I disagree that RMS is necessarily better than ESL in some meaningful way. The two are complementary. It's like saying a gravel truck is better than a motorcycle: it all depends on what you want to do with it.
Re: "do I really need it?": hell if I know, I'm not you. But my assertion was specifically that if you want to design your own methods (i.e. do research) you need to understand what they are doing. This doesn't seem like a controversial position; an expert is simply a master of the fundamentals.
But I thought I made it clear that I am not looking to do research. I'm not even looking for state of the art performance. I am looking for that 30% of the skills which allow me to do 70% of the tasks. Like RMS. Because outside of Google/Microsoft/Amazon et al, domain knowledge beats superior math skills 10 out of 10 times.
You should do whatever you like, but remember that domain knowledge and math skills are not mutually exclusive. If the problem you need to solve for a major customer or project happens to be in that 30%, it may come in handy.
Linear algebra and calculus (to a lesser degree) are foundational for a great many things. Got missing data? K-NN or nuclear norm matrix completion (or marginalizing over the rest) can help. Systems of differential equations? Use a matrix exponential.
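For concreteness, a small sketch of both (the library choices are mine, assuming scikit-learn and SciPy are acceptable here):

    import numpy as np
    from scipy.linalg import expm
    from sklearn.impute import KNNImputer

    # Missing data: k-NN imputation fills the NaN using the nearest complete rows.
    X = np.array([[1.0, 2.0],
                  [2.0, np.nan],
                  [3.0, 6.0],
                  [4.0, 8.0]])
    X_filled = KNNImputer(n_neighbors=2).fit_transform(X)

    # Linear ODE system dx/dt = A x has the closed-form solution x(t) = expm(A t) x(0).
    A = np.array([[0.0, 1.0],
                  [-1.0, 0.0]])
    x0 = np.array([1.0, 0.0])
    x_half = expm(A * 0.5) @ x0  # state at t = 0.5

    print(X_filled)
    print(x_half)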
You are free to do whatever you like. A bus driver doesn't need to know how to rebuild an engine. But if you want to race cars you'll get a lot further if you do know how.
If I have missing values, I can use multiple imputation or even simple averaging and the hit I will take will be negligible. I simply do not work in the remaining 30%. I can tell if the work is going to be above my head and refuse the project in those cases.
So after all this back and forth, I still don't know if there is a book similar to RMS in scope, for Bayesian Nets.
Just to touch on one part of your question -- did you see the case study section? There are five conclusions:
> 1. This approach provides a systematic process of developing bespoke models tailored to our specific problem.
> 2. It provides transparency to our model as we explicitly defined our model assumptions by leveraging prior knowledge about traffic congestion.
> 3. The approach allows handling of uncertainty in a principled manner using probability theory.
> 4. It does not suffer from overfitting as the model parameters are learned using Bayesian inference and not optimization.
> 5. Finally, MBML separates the model development from inference which allows us to build several models and use the same inference algorithm to learn the model parameters. This in turn helps to quickly compare several alternative models and select the best model that is explained by the observed data.
It's just a blog article clearly titled "An Introduction to Model-Based Machine Learning". While the questions you ask are certainly useful, there was no indication the answers would be found in the article.
Cool stuff.