Just to throw in my two cents: there's also https://utteranc.es/. It takes a bit more effort to get up and running, but if you're comfortable with your comments being powered by GitHub issues, it's a great way to go!
NMF (and matrix factorization more generally) is a really underrated machine learning technique. If you have a decent intuition for linear algebra, matrix factorizations are straightforward to understand, fairly easy to interpret, and can work wonders in applications far beyond recommender systems.
As the blog post mentions, NMF has potential applications to text mining, which I've tried out here on Reddit posts: https://eigenfoo.xyz/reddit-clusters/
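For anyone who'd like to see the idea in code, here's a minimal sketch of NMF on a toy document-term matrix. To be clear about assumptions: the vocabulary and counts are made up for illustration, the helper names (`matmul`, `multiplicative_step`, `nmf`, `frobenius_error`) are mine, and I'm using the classic multiplicative-update rule in plain Python rather than whatever the blog post or my Reddit project actually uses. In practice you'd probably reach for a library such as scikit-learn's `sklearn.decomposition.NMF`.

```python
import random

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def multiplicative_step(X, numer, denom, eps):
    """Element-wise X * numer / (denom + eps); entries stay non-negative."""
    return [[x * n / (d + eps) for x, n, d in zip(xr, nr, dr)]
            for xr, nr, dr in zip(X, numer, denom)]

def nmf(V, k, n_iter=200, seed=0, eps=1e-9):
    """Approximate V (docs x terms) as W (docs x k) times H (k x terms)."""
    rng = random.Random(seed)
    n_docs, n_terms = len(V), len(V[0])
    # Strictly positive random initialization.
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n_docs)]
    H = [[rng.random() + 0.1 for _ in range(n_terms)] for _ in range(k)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H)
        Wt = transpose(W)
        H = multiplicative_step(H, matmul(Wt, V), matmul(matmul(Wt, W), H), eps)
        # W <- W * (V H^T) / (W H H^T)
        Ht = transpose(H)
        W = multiplicative_step(W, matmul(V, Ht), matmul(W, matmul(H, Ht)), eps)
    return W, H

def frobenius_error(V, W, H):
    """Frobenius norm of the reconstruction residual V - WH."""
    WH = matmul(W, H)
    return sum((v - a) ** 2 for vr, ar in zip(V, WH) for v, a in zip(vr, ar)) ** 0.5

# Hypothetical toy data: rows are documents, columns are term counts.
vocab = ["cat", "dog", "pet", "stock", "market", "trade"]
V = [
    [2, 1, 1, 0, 0, 0],
    [1, 2, 1, 0, 0, 0],
    [0, 0, 0, 2, 1, 1],
    [0, 0, 0, 1, 1, 2],
]

W, H = nmf(V, k=2)
for topic, weights in enumerate(H):
    # Terms with the largest weight in each row of H describe that topic.
    top = sorted(zip(weights, vocab), reverse=True)[:3]
    print(f"topic {topic}:", [term for _, term in top])
print("reconstruction error:", round(frobenius_error(V, W, H), 3))
```

The interpretability comes from the non-negativity: each row of `H` is a topic expressed as term weights, and each row of `W` says how much of each topic a document contains.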
Yes! I haven't read all of it yet, but from what I've seen so far, the book spends a lot of time rigorously proving mathematical properties/bounds of various bandit algorithms. I love rigor as much as the next guy, but I also like seeing code :)
If you're looking for just mathematical statistics, I liked Hogg, McKean, and Craig's "Introduction to Mathematical Statistics" (the 4th edition was much better than the ones after it, imo).
But if you're looking to learn prob/stats for applications to ML, most ML textbooks have a chapter or two reviewing the relevant stuff. I liked the first two chapters of Bishop's PRML for that.