I feel like the author, and have for about a year now. And I know lots of people feel this way; we keep telling each other to "learn math, man, coding is something LLMs will be doing in 10 years".
These ML/AI papers are indeed full of cryptic writing, but in essence they are not so difficult compared to, say, understanding what an FSM is and how to produce a minimal automaton from a regex (hey ML folks, I'd like to see you do that on paper!). The Greek-letter-infused notation is what scares most people, and that notation has its origins in the pre-computing age. Perhaps people need a new notation, or journals should require the math folks to also provide pseudo-code for dummies.
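Since I brought up minimizing an automaton: the classic partition-refinement idea fits in a short sketch. Everything below (the toy DFA, the names) is my own illustration of the technique, not a production implementation:

```python
# Toy DFA over {'a', 'b'}: states 0..3, accepting state 3.
# States 1 and 2 behave identically, so minimization should merge them.
delta = {
    (0, 'a'): 1, (0, 'b'): 2,
    (1, 'a'): 3, (1, 'b'): 0,
    (2, 'a'): 3, (2, 'b'): 0,
    (3, 'a'): 3, (3, 'b'): 3,
}
states = {0, 1, 2, 3}
accepting = {3}
alphabet = {'a', 'b'}

def minimize(states, alphabet, delta, accepting):
    # Start with the accepting / non-accepting split,
    # then refine blocks until nothing splits further.
    partition = [accepting, states - accepting]
    changed = True
    while changed:
        changed = False
        new_partition = []
        for block in partition:
            # Group states by which block each input symbol sends them to.
            groups = {}
            for s in block:
                key = tuple(
                    next(i for i, b in enumerate(partition) if delta[(s, c)] in b)
                    for c in sorted(alphabet)
                )
                groups.setdefault(key, set()).add(s)
            new_partition.extend(groups.values())
            if len(groups) > 1:
                changed = True
        partition = new_partition
    return partition

print(minimize(states, alphabet, delta, accepting))
# → blocks {0}, {3}, and {1, 2} merged into one
```

Doing the same refinement by hand on paper is exactly the exam exercise I was thinking of.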
I also note that a lot of things in ML really seem to be about algorithms, and very much about how things are engineered when implemented. I love graphs, discrete math, etc., and was surprised to re-learn that things like Markov chains come very naturally to me. The linear algebra needed for ML is not that much either: matmul, diagonal matrices, eigenvalues, matrix inverses, the Hessian... wait, that's a lot already. But it's not so difficult; it's basically a lot of definitions. And some of these were not in the curricula back in the day. My mother has no recollection of learning about the median and mode in statistics, even though she (together with my father) attended a technical university in the 70s.
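To the "it's basically a lot of definitions" point: median, mode, even 2x2 eigenvalues are each a few lines once you know the definition (a toy sketch; the helper name is mine, and it assumes real eigenvalues):

```python
import math
import statistics

# Median and mode are one-liners in the stdlib these days.
data = [1, 2, 2, 3, 7]
print(statistics.median(data))  # → 2
print(statistics.mode(data))    # → 2

# Eigenvalues of a 2x2 matrix [[a, b], [c, d]], straight from the
# characteristic polynomial: λ² - (a+d)λ + (ad - bc) = 0.
def eig2(a, b, c, d):
    tr, det = a + d, a * d - b * c
    disc = math.sqrt(tr * tr - 4 * det)  # assumes a real spectrum
    return (tr + disc) / 2, (tr - disc) / 2

# For a diagonal matrix the eigenvalues are just the diagonal entries:
print(eig2(2, 0, 0, 3))  # → (3.0, 2.0)
```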
You know, I'm starting to realize that even our university professors back in the day did not fully grasp what all these things were about: I remember them reading from textbooks and being very punctilious about the material, which is not what someone who truly groks a domain does. And only a few of them gave actual examples of why all the math nonsense could come in handy. Well, perhaps they are to blame that we have to re-learn, or perhaps it is the natural thing to happen in this brave new world.
There are many attempts at attention-less architectures, Markov chains being one of them. Nothing competes with transformers yet though, AFAIK. Some people experiment with graph-based rule systems, like very cleverly compressed grammars (otherwise the task is NP-hard, I think).
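For anyone curious why Markov chains feel so natural here: a word-level text generator is just a lookup table from contexts to next words (toy sketch, all names are mine):

```python
import random
from collections import defaultdict

def train(words, order=1):
    # Map each context of `order` consecutive words
    # to the list of words observed after it.
    model = defaultdict(list)
    for i in range(len(words) - order):
        ctx = tuple(words[i:i + order])
        model[ctx].append(words[i + order])
    return model

def generate(model, start, n=10, seed=0):
    # Walk the chain: repeatedly sample a successor of the current context.
    rng = random.Random(seed)
    out = list(start)
    for _ in range(n):
        successors = model.get(tuple(out[-len(start):]))
        if not successors:  # dead end: context never seen mid-corpus
            break
        out.append(rng.choice(successors))
    return " ".join(out)

corpus = "the cat sat on the mat and the cat ran".split()
model = train(corpus)
print(generate(model, ("the",), n=5))
```

No gradients, no attention, no Greek letters; the trade-off is that an order-1 chain forgets everything beyond the previous word, which is roughly why it can't compete with transformers.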