
> You don't need to be more good in math than in high school.

I'm very tired of this... it needs to stop as it literally hinders ML progress

1) I know one (ONE) person who took multivariate calculus in high school. They did so by going to the local community college. I know zero people who took linear algebra. I just checked the listing of my old high school. Over a decade later neither multivariate calculus nor linear algebra is offered.

2) There's something I like to tell my students

  You don't need math to train a good model, but you do need to know math to know why your model is wrong.
I'm sure many here recognize the reference[0], but being able to make a model that performs successfully on a test set[1] is not always meaningful. For example, about a year ago I was working at a very big tech firm and improved their model's performance on customer data by over 200% with a model that performed worse on their "test set". No additional data was used, nor did I make any changes to the architecture. Figure that out without math. (Note: I was able to predict the poor generalization performance PRIOR to my changes, and accurately predicted my model's significantly higher generalization performance.)

3) Math isn't just writing calculations down. That's part of it -- a big part -- but the concepts are critical. And to truly understand those concepts, you at some point need to do these calculations. Because at the end of the day, math is a language[2].

4) Just because the simplified view is not mathematically intensive does not mean math isn't important nor does it mean there isn't extremely complex mathematics under the hood. You're only explaining the mathematics in a simple way that is only about the updating process. There's a lot more to ML. And this should obviously be true since we consider them "black boxes"[3]. A lack of interpretability is not due to an immutable law, but due to our lack of understanding of a highly complex system. Yes, maybe each action in that system is simple, but if that meant the system as a whole was simple then I welcome you to develop a TOE for physics. Emergence is useful but also a pain in the ass[4].

[0] https://en.wikipedia.org/wiki/All_models_are_wrong

[1] For one, this is more accurately called a validation set. Test sets are held out. No more tuning. You're done. This is self-referential to my point.

[2] If you want to fight me on this, at least demonstrate to me that you have taken an abstract algebra course and understand ideals and rings. Even better if you know axioms and set theory. I accept other positions, but too many argue from the basis of physics without understanding the difference between a model of physics and physics itself. Just because math is the language of physics does not mean math (or even physics) is inherently an objective principle (physics is a model).

[3] I hate this term. They are not black, but they are opaque. Which is to say that there is _some_ transparency.

[4] I am using the term "emergence" in the way a physicist would, not what you've seen in an ML paper. Why? Well, read point 4 again, starting at footnote [3].




I know many people who did take multivariate calculus, group/ring theory, and thermodynamics in high school, and think this should be the norm. I believe I consider "high school" math what most people consider "undergraduate", and everything up to linear algebra goes under "middle school" in my mental model (ages 12-14). So, I'm probably one of those people propagating, "ML math is easy, you only need a high school knowledge!" but I acknowledge that's still more than most people ever learn.


> I acknowledge that's still more than most people ever learn.

So you know you're wrong as a matter of plain fact, but you're going to continue to spout your "mental model" as truth anyway?

What are you trying to say here? It doesn't matter much what "should" be high school knowledge unless you're designing curriculum. If no one actually learns it in high school then a phrase like "you only need high school knowledge" means nothing to most people.


  > "you only need high school knowledge" means nothing to most people.
I think people intend to use it to tell people the barrier is low. But by trivializing the difficulties of calculus (may be easy now, but was it before you learned it?), you place that barrier higher than it was before. The result is the opposite of the intent.

I'll even state it now, as someone who highly advocates for learning math:

  You don't even need calculus to build good models. At most, a rudimentary understanding of algebra, but I'm not sure even that. A little programming skill, which can freely and easily be obtained, is all that is necessary to begin. So if you can read and can motivate yourself, you can build good and useful models. It might just take longer if you don't have these yet.
With that said, be careful not to fall victim to your success. The barrier to entry may be (very) low, but it is a long way to the top. So don't ignore the fundamentals; use your excitement and success to motivate yourself through the boring and hard parts. Unfortunately, there's a steep curve before you reap the rewards of your math knowledge (in ML; you'll reap rewards in daily life much sooner!). But it is well worth it. ML is only a magical black box because you have not achieved this yet (this does not mean ML becomes a white box). Not knowing what you don't know makes it hard to progress. But I promise math will help illuminate things (e.g. understanding when and where CNNs vs transformers should be used inside architectures; how many parameters you need in hidden layers; how to make your models robust; why they fail; how to identify where they will fail before it happens; and much more). These are enormously helpful, and more so if you wish to build products and not just research papers or blogs. If models are black boxes despite being compositions of easily and well understood functions, I think you can probably guess how small and subtle changes can have large effects on performance. (You'll at least learn a bit about this concept, chaos, in differential equations.)
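To make the "a little programming skill is all that is necessary" claim concrete, here's a toy sketch in plain Python (all data and names are made up for illustration): a 1-nearest-neighbor classifier is a genuinely useful model built from nothing more than subtraction and squaring, no calculus anywhere.

```python
# A minimal model with no calculus: 1-nearest-neighbor classification.
# Only basic algebra (squared distances) and a little programming.

def distance_sq(a, b):
    # squared Euclidean distance between two feature vectors
    return sum((x - y) ** 2 for x, y in zip(a, b))

def predict(train_points, train_labels, query):
    # the label of the closest training point wins
    best = min(range(len(train_points)),
               key=lambda i: distance_sq(train_points[i], query))
    return train_labels[best]

# toy data: two clusters
X = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (0.9, 1.1)]
y = ["a", "a", "b", "b"]

print(predict(X, y, (0.05, 0.1)))  # -> a
print(predict(X, y, (0.95, 1.0)))  # -> b
```

Of course, this is exactly the kind of model where the math (curse of dimensionality, metric choice) later tells you why it fails.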


As I said,

> I know many people who did take...

In fact, the vast majority of my friends did, so my mental model is more useful to me than one that apportions a larger cut to the rest of the population. I also find it egregious that thirteen years of schooling doesn't get everyone to this level, so I want to hold the education system accountable by not loosening my standard.

> If [almost] no one actually learns it in high school then a phrase like "you only need high school knowledge" means nothing to most people.

I agree that this isn't as good at conveying information (unless the consensus changes), but that's not all I'm trying to do.


  > so my mental model 
This is the point though. If you know your mental model is wrong, you should update your model rather than perpetuate the errors. It's okay to be wrong and no one is upset at you for being wrong (at least not me). But if you are knowingly wrong, don't try to justify it; use the signal to help you change your model. I know it isn't easy, but recognize that defending your bad model makes this harder. It is okay to admit fault, and you'll often be surprised how this can turn a conversation around. (FWIW, I think a lot of people struggle with this, including me. This comment is even me trying to reinforce this behavior in myself. But I think you will also be receptive because I think your intent and words diverged; I hope I can be part of that feedback signal that so many provided to me.)

  > so I want to hold the education system accountable
So hold them accountable, not the people in them. I think you intend to blame the system, but if you read your message carefully, you'll see a very reasonable interpretation is that you're blaming the person. This is because you're suggesting this is a level of math that everyone should know.

For a frame of reference, the high school I went to is currently in the top 20% of CA and the top 10% of the country. I checked their listings, and while there's a 50% participation rate in AP (they also have IB), they do not offer Linear Algebra or anything past Calc I. So I think this should help you update your model to consider what opportunities people have. I think this is especially important because we should distinguish opportunity from potential and skill. I firmly believe metrics hinder the chance of any form of meritocracy, in part due to the fact that opportunity is so disproportionate (more so due to the fact that metrics are models, and you know what they say about all models ;).

If we want to actually make a better society and smarter population, we should not be diminutive to people for the lack of opportunities that are out of their control. Instead I think we should recognize this and make sure that we are not the ones denying opportunities. If we talk about education, (with exception at the extreme ends) I think we can recognize that the difference between a top-tier high school student and a slightly below-average one is not that huge. Post undergrad it certainly grows, but I don't think it is that large either. So I'm just saying, a bit of compassion goes a long way. Opportunity compounds, so the earlier the better. I'm fond of the phrase "the harder I work, the luckier I get" because your hard work does contribute to your success, but it isn't the only factor[0]. We know "advanced" math, so we know nothing in real life is univariate, right? You work hard so that you may take advantage of opportunities that come your way, but the cards you are dealt are out of your control. And personally, I think we should do our best to ensure that the dominating factor determining outcomes is what someone can actually control. And more importantly, that we recognize how things compound (also [0]).

I'm not mad or angry with you. But I think you should take a second to reevaluate your model. I'm sure it has utility, but I'm sure you're not always in a setting where it is useful (like now). If you are, at least recognize how extreme your bubble is.

[0] I highly suggest watching, even if you've seen it before. https://www.youtube.com/watch?v=3LopI4YeC4I


> I think we should do our best to ensure that the dominating factor determining outcomes is what someone can actually control.

I think this is where I'm coming from as well. When I got to university, I met tons of people who were just connected to the right resources, be they textbooks, summer camps, math tutors, or college counselors. My lucky break was a smart father and learning about AoPS in 4th grade, but I still wish I knew what else was out there.

It'd be great if people didn't need to get lucky to learn this stuff. There is a whole group of people paid to set standards and make people aware of what is out there. The standards filter down from the board of education to the teachers, and the teachers don't actually have much sway in what they teach (re: r/teachers). So, my ultimate goal for imposing my definition of "high school math" on everyone else is to make it common enough that the standards reflect that, rather than a slow trend of weakening standards that has happened in the past few decades[*].

But... now that I type this all out, it seems pretty far removed, and probably does more harm than good (except in my bubble). It'd be much more effective to send a few emails or get myself elected to one of these seats.

[*]: Note, standards have obviously risen since the early 1900s, but they've actually fallen in the last twenty years.


And I think you hit on a good point. It's harder to know what we don't know and easier to think we do.

Another thing I think about a lot is Hanlon's Razor. I think people misinterpret it because the term "stupidity" is often used more harshly. Maybe because we don't want to admit that we're all stupid[0]. I also despise the phrase "good enough" because many times the details do matter. It's not that we shouldn't approximate things, but that we often use this phrase to dismiss nuance. It's an unfortunate consequence of weak feedback signals. In the same way, if you fix something before it is broken you will save a lot of time and money, but the signal isn't apparent because you don't have the counterfactual experience of running around like a chicken that's lost its head when shit hits the fan (but you probably have somewhere else).

So, we're all wrong, right? It's not like there's anything objective we can actually hold onto. But there are things that are less wrong, and that is meaningful. But the environment moves and time marches, so it's important we ensure what was once less wrong doesn't become more wrong. If our beliefs are unmoving, we'll only become more wrong with time. And that doesn't seem to be any of our goals and this fact doesn't sit well with our ego that creates that unmoving force in us.

FWIW, I'm highly in favor of younger kids learning much more difficult math. I've seen studies of how kids as young as 5 can learn core concepts of calculus (hell, Terence Tao exists). I'll settle for Middle School lol.

[0] In one framing I think our stupidity is impressive. We're bumbling chimps who can barely communicate with one another. Yet look what we've accomplished! I find this quite motivating, but it does not allow me to forget how foolish I am. Besides, if there's nowhere to move up, what would be the fun in that?


I don't see any lessons here, just rambling.


Then allow me to clarify:

  - Very few high schools in America offer these classes. Even fewer people take them. The lie you tell yourself is in not recognizing your bubble. You might think you're encouraging others, but you're doing the opposite. People who had those opportunities are likely not the ones who feel like ML is beyond their capabilities.

  - While you can be successful in ML without math, this does not mean you should discourage its pursuit (just as you shouldn't make it a gatekeeping requirement. Even Calc and LA aren't required!).

  - Math is about a way of thinking and approaching problems. These skills generalize beyond the ability to solve mathematical equations.

  - The mathematical knowledge compounds and will make your models better. This may be non-obvious, especially given your suggested background, since you've lived with this knowledge for quite some time. But if you haven't gone into things like statistical theory (more than ISLR), probability, metric theory, optimization, and so on, it is quite difficult to see how these help you, in the same way it's hard to see what's on a shelf above you. It can also be difficult to explain how these help if you lack the language. But if you want to build good products (that work in the real world and not just in a demo), you'll find this knowledge is invaluable. If you don't understand why, let this be a signal of your overconfidence. Models aren't worth shit if they don't generalize (I'm not talking about AGI, I'm talking about generalizing to customer data)[0].

[0] Being an ML researcher, I specifically have a horse in this race. The more half-assed scam products (e.g. Rabbit, Devin, etc.) that get out there, the more the public turns to believing ML is another Silicon Valley hype scam. Hype is (unfortunately) essential and allows for bootstrapping, but the game is to replace the bubble before it pops. The more you put into that bubble the more money comes, but also the more ground you have to make up, and the less time you have to do so. Success is the bubble popping without anyone noticing, not how loud it pops.


Hey godelski, thanks for actually taking the time to clarify on this; it's always great to be taken seriously. I don't really have much to add, but I would love to see your energy put into educating the people who are commenting here.

I like this page a lot: https://d2l.ai/chapter_preliminaries/calculus.html And I also learned a lot from Ben Eater (YouTube and this: https://eater.net/quaternions). Category theory was a godsend from Milewski, esp. his PDF is great! https://bartoszmilewski.com/2014/10/28/category-theory-for-p...

Maybe you can add a little and offer resources from your background?


Thanks. I am passionate haha. I do like to educate, but it is often difficult to do on these forums as I do not know people's backgrounds. If you pitch things at too high a level, I find it does more harm than good. I also find that while I know many here took things like calculus, we also need to recognize that skills degrade with time (though they can be regained more easily).

I do understand that it is difficult to find the math pathways in ML. What I do suggest is focusing on generative models[0], but the important part is to keep asking "why" and "what does this actually mean?" I think it is too easy to get stuck in knowing the answer and accepting it, without understanding that there is no "right" answer, only less wrong answers. The path gets clearer if this attitude is adopted. This is where I've found teaching highly beneficial[1], as you face the basics and you'll find many questions that are easy to dismiss. To be a good teacher you have to take dumb questions seriously to determine if they're actually dumb or "dumb"[2]. Things you probably asked when learning: "how many layers?", "how many neurons per layer?", "is the optimization space actually smooth like the 2D descent graphs we saw?", "does data lie on a manifold?", "what is a manifold?", "does data always lie on a lower dimensional manifold?", "is data always a distribution?", "is my data actually representative of my goals?", "what does this measurement mean?", "what does this measurement not tell me?", and so on. These are all extremely important questions that are almost universally ignored, but that all appear to have simple answers.

The reason I do like category theory (Bartosz also has video lectures for those) is because it helps connect the many different disciplines of math that are needed to answer some of these. Seeing the generalizations of things like surjective (epimorphic) and injective (monomorphic) functions plays a role in answering the layer and neuron questions. It was how I started understanding that field theory isn't just cool-but-impractical math.

But for ML, I think there's a hard gap, and I'm not sure of a good resource that fills it (I'm working on one myself). There's lots of basic material (blogs like "the math behind transformers" that show the equations and little more), and there's also plenty at a high level by experts in cat theory, set theory, algebraic geometry, or other fields. The former isn't very useful, and the latter can be easily found if you have the requisite knowledge but is impenetrable if you don't.

But with diffusion models and score matching being all the rage now, I highly suggest reading into Aapo Hyvärinen's[3] work. At lower levels I suggest Gelman's book and/or McElreath's. Needham has impressive illustrations and writes so well that you can only hope to emulate him; he's at a different level. I found Shao's Mathematical Statistics greatly helpful, but it is not easy to parse. Gallier and Quaintance are also worth looking into. But if you need something on the easier side, Tomczak is your friend. On the YouTube side I'll recommend some that are more easily missed: mathemaniac, EpsilonDeltaMain, jHan, ron-math, and Mutual_Information. You should find more from these too. Also search the #some{,2,3} tags; there are a lot of hidden gems. There's also a Cats4AI group btw, and many of them have now created a startup.

[0] Realistically all models are generative. The definition is ill-defined and you can even demonstrate that classifiers are EBMs. https://arxiv.org/abs/1912.03263
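A rough sketch of what that reinterpretation means (toy logits and my own variable names, not the paper's code): the same logit vector that gives p(y|x) through softmax also defines an unnormalized density over inputs, p(x) ∝ exp(logsumexp(logits)), i.e. an energy-based model.

```python
import math

def logsumexp(vals):
    # numerically stable log(sum(exp(v))) via the max trick
    m = max(vals)
    return m + math.log(sum(math.exp(v - m) for v in vals))

# hypothetical classifier logits f(x) for a single input x
logits = [2.0, -1.0, 0.5]

# the usual discriminative reading: softmax class probabilities p(y|x)
p_y_given_x = [math.exp(l - logsumexp(logits)) for l in logits]

# the EBM reading (as in the linked JEM paper): the same logits give
# an unnormalized density over x, with energy E(x) = -logsumexp(f(x))
energy_x = -logsumexp(logits)

assert abs(sum(p_y_given_x) - 1.0) < 1e-9  # valid distribution over y
```

Nothing about the classifier changed; only the interpretation of its outputs did.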

[1] I'm finishing a PhD program, so I do teach an ML course

[2] In either case you have to answer nicely. But it is easy to trick yourself into thinking something is simple when it is not.

[3] https://www.cs.helsinki.fi/u/ahyvarin/papers/


> But if you haven't gone into things like statistical theory (more than ISLR), probability, metric theory, optimization, and so on, it is quite difficult to see how these help you in the same way it's hard to see what's on a shelf above you. It can also be difficult to explain how these help if you lack the language.

That is actually correct, I am missing those steps. If you can recommend anything besides d2l.ai (which I haven't finished yet), let me know! Enjoy your summer and train those smiling muscles every once in a while.


> 1) I know one (ONE) person who took multivariate calculus in high school.

Unless you are specifically dealing with intractable Bayesian integral problems, the multivariate calculus involved in NNs is primarily differentiation, not integration. The fun problems like boundary conditions and Stokes/Green that make up the meat of multivariable calculus don't truly apply when you are dealing with differentiation only. In other words, you only need the parts of calc 2/3 that can be taught in an afternoon, not the truly difficult parts.
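For instance, the core calculus a network actually exercises is partial derivatives plus the chain rule. A toy sketch (my own one-parameter-pair loss, not any real framework) checking an analytic gradient against finite differences:

```python
# Gradient of a tiny squared-error "network": f(w1, w2) = (w1*x + w2 - t)^2.
# The analytic gradient is just the chain rule; we verify it numerically.

x, t = 3.0, 1.0  # one training example: input and target

def loss(w1, w2):
    return (w1 * x + w2 - t) ** 2

def grad(w1, w2):
    r = w1 * x + w2 - t        # residual
    return (2 * r * x, 2 * r)  # chain rule: d/dw1 and d/dw2

w1, w2 = 0.5, -0.2
g1, g2 = grad(w1, w2)

# central finite differences as a sanity check
eps = 1e-6
num_g1 = (loss(w1 + eps, w2) - loss(w1 - eps, w2)) / (2 * eps)
num_g2 = (loss(w1, w2 + eps) - loss(w1, w2 - eps)) / (2 * eps)

assert abs(g1 - num_g1) < 1e-4
assert abs(g2 - num_g2) < 1e-4
```

Backpropagation is this same check-free chain-rule computation applied layer by layer; no integrals appear anywhere.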

> I'm sure many here recognize the reference[0], but being able to make a model that performs successfully on a test set[1] is not always meaningful. (sic) ...[2] If you want to fight me on this, at least demonstrate to me you have taken an abstract algebra course and understand ideals and rings. Even better if axioms and set theory.

Doesn't matter, if it creates value, it is sufficiently correct for all intents and purposes. Pray tell me how discrete math and abstract algebra has anything to do with day to day ML research. If you want to appeal to physics sure, plenty of Ising models, energy functions, and belief propagation in ML but you have lost all credibility bringing up discrete math.

Again those correlation tests you use to fact check your model are primarily linear frequentist models. Most statistics practitioners outside of graduate research will just be plugging formulas, not doing research level proofs.
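To illustrate the "tests as linear models" point (a toy example in plain Python, not any particular library): the Pearson r between x and y is exactly the OLS slope after standardizing both variables.

```python
# Pearson correlation == OLS slope on standardized variables.

def mean(v):
    return sum(v) / len(v)

def std(v):
    # population standard deviation
    m = mean(v)
    return (sum((a - m) ** 2 for a in v) / len(v)) ** 0.5

def pearson(x, y):
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return cov / (std(x) * std(y))

def ols_slope(x, y):
    # least-squares slope of y regressed on x
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sum((a - mx) ** 2 for a in x)
    return num / den

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.2, 1.9, 3.2, 3.8, 5.1]

# standardize both variables, then regress: the slope is r
zx = [(a - mean(x)) / std(x) for a in x]
zy = [(b - mean(y)) / std(y) for b in y]
assert abs(pearson(x, y) - ols_slope(zx, zy)) < 1e-9
```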

> Just because the simplified view is not mathematically intensive does not mean math isn't important nor does it mean there isn't extremely complex mathematics under the hood. You're only explaining the mathematics in a simple way that is only about the updating process. There's a lot more to ML.

Are you sure? The traditional linear algebra (and similar) models never (or rarely) outperformed neural networks, except perhaps on efficiency, absent hardware acceleration and all other things being equal. A flapping bird wing is beautiful from a bioengineering point of view but the aerospace industry is powered by dumb (mostly) static airfoils. Just because something is elegant doesn't mean it solves problems. A scaled up CNN is about as boring a NN can get, yet it beats the pants off all those traditional computer vision algorithms that I am sure contain way more "discrete math and abstract algebra".

That being said, more knowledge is always a good thing, but I am not naive enough to believe that ML research can only be advanced by people with "mathematical maturity". It's still in a highly empirical stage where experimentation (regardless of whether it's guided by mathematical intuition) dominates. I have seen plenty of interesting ML results from folks who don't know what ELBOs and KL divergences are.


  > intractable Bayesian integral problems
With ML, most of what we are doing is modeling intractable distributions...

  > the multivariate calculus involved in NNs are primarily differentiation
Sure, but I'm not sure what your critique is here. This is confirming my point. Maybe I should have been clearer by adding that most people do not take calculus in high school at all. Where it is offered, it is among the advanced courses, and I'd be wary of being so pejorative. I know a large number of great mathematicians, computer scientists, and physicists who did not take calculus in high school. I don't think we need to discourage anyone or needlessly make them feel dumb. I'd rather encourage more people to undertake further math education, and I believe the lessons learned from calculus are highly beneficial in real-world everyday usage, without requiring explicit formula writing (as referenced in my prior post).

Which, as a side note, I've found is an important point and one of the most difficult lessons to learn to be an effective math teacher: Once you understand something, it often seems obvious, and it is easy to forget how much you struggled to get to that point. If you can remember the struggle, you will be a better teacher. I also encourage teaching, as revisiting can reveal the holes in your knowledge and often overconfidence (but the problem repeats as you teach a course for a long time). Clearly this is something that Feynman recognized and that led to his famous studying technique.

  > Doesn't matter, if it creates value
Value is too abstract and I think you should clarify. If you need a mine, digging it with a spoon creates value. But I don't understand your argument here and it appears to me that you also don't agree since you later discuss traditional (presumably GLMs?) statistics models vs ML. This argument seems to suggest that both create value but one creates _more_ value. And in this sense, yes I agree that it is important to consider what has more value. After all, isn't all of this under the broad scope of optimization? ;)

  > Pray tell me how discrete math and abstract algebra has anything to do with day to day ML research.
Since we both answered the first part, I'll address the second. First, I'm not sure I claimed abstract algebra was necessary; that comment was about whether you were going to argue with me over "math being a language". So, miscommunication. Second, there's quite a lot of research on equivalent networks, gradient analysis, interpretability, and so on that does require knowledge of fields, groups, rings, sets, and I'll even include measure theory. As with how you answered the first part, there's a fair amount of statistics.

  > Most statistics practitioners outside of graduate research will just be plugging formulas
And? I may be misinterpreting, but this argument suggests to me that you believe that this effort was fruitless. But I think you discount that the knowledge gained from this is what enables one to know which tools to use. Again, referencing the prior point in not needing to explicitly write equations. The knowledge gained is still valuable and I believe that through mathematics is the best way we have to teach these lessons in a generalizable manner. And personally I'd argue that it is common to use the wrong tools due to lack of nuanced understanding and one's natural tendency to get lazy (we all do it, including me). So even if a novice could use a flow chart for analysis, I hope we both realize how often the errors will appear. And how these types of errors will __devalue__ the task.

I think there is also an issue with how one analyzes value and reward. We're in a complicated enough society -- certainly a field -- that it is frequent for costs to be outsourced to others and to time. It is frequent to gain reward immediately or in the short term but have overall negative rewards in the medium to long term. It is unfortunate that these feedback signals degrade (noise) with time, but that is the reality of the world. I can even give day to day examples if you want (as well as calc), but this is long enough.

  > Are you sure? The traditional linear algebra (and similar) models never (or rarely) outperformed neural networks
I don't know how to address this because I'm not sure where I made this claim. Though I will say that there are plenty of problems where traditional methods do win out, where xgboost is better, and that computational costs are a factor in real world settings. But it is all about context. There's no strictly dominating method. But I just don't think I understand your argument because it feels non-sequitur.

  > A flapping bird wing...  [vs] static airfoils.
I think this example better clarifies your lack of understanding in aerospace engineering rather than your argument. I'm guessing you're making this conclusion from observation rather than from principles. There is a lot of research that goes into ornithopters, and this is not due to aesthetics. But again, context matters; there is no strictly dominating method.

I think miscommunication is happening on this point due to a difference in usage of "elegance." If we reference MW, I believe you are using it with definition 1c while I'm using it with 1d. As in, it isn't just aesthetics. There's good reason nature went down this path instead of another. It's the same reason the context matters, because all optimization problems are solved under constraints. Solution spaces are also quite large, and as we've referenced before, in these large intractable spaces, there's usually no global optima. This is often even true in highly constrained problems.

  > more knowledge is always a good thing
Glad we agree. I hope we all try to continually learn and challenge our own beliefs. I do want to ensure we recognize the parts of our positions that we agree upon and not strictly focus on the differentiation.

  > ML research can only be advanced by people with "mathematical maturity"
No such claim was ever made and I will never make such a claim. Nor will I make such a claim about any field. If you think it has, I'd suggest taking a second to cool off and reread what I wrote with this context in mind. Perhaps we'll be in much more agreement then. (specifically what I tell my students and the meaning of the referenced "all models are wrong but some models are useful".) Misinterpretation has occurred. The fault can be mine, but I'm lacking the words to adequately clarify so I hope this can do so. I'm sorry to outsource the work to you, but I did try to revise and found it lacking. I think this will likely be more efficient. I do think this is miscommunication on both sides and I hope we both can try to minimize this.


> With ML, most of what we are doing is modeling intractable distributions...

I am aware, and we specifically don't directly compute those because they are intractable, which renders familiarity with their theoretical properties mostly unnecessary for low-level ML practitioners. MCMC exists for a reason, and modern deep learning contains almost zero direct integration. There is lots of sampling but few integrals.
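A minimal sketch of the sampling-over-integration point (toy example, my own choice of integrand): estimating E[X^2] for X ~ U(0, 1) by Monte Carlo instead of evaluating the integral, whose exact value is 1/3.

```python
import random

# Monte Carlo estimate of E[X^2], X ~ Uniform(0, 1).
# The integral of x^2 over [0, 1] is exactly 1/3; we never compute it,
# we just average samples -- the pattern deep learning leans on.
random.seed(0)
n = 200_000
estimate = sum(random.random() ** 2 for _ in range(n)) / n

assert abs(estimate - 1 / 3) < 0.01
```

The same idea scales to the intractable expectations in ELBOs and score matching: replace the integral with an average over samples.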

I have seen high schoolers use and implement VAEs without understanding what the reparametrization trick is.

> Value is too abstract and I think you should clarify

The value of LLMs and similar deep learning classifiers/generators is self evident. If your research is only good for publishing papers, you should stay in academia. You are in no position to judge or gatekeep ML research.

> I think this example better clarifies your lack of understanding in aerospace engineering rather than your argument.

I am a pilot, software engineer, and a machine learning practitioner with plenty of interdisciplinary training in other scientific fields. I assure you I am more than familiar with the basics of fluid dynamics and flight principles. Granny knows how to suck eggs, no need for the lecture.

> First, I'm not sure I claimed abstract algebra was necessary, but that's a comment about if you were going to argue with me about "math being a language"

You claimed that people needed to know rings, groups, and set theory to debate you on understanding ML.

I pity your students. Those who teach have a duty to encourage value creation and the pursuit of knowledge for its own sake, not to constantly dangle a carrot in front of the student like leading a donkey. Don't gatekeep.

> I don't know how to address this because I'm not sure where I made this claim.

I am referring specifically to:

> I'm sure many here recognize the reference[0], but being able to make a model that performs successfully on a test set[1] is not always meaningful. For example, about a year ago I was working a very big tech firm and increased their model's capacity on customer data by over 200% with a model that performed worse on their "test set". No additional data was used, nor did I make any changes to the architecture. Figure that out without math. (note, I was able to predict poor generalization performance PRIOR to my changes and accurately predict my model's significantly higher generalization performance)

There are many ways to test causality. The data science/statistics ways are Spearman/Pearson correlations and t-tests, which are generally linear: https://lindeloev.github.io/tests-as-linear/

Alternatively there are ML methods like graphical models, but I don't think that's what you are referring to here. For deep learning specifically there are tricks with sampling that you can use to eyeball things, guided by intuition. Here are good references for what I mean: https://matheusfacure.github.io/python-causality-handbook/landing-page.html https://arxiv.org/abs/2305.18793

Again, most of these are empirical common sense. No need for mathematical maturity or any grasp of discrete mathematics.
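The "common tests are linear models" point from the first link can be demonstrated in a few lines: the Pearson correlation is just the OLS slope after standardizing both variables. A small sketch on synthetic data (the numbers are made up, illustrative only):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=500)
y = 2 * x + rng.normal(size=500)  # synthetic, linearly related data

# Pearson correlation, computed directly.
r = np.corrcoef(x, y)[0, 1]

# The same quantity, as the OLS slope of standardized y on standardized x.
xs = (x - x.mean()) / x.std()
ys = (y - y.mean()) / y.std()
slope = np.polyfit(xs, ys, deg=1)[0]

print(r, slope)  # the two values agree
```

Note the usual caveat: correlation tests like this establish association, not causation; the causal-inference references above are about the extra assumptions needed to go further.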

> Maybe I should have been clearer by adding a line that most people do not take calculus in high school. While it is offered there, these are the advanced courses, and I'd be wary of being so pejorative. I know a large number of great mathematicians, computer scientists, and physicists who did not take calculus in high school. I don't think we need to discourage anyone or needlessly make them feel dumb. I'd rather encourage more to undertake further math education and I believe the lessons learned from calculus are highly beneficial in real world every day usage, without requiring explicit formula writing (as referenced in my prior post).

Okay, fair, you have a point. I forgot that not all schools offer AP classes and advanced mathematics.

I believe we both share the view that education is important, but disagree on how much mathematical understanding is truly necessary to apply or advance ML. I suppose we will have to agree to disagree.


  > but disagree on how much mathematical understanding is truly necessary to apply or advance ML

We do not disagree on this point. I have been explicitly clear about this and have stated it several times. This is the last time I will do so.

We do disagree on one thing, but it isn't about math, science, or ML. If you would like to have a real conversation, I would be happy to. But it is required that you respond in good faith and more carefully read what I've written. I expect you to respect my time as much as I've respected yours.

You should be proud of your credentials and the work you've accomplished. I intimately understand the hard work it takes to achieve each of those things, and I don't want to have a pissing contest or try to diminish yours. But if you want to take your anger out on someone, I suggest going elsewhere. HN is not the place for that, and I personally will have none of it.


Good man, the world needs more people like you.



