The post is long and complicated and I haven't read most of it, so whether it's actually any good I shan't try to decide. But the above seems like a very weird argument.
Sure, the code is doing what it's doing. But trying to understand it at that level of abstraction seems ... not at all promising.
Consider a question about psychology. Say: "What are people doing when they decide what to buy in a shop?".
If someone writes an article about this, drawing on some (necessarily simplified) model of human thinking and decision-making, and some experimental evidence about how people's purchasing decisions change in response to changes in price, different lighting conditions, mood, etc., ... would you say "You can just apply the laws of physics and see what the people are doing. They're not doing something more or less than that."?
I mean, it would be true. People, so far as we know, do in fact obey the laws of physics. You could, in principle, predict what someone will buy in a given situation by modelling their body and surroundings at the level of atoms or thereabouts (quantum physics is a thing, of course, but it seems likely that a basically-classical model could be good enough for this purpose). When we make decisions, we are obeying the laws of physics and not doing some other thing.
But this answer is completely useless for actually understanding what we do. If you're wondering "what would happen if the price were ten cents higher?" you've got no way to answer it other than running the whole simulation again. Maybe running thousands of versions of it since other factors could affect the results. If you're wondering "does the lighting make a difference, and what level of lighting in the shop will lead to people spending least or most?" then you've got no way to answer it other than running simulations with many different lighting conditions.
Whereas if you have a higher-level, less precise model that says things like "people mostly prefer to spend less" and "people try to predict quality on the basis of price, so sometimes they will spend more if it seems like they're getting something better that way" and "people like to feel that they're getting a bargain" and so on, you may be able to make predictions without running an impossibly detailed person-simulation zillions of times. You may be able to give general advice to someone with a spending problem who'd like to spend more wisely, or to a shopkeeper who wants to encourage their customers to spend more.
Similarly with language models and similar systems. Sure, you can find out what it does in some very specific situation by just running the code. But what if you have some broader question than that? Then simply knowing what the code does may not help you at all, because what the code does is gazillions of copies of "multiply these numbers together and add them".
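To make that concrete, here's a rough sketch in plain NumPy of roughly what one attention head computes (made-up shapes, not any particular model's actual code). It really is just multiplies and adds, and reading it tells you nothing about what any particular set of trained weights has learned to do.

```python
import numpy as np

# One attention head, stripped to its arithmetic: nothing but matrix
# multiplies, adds, and a normalization. Shapes are made up for
# illustration (4 tokens, model width 8).
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))                   # token representations
W_q, W_k, W_v = (rng.standard_normal((8, 8)) for _ in range(3))

q, k, v = x @ W_q, x @ W_k, x @ W_v               # multiply numbers together...
scores = q @ k.T / np.sqrt(8)                     # ...and again
weights = np.exp(scores)
weights /= weights.sum(axis=-1, keepdims=True)    # softmax: normalize each row
out = weights @ v                                 # ...and add them up

print(out.shape)  # (4, 8) - and none of this says *why* a trained model behaves as it does
```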
Again, I make no claim about whether the particular thing linked here offers much real insight. But it makes zero sense, so far as I can see, to dismiss it on the grounds that all you need to do is read the code.
You’re spot on; it’s like saying you can understand the game of chess by simply reading the rules. In a certain very superficial sense, yes. But the universe isn’t so simple. It’s for the same reason that even a perfect understanding of what goes on at the level of subatomic particles isn’t thought to be enough to say we ‘understand the universe’. A hell of a lot can happen between the setting out of some basic rules and the end result, which lives at a much higher level.
My entire point is that implementation isn’t sufficient for understanding. AlphaZero is the perfect example of that: you can create an amazing chess-playing machine and (potentially) learn nothing at all about how to play chess.
…so what’s your point? I’m not getting it from those two words.
Understanding how the machine plays or how you should play? They aren't the same thing. And that is the point - trying to analogize to some explicit, concrete function you can describe is backwards. These models are gigantic (even the 'small' ones); they minimize a loss function by searching a space with many thousands of dimensions. It is the very opposite of something that fits in a human brain in any explicit fashion.
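To put a number on "multi-thousand-dimensional": even a toy back-of-the-envelope network (made-up architecture here, nothing like a real model) is already a search over roughly a hundred thousand coordinates.

```python
import numpy as np

# Even a tiny toy network is already an optimization over ~100k dimensions:
# every weight and bias is one coordinate of the search space.
layer_sizes = [128, 256, 256, 10]                 # made-up toy architecture
n_params = sum(a * b + b for a, b in zip(layer_sizes, layer_sizes[1:]))
print(n_params)                                   # ~101k for this toy; real LLMs are billions

# Training is just nudging one point around in that space:
theta = np.zeros(n_params)                        # the whole model, as one vector
# for each batch: theta -= learning_rate * gradient_of_loss(theta, batch)
```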
So is what happens in an actual literal human brain.
And yet, we spend quite a lot of our time thinking about what human brains do, and sometimes it's pretty useful.
For a lot of this, we treat the actual brain as a black box and don't particularly care about how it does what it does, but knowing something about the internal workings at various levels of abstraction is useful too.
Similarly, if for whatever reason you are interested in, or spend some of your time interacting with, transformer-based language models, then you might want some intuition for what they do and how.
You'll never fit the whole thing in your brain. That's why you want simplified abstracted versions of it. Which, AIUI, is one thing that the OP is trying to do. (As I said before, I don't know how well it does it; what I'm objecting to is the idea that trying to do this is a waste of time because the only thing there is to know is that the model does what the code says it does.)
Sure, good abstractions are good. But bad abstractions are worse than none. Think of all the nonsense abstractions about the weather before people understood and could simulate the underlying process. No one in modern weather forecasting suggests there is a way to understand that process at some high level of abstraction. Understand the low level, run the calcs.
Trying to build systems top-down using principles humans can fit in their head has arguably been a dead end. But this doesn't mean that we cannot try to understand parts of current AI systems at a higher level of abstraction, right? They may not have been designed top-down with human-understandable principles, but that doesn't mean that trained, human-understandable principles couldn't have emerged organically from the training process.
Evolution optimized the human brain to do things over an unbelievably long period of time. Human brains were not designed top-down with human-understandable principles. But neuroscientists, cognitive scientists, and psychologists have arguably had success with understanding the brain partially at a higher level of abstraction than just neurons, or just saying "evolution optimized these clumps of matter for spreading genes; there's nothing more to say". What do you think is the relevant difference between the human brain and current machine learning models that makes the latter just utterly incomprehensible at any higher level of abstraction, but the former worth pursuing by means of different scientific fields?
I don't know neuroscience at all, so I don't know if that's a good analogy. I'll make a guess, though: consider a standard RAG application. That's a system which uses at least a couple of models. A person might reasonably say "the embeddings in the db are where the system stores memories. The LLM acts as the part of the brain that reasons over whatever is in working memory plus its sort-of-implicit knowledge." I'd argue that's reasonable. But systems and models are different things.
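To make that guess concrete, here's a toy sketch of the kind of RAG setup I mean (embed and generate here are crude stand-ins I'm making up for illustration, not any real library's API):

```python
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Crude stand-in for a real embedding model: hash words into a vector."""
    v = np.zeros(dim)
    for word in text.lower().split():
        v[hash(word) % dim] += 1.0
    return v / (np.linalg.norm(v) or 1.0)

def generate(prompt: str) -> str:
    """Stand-in for a real LLM call; here it just echoes its input."""
    return f"[the LLM would answer given]\n{prompt}"

# The "memories": documents stored as vectors in a toy in-memory database.
documents = ["Refunds are accepted within 30 days.",
             "Standard shipping takes 3-5 business days.",
             "The warranty covers manufacturing defects for one year."]
db = [(doc, embed(doc)) for doc in documents]

def answer(question: str) -> str:
    q = embed(question)
    # Retrieval: pull the most similar stored "memory" (cosine similarity).
    best_doc, _ = max(db, key=lambda pair: float(q @ pair[1]))
    # The LLM then "reasons over" the retrieved context plus its implicit knowledge.
    return generate(f"Context: {best_doc}\n\nQuestion: {question}")

print(answer("How long does shipping take?"))
```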
People use many abstractions in AI/ML. Just look at all the functionality you get in PyTorch as an example. But they are abstractions of pieces of a model, or pieces of the training process etc. They aren't abstractions of the function the model is trying to learn.
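For example, here's a minimal, made-up PyTorch training step. Every abstraction in it names a piece of the model or a piece of the training loop; none of it describes the function the trained weights end up computing.

```python
import torch
import torch.nn as nn

# PyTorch's abstractions name pieces of the model and of the training loop...
layer = nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True)
model = nn.TransformerEncoder(layer, num_layers=2)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

x = torch.randn(8, 16, 64)        # (batch, sequence, features): made-up data
y = torch.randn(8, 16, 64)

# ...but nothing in this code describes what the model learns to do.
optimizer.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()
optimizer.step()
```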
Right, I've used PyTorch before. I'm just trying to understand why the mechanics of self-attention layers should be the highest level of abstraction at which the question "how does a transformer work?" can be meaningfully answered, with any higher level of abstraction being nonsense. More specifically, why we should have a ban on any higher level of abstraction in this scenario when we can answer the question "how does the human mind work?" at not just the atom level, but also the neuroscientific level or psychological level. Presumably you could say the same thing about that question: the human mind is a bunch of atoms obeying the laws of physics. That's what it's doing. It's not something else.
I understand you're emphasizing the point that the connectionist paradigm has had a lot more empirical success than the computationalist paradigm: letting AI systems learn organically, bottom-up, is more effective than trying to impose human-mind-like principles top-down when we design them. But I don't understand why this means understanding bottom-up systems at higher levels of abstraction is necessarily impossible, when we have a clear example of a bottom-up system that we've had some success in understanding at a high level of abstraction, viz. the human mind.
It would be great if the abstractions were good, but they seem to be bad; it seems that they must be bad, given the dimensionality of the space, and humans latch onto simple explanations even when they are bad.
Think about MoE models. Each expert learns to be good at completing certain types of inputs. It sounds like a great explanation for how they work. Except it doesn't seem to actually work that way: the Mixtral paper showed that which experts got activated seemed to follow basically no pattern. Maybe if they trained it differently it would? Who knows. "Expert" certainly isn't a good name regardless.
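The actual mechanism is just gating arithmetic, roughly like this toy top-2 sketch (made-up sizes, not Mixtral's actual code); the "each expert specializes in a topic" story is an interpretation laid on top of it.

```python
import numpy as np

# Toy mixture-of-experts routing with top-2 gating (made-up sizes; real
# models do this per token, per layer).
rng = np.random.default_rng(0)
d, n_experts, top_k = 16, 8, 2
x = rng.standard_normal(d)                          # one token's hidden state
W_gate = rng.standard_normal((n_experts, d))        # the router
experts = [rng.standard_normal((d, d)) for _ in range(n_experts)]

logits = W_gate @ x                                 # router score per expert
chosen = np.argsort(logits)[-top_k:]                # keep the top-2 experts
gate = np.exp(logits[chosen])
gate /= gate.sum()                                  # softmax over the chosen two

# Output: weighted sum of only the chosen experts' computations. Nothing
# here forces "expert 3 = code, expert 5 = French" or any other
# human-friendly specialization.
out = sum(g * (experts[i] @ x) for g, i in zip(gate, chosen))
print(out.shape)                                    # (16,)
```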
Many fields/things can be understood at higher and higher levels of abstraction. Computer science is full of good high level abstractions. Humans love it. It doesn't work everywhere.
Right, of course we should validate explanations based on empirical data. We rejected the idea that there was a particular neuron that activated only when you saw your grandmother (the "grandmother neuron") after experimentation. But just because past explanations have been bad doesn't mean that all future explanations must also be bad. Shouldn't we evaluate explanations on a case-by-case basis instead of dismissing them as impossible? Aren't we better off having evaluated the intuitive explanation for mixtures of experts instead of dismissing it a priori? There's a whole field - mechanistic interpretability - where researchers are working on this kind of thing. Do you think that they simply haven't realized that the models they're working on interpreting are operating in a high-dimensional space?
Mechanistic interpretability studies a bunch of things, though. The Mixtral paper where they show the routing activations is mechanistic interpretability, for instance, and that sort of feature-visualization stuff is good. But I don't know what % of the field is spending their time on the kind of work that tries to interpret models via higher-level, human-explainable approximations of what the code is doing. I'm certainly not the only one who thinks it's a waste of time; I don't believe anything I've said in this thread is original in any way.
I... don't know if the people involved in that specific stuff have really grokked that they are working in a high-dimensional space? A lot of otherwise smart people work in macroeconomics, where for decades they haven't really made any progress because it's so complex. It seems stupid to suggest a whole field of smart people don't realize what they are up against, but sheesh, it kinda seems that way, doesn't it? Maybe I'll be eating my words in 10 years.
They certainly understand they're working in a high dimensional space. No question. What they deny is that this necessarily means the goal of interpretability is a futile one.
But the main thrust of what I'm saying is that we shouldn't be dismissing explanations a priori - answers to "how does a transformer work?" that go beyond descriptions of self-attention aren't necessarily nonsensical. You can think it's a waste of time (...frankly, I kind of think it's a waste of time too...), but just like any other field, it's not really fair to close our eyes and ears and dismiss proposals out of hand. I suppose
> Maybe I'll be eating my words in 10 years.
indicates you understand this though.