
ML is fundamentally pattern finding and matching.

The fact that it is useful for finding patterns that may be different from those humans tend to find is not an indication that it understands the underlying data.

It is no different from clustering in traditional statistics. While the patterns found are sometimes incredibly useful, clustering knows nothing outside of the provided dataset.
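
To make the analogy concrete, here's a minimal sketch (assuming scikit-learn and a made-up two-blob dataset): k-means will dutifully label every point, but the labels mean nothing beyond the data it was handed.

    # Minimal clustering sketch: k-means "finds patterns" in whatever it is given,
    # but has no notion of anything outside the provided dataset.
    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)
    # Two made-up blobs of points -- the "entire world" as far as k-means knows.
    data = np.vstack([
        rng.normal(loc=[0, 0], scale=0.5, size=(100, 2)),
        rng.normal(loc=[5, 5], scale=0.5, size=(100, 2)),
    ])

    km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(data)
    print(km.labels_[:10])        # cluster assignments for the first few points
    print(km.cluster_centers_)    # centroids: patterns found, nothing "understood"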

As others have mentioned, Google's search results are actually really bad at surfacing novel results these days, due to many factors like battling SEO tricks etc...

But while the results of LLMs are impressive, there is no mechanism for them to have an 'internal model of the world' in their current form.

It may help to remember that current LLMs would require an infinite amount of RAM to even be computationally complete right now.



> The fact that it is useful for finding patterns that may be different from those humans tend to find is not an indication that it understands the underlying data.

Without invoking your own self-awareness as an argument, how do you know that other people "understand" stuff, and aren't merely "finding patterns"? In other words, in what way do you define "understanding", such that you can be sure that LLMs have no such thing?

> there is no mechanism for them to have an 'internal model of the world' in their current form.

How do you know that? We don't even know why humans have an internal model of the world. What if internal modelling of the world is just sufficiently-complex pattern-matching?


If clustering has worked on what amounts to basically the entire world of information, though, things get a bit fuzzy. I don't suppose you are technically incorrect; it's just that these words lose practical meaning when we talk about models that encode tens or even hundreds of billions of parameters.

Predicting the "next token" requires an "internal model of the world". It might not be how we do it, but without something that acts like one, I'd be very interested to hear how you think it comes up with its predictions.

Let's say it needs to continue a short story about a detective. The detective says at the end: "[...] I have seen every clue and thought of every scenario. I will tell you who the killer is:". Good luck continuing that with any sort of accuracy if you don't have some abstract map of how "people" act. You can imagine plenty of examples like this that require something that acts as a model of the "world".
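
To put it concretely, here's a rough sketch of what "predict the next token" looks like mechanically (assuming the Hugging Face transformers library, with GPT-2 as a small stand-in model); whether the distribution it produces reflects a "world model" is exactly the question being argued.

    # Rough sketch of next-token prediction (GPT-2 is just a small stand-in here;
    # the argument is about what a model must encode to do this well).
    import torch
    from transformers import GPT2LMHeadModel, GPT2TokenizerFast

    tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")
    model.eval()

    prompt = ("I have seen every clue and thought of every scenario. "
              "I will tell you who the killer is:")
    inputs = tokenizer(prompt, return_tensors="pt")

    with torch.no_grad():
        logits = model(**inputs).logits      # shape: (1, sequence_length, vocab_size)

    # Probability distribution over the very next token, given everything so far.
    next_token_probs = torch.softmax(logits[0, -1], dim=-1)
    top = torch.topk(next_token_probs, k=5)
    for prob, idx in zip(top.values, top.indices):
        print(repr(tokenizer.decode(idx.item())), round(prob.item(), 3))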

There's a definite structure and pattern to everything we do. This context, hidden from an LLM, gives rise to the words we write. To re-invent those words, as it has to, the model must basically conjure up all that hidden state. I'm not saying it gets it right; I'm just saying that there is no other way than to model the world behind the text to even get into ballpark-right territory.


> It may help to remember that current LLMs would require an infinite amount of RAM to even be computationally complete right now.

Anything that is computationally complete needs an infinite amount of RAM. This is not unique to LLMs or even to machine learning.
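
As a toy illustration of that point (nothing LLM-specific, just the textbook observation): the formal machines that "computationally complete" refers to assume an unbounded tape or memory, and any physical implementation can only approximate that.

    # Toy illustration of the "unbounded memory" point: the formal tape is infinite,
    # while a dict-backed stand-in for it can only grow as far as real RAM allows.
    from collections import defaultdict

    tape = defaultdict(int)   # conceptually infinite tape; unwritten cells read as 0
    head = 0

    for _ in range(10):       # a real machine must cap this; the formal model does not
        tape[head] = 1        # write a symbol
        head += 1             # move the head right

    print(dict(tape))         # {0: 1, 1: 1, ..., 9: 1}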




