Like I already said, the model can remember stuff as long as it’s in the context. LLMs can obviously remember stuff they were told or output themselves, even a few messages later.

AGI needs to genuinely learn and build new knowledge from experience, not just generate creative outputs based on what it has already seen.

LLMs might look “creative”, but they are just remixing patterns from their training data and whatever is in the prompt. They can't actually update themselves or remember new things after training, as there is no ongoing feedback loop.

This is why you can’t send an LLM to medical school and expect it to truly “graduate”. It cannot acquire or integrate new knowledge from real-world experience the way a human can.

Without a learning feedback loop, these models cannot interact meaningfully with a changing reality or fulfill the core expectation of an AGI: contributing to new science and technology.


I agree that this is kind of true with a plain chat interface, but I don’t think it’s an inherent limit of an LLM. OpenAI actually has a memory feature where the LLM can specify data it wants to save and can then access later. I don’t see why this, in principle, wouldn’t be enough for the LLM to learn new things as time goes on. All the possible counterarguments seem related to scale (of memory and context size), not to the principle itself.
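To make that concrete, here's a rough sketch of how tool-based memory can work in principle. This is only an illustration; the function names (save_memory, search_memory) are made up for the example and are not OpenAI's actual API:

  # Hypothetical memory tools exposed to the model (names invented for illustration)
  memories = []  # a real system would use a persistent store (DB / vector index)

  def save_memory(text):
      """Called by the model when it decides something is worth keeping."""
      memories.append(text)

  def search_memory(query):
      """Naive keyword lookup; real systems typically use embeddings."""
      return [m for m in memories if query.lower() in m.lower()]

  # On later turns, relevant hits are injected back into the context window,
  # so the model can "remember" across conversations without retraining.
  save_memory("User is preparing for board exams and prefers short answers.")
  print(search_memory("board exams"))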

Basically, I wouldn’t say that an LLM can never become AGI due to its architecture. I'm also not saying that LLMs will become AGI (I have no clue), but I don’t think the architecture itself makes it impossible.


LLMs lack mechanisms for persistent memory, causal world modeling, and self-referential planning, all core requirements for AGI. Their transformer architecture is static, which fundamentally constrains dynamic reasoning and adaptive learning.

So yeah, AGI is impossible with today's LLMs. But at least we got to watch Sam Altman and Mira Murati drop their voices an octave onstage and announce “a new dawn of intelligence” every quarter. Remember Sam Altman's $7 trillion?

Now that the AGI party is over, it's time to sell those NVDA shares and prepare for the crash. What a ride it was. I'm grabbing the popcorn.


  > the model can remember stuff as long as it’s in the context.
You would need an infinite context, or compression.

Also, you might be interested in this theorem:

https://en.wikipedia.org/wiki/Data_processing_inequality
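Roughly stated: if X -> Y -> Z forms a Markov chain (Z is computed only from Y), then

  I(X; Z) <= I(X; Y)

i.e. post-processing (summarizing or compressing the context) can never increase the information the system retains about the original data.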


> You would need an infinite context or compression

Only if AGI required infinite knowledge, which it doesn't.


You're right, but compounding effects get out of hand pretty quickly. There's a certain point where finite is not meaningfully different from infinite, and that threshold is a lot lower than you're accounting for. There's only so much compression you can do, so even if the new information isn't that large, it adds up to something huge in no time. Compounding functions are a whole lot of fun... try something super small, like only 10GB of new information a day, and see how quickly that grows. You're in the TB range before you're halfway into the year...
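As a back-of-the-envelope check (plain accumulation only, before any compression or cross-referencing):

  # Naive accumulation of 10 GB of new information per day, with no forgetting
  daily_gb = 10
  for days in (30, 100, 182, 365):
      print(days, "days:", daily_gb * days / 1000, "TB")
  # 30 days: 0.3 TB, 100 days: 1.0 TB, 182 days: 1.82 TB, 365 days: 3.65 TB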

This seems kind of irrelevant? Humans have General Intelligence while having a context window of, what, 5MB, to be generous. Model weights only need to contain the capacity for abstract reasoning and querying relevant information. That they currently hold real-world information at all is kind of an artifact of how models are trained.

  > Humans have General Intelligence while having a context window
Yes, but humans also have more than a context window. They also have more than memory (weights); there are a lot of things humans have besides memory. For example, the human brain is not a static architecture: new neurons as well as new pathways (including between existing neurons) are formed and destroyed all the time, and this doesn't stop; it continues throughout life.

I think your argument makes sense, but it oversimplifies the human brain. Once we start considering that complexity, the argument no longer holds up. This is also why a lot of AGI research is focused on things like "test-time learning" or "active learning", not to mention many other areas, including dynamic architectures.



