Humans don't improve by "thinking." They improve by natural selection against a fitness function. If that fitness function is "doing better at math," then over a long time perhaps humans will get better at math.
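To make "selection against a fitness function" concrete, here's a toy sketch (plain Python, everything invented for illustration): blind mutation plus selection on a fixed math task, with no thinking anywhere in the loop.

    import random

    # Toy illustration of selection against a fitness function: a
    # "genome" is a guess at the rule y = 2x + 1, fitness is accuracy
    # on that fixed math task, and improvement comes from mutation
    # plus selection, never from reasoning.
    def fitness(genome):
        a, b = genome
        return -sum(abs((a * x + b) - (2 * x + 1)) for x in range(10))

    population = [(random.uniform(-5, 5), random.uniform(-5, 5)) for _ in range(50)]
    for generation in range(200):
        population.sort(key=fitness, reverse=True)   # selection
        survivors = population[:25]
        children = [(a + random.gauss(0, 0.1), b + random.gauss(0, 0.1))
                    for a, b in survivors]           # blind variation
        population = survivors + children

    print(fitness(max(population, key=fitness)))     # approaches 0 over generations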

These models don't evolve like that: there is no random process of architectural variation and selection. Nor is there a fitness function anything like "get better at math."

A system like AlphaZero works because it has rules to use as an oracle: the game rules. The game rules provide the new training information needed to drive the process; each game played produces new, verifiably correct training data.
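A toy sketch of what that looks like (the game and all names are invented for illustration; Nim stands in for chess or Go):

    import random

    # Toy self-play in the AlphaZero spirit. Nim stands in for a real
    # game: players alternately remove 1-3 stones; taking the last
    # stone wins. The rules decide the winner, so every finished game
    # labels its own positions with verified outcomes.
    def self_play(stones=21):
        history, player = [], 0
        while stones > 0:
            move = random.randint(1, min(3, stones))
            history.append((stones, player, move))
            stones -= move
            player = 1 - player
        winner = 1 - player  # whoever made the final move took the last stone
        return [(state, move, 1.0 if who == winner else -1.0)
                for state, who, move in history]

    # Each game appends fresh, rule-verified training examples.
    training_data = [ex for _ in range(1000) for ex in self_play()]
    print(len(training_data), training_data[0])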

These LLMs have no such oracle. Their fitness function was, and remains: first, predict the next word; then, produce text that makes a human happy. Note that it's not "produce text that makes ChatGPT happy."
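In miniature, and as a sketch rather than any real training code, those two signals look something like this. Neither one consults a rules engine.

    import math

    def next_token_loss(predicted_probs, true_token):
        # Pretraining objective: cross-entropy on the observed next word.
        return -math.log(predicted_probs[true_token])

    def preference_reward(human_ratings):
        # Preference tuning: the target is learned from human ratings,
        # i.e. "makes a human happy", not derived from ground truth.
        return sum(human_ratings) / len(human_ratings)

    print(next_token_loss({"cat": 0.7, "dog": 0.3}, "cat"))  # ~0.357
    print(preference_reward([1, 1, 0, 1]))                   # 0.75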

It's more complicated than this. What you get out is defined by what you put in. At first it was random or hand-selected internet garbage + books + docs, i.e. data not designed for training. Then came fine-tuning. Now we can use a trained model to generate data designed for training, with specific qualities (in this case reasoning), and train the next model on it; see the sketch below. Intuitively, that next model can be smaller and better at what we trained it for. I've shown two options for how the data can be generated; there are others, of course.
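Roughly this pipeline, as a sketch with invented stand-ins (ToyTeacher and the filter are made up for illustration, not any real API): generate candidates from the trained model, filter for the quality you want, and keep the survivors as the next model's training set.

    import random

    class ToyTeacher:
        def generate(self, prompt):
            # Stand-in for sampling a reasoning trace from a trained model.
            return prompt + " -> " + random.choice(["good reasoning", "garbage"])

    def build_synthetic_dataset(teacher, prompts, keep):
        dataset = []
        for prompt in prompts:
            candidate = teacher.generate(prompt)  # model-generated data
            if keep(candidate):                   # quality filter
                dataset.append((prompt, candidate))
        return dataset

    prompts = ["problem %d" % i for i in range(100)]
    data = build_synthetic_dataset(ToyTeacher(), prompts,
                                   lambda c: "good reasoning" in c)
    print(len(data))  # the curated set that trains the next, smaller model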

As for humans: assuming they genetically have the same intellectual abilities, you can still see differences in the development of different groups. That is mostly determined by how well each generation trains the next. Schools exist exactly for this.