There is some ability for it to make novel connections, but it's pretty small. You can see this yourself by having it build novel systems.
It largely cannot imagine anything beyond the usual, but there is a small degree to which it can. This is similar to in-context learning: it's weak, but it is there.
It would be incredible if meta learning/continual learning found a way to train specifically for that novel learning path. But that's literally AGI, so maybe 20 years from now? Or never...
You can see this on CL benchmarks. There is SOME signal, but it's crazy low. When I was training CL models I found that the signal was in the single percentage points. Some could easily argue it was zero, but I really do believe there is a very small amount in there.
This is also why any novel work or finding is done via MASSIVE compute budgets. They find RL environments that can extract that small amount. Is it random chance? Maybe, hard to say.
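To make "signal in the single percentage points" concrete: here's a minimal sketch of the forward/backward transfer metrics that CL benchmarks typically report, as defined in the GEM paper (Lopez-Paz & Ranzato, 2017). The accuracy matrix below is made up; the point is just what a weak-but-nonzero signal looks like.

```python
# Minimal sketch (hypothetical numbers) of the forward/backward transfer metrics
# from Lopez-Paz & Ranzato (2017), "Gradient Episodic Memory for Continual Learning".
# R[i][j] = accuracy on task j after training on tasks 0..i.
import numpy as np

T = 4  # number of tasks
# Hypothetical accuracy matrix: rows = state after training task i, cols = eval task j.
R = np.array([
    [0.90, 0.52, 0.51, 0.50],
    [0.85, 0.91, 0.53, 0.52],
    [0.82, 0.87, 0.90, 0.53],
    [0.80, 0.85, 0.88, 0.92],
])
b = np.full(T, 0.50)  # accuracy of an untrained model on each task (chance level)

# Forward transfer: how much earlier tasks help a task *before* it is trained on.
fwt = np.mean([R[i - 1, i] - b[i] for i in range(1, T)])
# Backward transfer: how much later training changes performance on earlier tasks.
bwt = np.mean([R[T - 1, i] - R[i, i] for i in range(T - 1)])

print(f"forward transfer:  {fwt:+.3f}")   # ~ +0.03, i.e. single percentage points
print(f"backward transfer: {bwt:+.3f}")
```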
Is this so different from what we see in humans? Most people do not think very creatively. They apply what they know in situations they are familiar with. In unfamiliar situations they don't know what to do and often fail to come up with novel solutions. Or maybe in areas where they are very experienced they will come up with something incrementally better than before. But occasionally a very exceptional person makes a profound connection or leap to a new understanding.
The difference is that AI tooling lies to you. On day 0 you think it's perfect, but the more you use AI tools, the more you realize that using them wrong can give you gnarly bugs.
It took me a couple of days to find the right level of detail to prompt it at. Too high level, and the codebase gets away from me/the tooling goes off the rails. Too low level, and I may as well do it myself. I also had to learn the sorts of things Claude Code isn't good at yet. But once I got in the groove it was very easy from there. I think the whole process took 2-3 days.
Claude Code feels like the first commodity agent. In theory it's simple, but in practice you'll have to maintain a ton of random crap you get no value from maintaining.
My guess is that eventually all "agents" will be wiped out by Claude Code or something equivalent.
Maybe it's not that the companies will die, but that all those startups will just hook up a generic agent wrapper and let it do its thing directly. My bet is that the company that wins this is the one with the most training data to tune its agent to use its harness correctly.
No, GPT 5.x is very unlike GPT 4.5. GPT 5.x is much more censored and prone to second-guessing what you "really meant".
When it comes to conversation, Gemini 3 Pro right now is the closest.
When I asked it to make a nightmare Sauron would show me in a Palantir, ChatGPT 5.2 Thinking tried to make it "playful" (directly against my instructions) and went with a shallow but safe option. Gemini 3 Pro prepared something much deeper and more profound.
I don't know nearly as much about talking with Opus 4.5 - while I use it for coding daily, I don't use it as a go-to chat. As a side note, Opus 3 has a similar vibe to GPT 4.5.
That grumpy guy is using an LLM and debugging with it. He solves the problem. The AI provider fine-tunes their model on this. You now have his input baked into its responses.
How do you think these things work? It's either direct human input it's remembering, or an RL environment made by a human to solve the problem you are working on.
Nothing in it is "made up"; it's just a resolution problem, which will only get better over time.
I think the bigger point we should realize is that LLMs offer the EXACT same thing in a better way. Many people are still sharing answers to problems, but they do it through an AI, which then fine-tunes on them, and now that solution is shared with EVERYONE.
I mean, this kinda implies there's a chance it could fail, but that failure is basically no worse than doing nothing?
Most of those examples were failed or problematic countries before and after US intervention. If there's a chance of success, that's better than doing nothing, no?
Not sure how to interpret that as almost imminent.