Hardly “nothing else”. Two smoothies a day with 150g of oats blended in them will basically cover this. You’d still have plenty of room for other food.
But that's not what the study tested. The study showed that both calorie restriction, and calorie restriction combined with almost all calories from oats, reduced cholesterol; but that the effect was more durable for the latter case. No data was gathered on eating oats without calorie restriction in this study.
Level Six: knowledge of how to build products deteriorates as more high-level thinking is outsourced to AI. AIs are simply asked to put out several versions and possibilities of products, and testers go through them harvesting the candidates that are the most usable and have the fewest bugs, good enough for production. It could take a long time or it could happen very quickly.
Level Seven: no one even knows what software is anymore, they just pray to AI to solve their problems and hope for the best. Some priests occasionally do random stuff that seems to affect outcomes, but no one knows for sure.
Level Eight: so few people do any paid labor anymore, and society has failed to figure out any sort of distributive income system such as UBI, so growing chronic and endemic poverty is slowly eating away at revenue generation from AI-designed and AI-coded products and services.
Small gimmicky computers seem to attract so much attention, and so many people who can’t help but buy one, play with it for a while, then toss it into a drawer and never use it again.
You are right though, I've loved tinkering, especially with some of the cool Linux-based handhelds, but I always come back to mobile/tablet because my limiting resource is time and Android/iOS kinda just works.
A powerful-enough pocket computer that can run a "real" OS with good input is the holy grail. Specialized types (gaming platforms mostly) seem to be converging on a few specific designs, but full general-purpose computers with keyboards etc still haven't really produced a "good enough" model. I used my GPD Win 2 daily when I was traveling often and frequently found myself in situations where it wasn't convenient to carry or use a full laptop due to weight/size, and even that thumb keyboard is 10000x better than a touchscreen keyboard in termux etc. There's definitely a niche to be served by either a better design or reimagining of the interface.
It implies that the agents could only do this because they could regurgitate previous browsers from their training data.
Anyone who's watched a coding agent work will see why that's unlikely to be what's happening. If that's all they were doing, why did it take three days and thousands of changes and tool calls to get to a working result?
I also know that AI labs treat regurgitation of training data as a bug and invest a lot of effort into making it unlikely to happen.
I recommend avoiding the temptation to look at things like this and say "yeah, that's not impressive, it saw that in the training data already". It's not a useful mental model to hold.
But yes, with enough prodding they will eventually build you something that's been built before. Don't see why that's particularly impressive. It's in the training data.
Except if you spend quality time with coding agents you realize that's not actually true.
They're equally useful for novel tasks because they don't work by copying large scale patterns from their training data - the recent models can break down virtually any programming task to a bunch of functions and components and cobble together working code.
If you can clearly define the task, they can work towards a solution with you.
The main benefit of concepts already in the training data is that it lets you slack off on clearly defining the task. At that point it's not the model "cheating", it's you.
Good long-lived software is not a bunch of functions and components cobbled together.
You need to see the big picture and have a vision of the future state in order to ensure what is being built will be able to grow and breathe into that. This requires an engineer. An agent doesn’t think much about the future; it thinks about right now.
This browser toy built by the agent, it has NO future. Once it has written the code, the story is over.
> Except if you spend quality time with coding agents you realize that's not actually true.
Agent engineering seems to be (from the outside!) converging on quality lived experience. Compared to Stone Age manual coding it’s less about technical arguments and more about intuition.
Vibes in short.
You can’t explain sex to someone who has not had sex.
Any interaction with tools is partly about intuition. It’s a difference of degree.
Actually good software that is suitable for mass adoption would go a long way to convincing a lot of people. This is just, yet another, proof-of-concept. Something which LLMs obviously can do, and which never seems to translate to real-world software people use. Parsing and rendering text is really not the hard part of building a browser, and there's no telling how closely the code mirrors existing open-source implementations if you aren't versed on the subject.
That said, I think some credit is due. This is still a nice weekend project as far as LLMs go, and I respect that you had a specific goal in mind (showing a better approach than Cursor's nonsense, that gets better results in less time with less cost) and achieved it quickly and decisively. It has not really changed my priors on LLMs in any way, though. If anything it just confirms them, particularly that the "agent swarm" stuff is a complete non-starter and demonstrates how ridiculous that avenue of hype is.
> Actually good software that is suitable for mass adoption would go a long way to convincing a lot of people.
Yeah, that's obviously a lot harder, but doable. I've built it for clients, since they pay me, but haven't launched or made public something of my own where I could share the code. I guess that might be a useful next project now.
> This is just, yet another, proof-of-concept.
It's not even a PoC, it's a demonstration of how far off the mark Cursor are with their "experiment" where they were amazed by what "hundreds of agents" built over week(s).
> there's no telling how closely the code mirrors existing open-source implementations if you aren't versed on the subject