I predict that the future of LLMs, when it comes to coding and software creation, is in "custom, individually tailored apps".
Imagine telling an AI agent what app you want, the requirements and so on, and it builds everything needed, from backend to frontend, asking for your input on how things should work, posing clarifying questions, etc.
It tests the software by compiling and running it, reading errors and failed tests, and fixing the code.
Then it deploys the software to production for you. It compiles your app to an APK file and publishes it on the Google Play Store, for example.
Sure, an LLM today may still not get everything perfect as far as its outputs go. But surely there are already systems and workflows in place that will auto-run your code, compile it, feed errors back to the LLM, plus some API to interact with cloud providers for hosting, etc.?
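(For what it's worth, the "auto-run the code and feed errors back" part needs surprisingly little machinery. Below is a rough Python sketch of such a loop, not any particular product's workflow; ask_llm is a stand-in for whichever model API you'd actually call, and the whole-file-rewrite approach is deliberately naive.)

    # Rough sketch of a compile/test -> feed errors to the model -> apply fix loop.
    # ask_llm is a placeholder for whatever LLM API you use; nothing here is tied
    # to a specific provider or tool.
    import subprocess
    from pathlib import Path

    def ask_llm(prompt: str) -> str:
        """Placeholder: send the prompt to your model and return new file contents."""
        raise NotImplementedError

    def build_and_test(project_dir: str) -> tuple[bool, str]:
        """Run the project's test suite and return (passed, combined output)."""
        result = subprocess.run(
            ["python", "-m", "pytest", "-x", "--tb=short"],
            cwd=project_dir, capture_output=True, text=True,
        )
        return result.returncode == 0, result.stdout + result.stderr

    def fix_loop(project_dir: str, target_file: str, max_rounds: int = 5) -> bool:
        """Test, feed failures back to the model, overwrite the file with its fix, repeat."""
        path = Path(project_dir) / target_file
        for _ in range(max_rounds):
            ok, output = build_and_test(project_dir)
            if ok:
                return True
            prompt = (
                f"This file has failing tests.\n--- {target_file} ---\n{path.read_text()}\n"
                f"--- test output ---\n{output}\nReturn the corrected file contents only."
            )
            path.write_text(ask_llm(prompt))  # naive: trust the model's whole-file rewrite
        return False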
I have been trying to imagine something similar, but without all the middleware/distribution layer. You need to do a thing? The LLM just does it and presents the user with the desired experience. Kind of upending the notion that we need "apps" in the first place. It's all materialized, just-in-time style.
What's it called when you describe an app with sufficient detail that a computer can carry out the processes you want? Where will the record of those clarifying questions and updates be kept? What if one developer asks the AI to surreptitiously round off pennies and put those pennies into their bank account? Where will that change be recorded, will humans be able to recognize it? What if two developers give it conflicting instructions? Who's reviewing this stream of instructions to the LLM?
"AI" driven programming has a long way to go before it is just a better code completion.
Plus coding (producing a working program that fits some requirement) is the least interesting part of software development. It adds complexity, bugs and maintenance.
> What's it called when you describe an app with sufficient detail that a computer can carry out the processes you want?
You're wrong here. The entire point is that these are not computers as we used to think of them. These things have common sense; they can analyse a problem, including all its implicit aspects, and suggest and evaluate different implementation methods, architectures, and interfaces.
So the right question is: "What's it called when you describe an app to a development team, they ask questions back, come back with designs and discuss them with you, finally present you with an MVP, and then you iterate on that?"
Bold of you to imply that GPT asks questions instead of making baseless assumptions every five words, even when you explicitly instruct it to ask questions if it doesn't know, and when it constantly hallucinates command-line arguments and library methods instead of reading the fucking manual.
It's like outsourcing your project to [country where programmers are cheap]. You can't expect quality. Deep down you're actually amazed that the project builds at all. But it doesn't take much to reveal that it's just a facade for a generous serving of spaghetti and bugs.
And refactoring the project into something that won't crumble in 6 months requires more time than just redoing the project from scratch, because the technical debt is obscenely high, because those programmers were awful, and because no one, not even them, understands the code or wants to be the one who has to reverse engineer it.
Of course, but who's talking about today's tools? They're definitely not able to act like an independent, competent development team. Yet. But if we limit ourselves to the here and now, we might be like people talking about GPT-3 five years ago: "Yes, it does spit out a few lines of code, which sometimes even compile. When it doesn't forget halfway through and start talking about unicorns."
We're talking about the tools of tomorrow, which, judging by the extremely rapid progress, I think are only a few (3-5) years away.
Anyway, I had great experiences with Claude and DeepSeek.
Most software is useful because a large number of people can interact with it or with each other over it. I'm not so certain that one-off software would be very useful for anyone beyond very simple functionality.
Marvin Minsky promised that an AI would have a PhD, back in the 1950s and 1960s... and we are no closer. Sorry. We are faster, much faster, 100,000,000 times faster, but we are no closer.
aider jams the backend on my PC; from time to time I have to kill the TCP connection, or Python itself, to stop it from running the GPU on the backend. I can't imagine paying for tokens and not knowing whether it's working or just wasting money.
aider successfully made, one-shot, a 2048 clone in architect mode: serverless, local HTML+JS+CSS. I pushed the Git repo it made to my GitHub as aider2048clone. I used a deepseek-r1-llama-70b distill; it took ~3 hours. After the first 10 minutes I didn't want to interrupt it, because who cares how long it takes if it works?
I haven't been able to get it to do anything but waste my tokens with DeepSeek itself as the backend (aider --model deepseek[/deepseek-reasoner|/deepseek-chat], I think, but I'm not certain).
I think architect mode might be worth looking at, but I'm going to try running aider.exe $(*.txt), then switch to /ask mode and see if it can be used as a zero-shot document query tool.
Because even a rudimentary, garbage implementation would be fun to have, I think.
That's going to be much slower and more expensive than writing tests, because image/video processing is slower and more expensive than running tests, and because of the lag in driving the UI (and rebuilding the whole application from scratch after every change in order to test again).
Hm, what if instead of using video of the application…
Ok, so if one can have one program snoop on all the rendering calls made by another program, maybe there could be a way of training a common representation of “an image of an application” and “the rendering calls that are made when producing a frame of the display for the application”? Hopefully in a way that would be significantly smaller than the full image data.
If so, maybe rather than feeding in the video of the application, said representation could be applied to the rendering calls the application makes each frame, and this representation would be given as input as the model interacts with the application, rather than giving it the actual graphics?
But maybe this idea wouldn’t work at all, idk.
Like, I guess the rendering calls often involve image data in their arguments, and you wouldn't want to include the same images many times as input to the encoding thing, as that would probably (or, I imagine) make it slower than just using the overall image of the application. I guess the calls probably just point to images in memory though, rather than putting an entire image on the stack.
I don’t know enough about low-level graphics programming to know if this idea of mine makes any sense.
Yes, it would be significantly smaller, but it would look very different depending on your platform, GPU, driver version, etc. -- the model would essentially need to learn how to map "graphics APIs" (e.g. OpenGL, Vulkan, Metal, ...) to "render result" for every combination of API, driver version, and GPU, which I imagine would constitute a significant amount of overhead.
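To make the draw-calls-instead-of-pixels idea a bit more concrete, here is a toy Python sketch; the DrawCall records are invented for illustration, and the hard parts (actually hooking the graphics API, plus the per-platform/driver variation mentioned above) are left out entirely.

    # Toy illustration: represent one frame as a short sequence of draw-call records
    # rather than raw pixels. The DrawCall fields are made up; real captures would
    # depend on the graphics API (OpenGL, Vulkan, Metal, ...) and driver.
    from dataclasses import dataclass

    @dataclass
    class DrawCall:
        op: str          # e.g. "bind_texture", "draw_quad", "draw_text"
        texture_id: int  # a handle to an image already in GPU memory, not the pixels
        x: float
        y: float
        w: float
        h: float

    def encode_frame(calls: list[DrawCall]) -> list[float]:
        """Flatten a frame's draw calls into a small numeric sequence a model could consume."""
        ops = {"bind_texture": 0.0, "draw_quad": 1.0, "draw_text": 2.0}
        encoded: list[float] = []
        for c in calls:
            encoded += [ops.get(c.op, -1.0), float(c.texture_id), c.x, c.y, c.w, c.h]
        return encoded

    # A 1920x1080 RGB frame is ~6 MB of pixel data; a typical UI frame with a few
    # hundred draw calls encodes to a few thousand numbers here.
    frame = [
        DrawCall("bind_texture", 7, 0, 0, 0, 0),
        DrawCall("draw_quad", 7, 100, 80, 256, 64),
        DrawCall("draw_text", 0, 120, 96, 200, 24),
    ]
    print(len(encode_frame(frame)), "numbers for this frame")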