It's always POC apps in js or python, or very small libraries in other popular l...

Herring · 2025-07-17T22:24:28 1752791068

I like them for refactoring and “explain this massive codebase please”. Basically polishing or investigating things that already work.

But I think we should expect the scope of LLM work to improve rapidly in the next few years.

https://metr.org/blog/2025-03-19-measuring-ai-ability-to-com...

oblio · 2025-07-18T07:27:43 1752823663

The bad news is that, as far as we can see, that doubling of performance mostly also requires (at least) a doubling of resource usage, plus we're getting close to a point where planetary resources for doubling LLM resources are getting kind of low...

Herring · 2025-07-18T12:34:15 1752842055

This species is going extinct. I finally accepted that when my dad died rather than change his lifestyle, despite being warned 10000x. My mom survived a heart attack, saw what happened to my dad, still hasn't changed her lifestyle.

Aeolun · 2025-07-18T11:22:09 1752837729

Hmm, I got Claude Opus to build me a game in Rust. I don’t think it really counts as a POC app any more at that point.

_se · 2025-07-18T18:00:44 1752861644

It absolutely counts as a POC app until it's production grade, deployed, being used by people, maintained over time, etc.

This doesn't mean that it's not useful, or that you shouldn't be happy with what the LLM built. I also had Claude Code build me a web app for my own personal use in Rust this week. It's very useful to me. But it is 100% of POC/MVP quality, and always will be, because the code that it created is abjectly awful and I would never be able to scale it into a real world service without rewriting 50+% of it.

ants_everywhere · 2025-07-17T23:12:24 1752793944

> They're amazing. But the moment you try to do the more "serious" work with them, it falls apart rapidly.

Sorry, but this is just not true.

I'm using agents with a totally idiosyncratic code base of Haskell + Bazel + Flutter. It's a stack that is so quirky and niche that even Google hasn't been able to make it work well despite all their developer talent and years of SWEs pushing for things like Haskell support internally.

With agents I'm easily 100x more productive than I would be otherwise.

I'm just starting on a C++ project, but I've already done at least 2 weeks' worth of work in under a day.

iammrpayments · 2025-07-18T06:18:51 1752819531

I’m going to ask what I asked the last person here who said they were “10-20x” more productive:

If you’re really that much more productive, why don’t you quit your job and vibecode 10 iOS apps (in your case that would be 50 to 100, proportionally)?

Aeolun · 2025-07-18T11:24:03 1752837843

Because money? Even if you can quickly build them it’s pointless if you can’t sell them. And Claude cannot help with that.

_se · 2025-07-18T01:58:19 1752803899

Share the codebase and what you're doing or, I'm sorry, you're just another example of what I laid out above.

If you honestly believe that "agents" are making you better than Google SWEs then you severely need to take a step back and reevaluate, because you are wrong.

artisin · 2025-07-18T13:22:05 1752844925

Hold the phone. So, Google, with its legions of summa cum laude engineers, can't make this stack work well, but your AI agent is nailing it into next week? Seriously, show me the way, so I too may find AI enlightenment.

cpursley · 2025-07-17T23:23:58 1752794638

What do you mean “with agents”?

ants_everywhere · 2025-07-17T23:52:04 1752796324

I've been using mainly gemini-cli and am starting to play around with claude code.

cpursley · 2025-07-18T00:46:40 1752799600

Are you referring to those as agents or do you mean spinning separate/multiple agents out of sessions on them?