Hacker News | drcode's comments

sort of, except I think the future of llms will be to have the llm try 5 separate attempts to create a fix in parallel, since llm time is cheaper than human time... and once you introduce this aspect into the workflow, you'll want to spin up multiple containers, and the benefits of the terminal aren't as strong anymore.


I feel like the better approach would be to throw away PRs when they're bad, edit your prompt, and then let the agent try again using the new prompt. Throwing lots of wasted compute at a problem seems like a luxury take on coding agents, as these agents can be really expensive.

So the process becomes: Read PR -> Find fundamental issues -> Update prompt to guide agent better -> Re-run agent.

Then your job becomes proof-reading and editing specification documents for changes, reviewing the result of the agent trying to implement that spec, and then iterating on it until it is good enough. This comes from the belief that better, more expensive, agents will usually produce better code than 5 cheaper agents running in parallel with some LLM judge to choose between or combine their outputs.


Who or what will review the 5 PRs (including their updates to automated tests)? If it's just yet another agent, do we need 5 of these reviews for each PR too?

In the end, you either concede control over 'details' and just trust the output or you spend the effort and validate results manually. Not saying either is bad.


If you can define your problem well then you can write tests up front. An ML person would call tests a "verifier". Verifiers let you pump compute into finding solutions.
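To make the "verifier" idea concrete, here is a minimal sketch, not anyone's actual agent setup. `sample_candidate` is a hypothetical stand-in for one agent attempt (it just picks a random multiplier), and the "problem" is a toy one: produce a function that doubles its input. The point is only that once the tests exist up front, you can keep sampling until one candidate passes.

```python
import random
from typing import Callable, Optional

Candidate = Callable[[int], int]

def verifier(candidate: Candidate) -> bool:
    """Tests written up front: the candidate must double its input."""
    return all(candidate(x) == 2 * x for x in (0, 1, 5, -3))

def sample_candidate(rng: random.Random) -> Candidate:
    """Hypothetical stand-in for one agent attempt: a random multiplier."""
    k = rng.randint(0, 4)
    return lambda x, k=k: k * x

def search(n_attempts: int, seed: int = 0) -> Optional[Candidate]:
    """Spend compute: sample attempts until one passes the verifier."""
    rng = random.Random(seed)
    for _ in range(n_attempts):
        candidate = sample_candidate(rng)
        if verifier(candidate):
            return candidate
    return None  # budget exhausted without a passing candidate
```

Note that the verifier only checks what it asserts; anything else a candidate does goes unverified.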


I'm not sure we write good tests for this because we assume some kind of logic involved here. If you set a human to task to write a procedure to send a 'forgot password' email, I can be reasonably sure there's a limited number of things a human would do with the provided email address, because it takes time and effort to do more than you should.

However with an LLM I'm not so sure. So how will you write a test to validate this is done but also guarantee it doesn't add the email to a blacklist? A whitelist? A list of admin emails? Or the tens of other things you can do with an email within your system?


Will people be willing to make their full time job writing tests?


We’ll just have an LLM write the tests.

Now we can work on our passion projects and everything will just be LLMs talking to LLMs.


I hope sarcasm.


They probably won't. But it doesn't matter. Ultimately, we'll all end up doing manual labor, because that is the only thing we can do that the machines aren't already doing better than us, or about to be doing better than us. Such is the natural order of things.

By manual labor I specifically mean the kind where you have to mix precision with power, on the fly, in arbitrary terrain, where each task is effectively one-off. So not even making things - everything made at scale will be done in automated factories/workshops. Think constructing and maintaining those factories, in the "crawling down tight pipes with screwdriver in your teeth" sense.

And that's only mid-term; robotics may be lagging behind AI now, but it will eventually catch up.


As well, just because it passes a test doesn't mean it doesn't do wonky, non-performant stuff, or worse, have side effects no one verified. For example, the LLM output will quite often add new fields I didn't ask for.



Having command line tools to spin up multiple containers and then to collect their results seems like it would be a pretty natural fit.
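A rough sketch of what "spin up several attempts, then collect the results" could look like from a script. Everything here is an assumption for illustration: `run_attempt` is a hypothetical placeholder for launching one container and waiting for it, and the pass/fail rule is made up. A real version would shell out to whatever container runtime you use.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List

def run_attempt(attempt_id: int) -> Dict[str, object]:
    """Hypothetical stand-in for starting one container and waiting on it."""
    # Made-up outcome: pretend even-numbered attempts pass the test suite.
    passed = attempt_id % 2 == 0
    return {"attempt": attempt_id, "passed": passed}

def run_parallel(n: int = 5) -> List[Dict[str, object]]:
    """Run n attempts concurrently; results come back in submission order."""
    with ThreadPoolExecutor(max_workers=n) as pool:
        return list(pool.map(run_attempt, range(n)))

def passing(results: List[Dict[str, object]]) -> List[object]:
    """Collect the ids of the attempts whose tests passed."""
    return [r["attempt"] for r in results if r["passed"]]
```

Calling `passing(run_parallel(5))` gives the ids worth a human look, which fits fine in a terminal workflow.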



Why would spinning containers remove the benefits? Presumably there is a terminal too interacting with the containers.


Nah, if parallelism will help, it'll be abstracted away from the user.


Tmux?


I was kinda pissed when my local mall got a "barista robot", and it asks for a 20% tip when you swipe your card


Tipping has lost its meaning and it is simply a money grab these days in many establishments, as your experience demonstrates. Like tipping for food to go.

I only tip when I sit down and good service is actually provided.


Let me introduce you to Hard 2632, a device for 32 byte demos: https://xayax.net/hard2632/


damn, that's a crazy process- thanks for the video link


That seems silly, it's not poisonous to talk about next token prediction if 90% of the training compute is still spent on training via next token prediction (as far as I am aware)


99% of evolution was spent on single cell organisms. Intelligence only took 0.1% of evolution's training compute.


Are you making a claim about evolution here?


What you just said means absolutely nothing and has no comparison to this topic. It’s nonsense. That is not how evolution works.


ok that's a fair point


I don’t really think that it is. Evolution is a random search, training a neural network is done with a gradient. The former is dependent on rare (and unexpected) events occurring, the latter is expected to converge in proportion to the volume of compute.


why do you think evolution is a random search? I thought evolutionary pressures, and mechanisms like epigenetics, make it something different than a random search.


Evolution is a highly parallel descent down the gradient. The gradient is provided by the environment (which includes lifeforms too), parallelism is achieved through reproduction, and descent is achieved through death.


The difference is that in machine learning the changes between iterations are themselves caused by the gradient, in evolution they are entirely random.

Evolution randomly generates changes and if they offer a breeding advantage they’ll become accepted. Machine learning directs the change towards a goal.

Machine learning is directed change, evolution is accepted change.


It's more efficient, but the end result is basically the same, especially considering that even if there's no noise in the optimization algorithm, there is still noise in the gradient information (consider some magical mechanism for adjusting behaviour of an animal after it's died before reproducing. There's going to be a lot of nudges one way or another for things like 'take a step to the right to dodge that boulder that fell on you').


> Machine learning is directed change, evolution is accepted change.

Either way, it rolls down the gradient. Evolution just measures the gradient implicitly, through parallel rejection sampling.
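A toy sketch of the two views, on a made-up loss f(x) = x^2 (the function names and parameters are mine, purely illustrative). `gradient_step` is "directed change": it moves along the derivative. `evolution_step` is "accepted change": it proposes random mutations and keeps the fittest, never looking at the derivative. Both end up near the minimum; the second just needs many more loss evaluations.

```python
import random
from typing import Tuple

def loss(x: float) -> float:
    return x * x

def gradient_step(x: float, lr: float = 0.1) -> float:
    """Directed change: step against the derivative of x^2, which is 2x."""
    return x - lr * 2 * x

def evolution_step(x: float, rng: random.Random,
                   pop: int = 20, sigma: float = 0.1) -> float:
    """Accepted change: random mutations, then keep the fittest.

    The parent stays in the pool, so the loss never increases.
    """
    candidates = [x] + [x + rng.gauss(0, sigma) for _ in range(pop)]
    return min(candidates, key=loss)

def run(steps: int = 200, seed: int = 0) -> Tuple[float, float]:
    """Run both procedures from the same starting point."""
    rng = random.Random(seed)
    x_gd = x_ev = 3.0
    for _ in range(steps):
        x_gd = gradient_step(x_gd)
        x_ev = evolution_step(x_ev, rng)
    return x_gd, x_ev
```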


Evolution also has no "goal" other than fitness for reproduction. Training a neural network is done intentionally with an expected end result.


There's still a loss function, it's just an implicit, natural one, instead of artificially imposed (at least, until humans started doing selective breeding). The comparison isn't nonsense, but it's also not obvious that it's tremendously helpful (what parts and features of an LLM are analogous to what evolution figured out with single-celled organisms compared to multicellular life? I don't know if there's actually a correspondence there)


PSA: Most modern gyms have "autobelay" devices that let you climb on your own without a partner. This makes gym climbing a super fun and accessible exercise anyone, even beginners, can do by just showing up to a gym at your convenience.

(If you're a beginner you should still take the 1 hour class first and you will have to pass a belay test. And yes, if you can make the schedule work out with a friend so you can belay each other, that's even more fun)


You still need to be careful. I'm an avid climber. Most autobelay accidents happen because people don't clip in properly. However for me the auto belay cable broke after catching me. Resulted in five minor spinal fractures.

So from my experience I would say at least Google the common auto belay manufacturers and only use gyms that have them. True Blue and Perfect Descent are the only auto belays I will touch now.


thanks, I'll investigate my local gym!

update: they use trueblue


Jesus, what do you mean the cable broke? The rope itself got cut? Even though the device didn't fail?

I'm really averse to the autobelay because I can't feel the "pull" of a human belayer, so this is a nightmare scenario for me.

Then again, I'm sure the autobelay is still safer than the average human belayer; it's just that I really trust mine.


That sounds terrible, did you take any legal action?


I did. It's behind me now and more importantly I'm fully recovered mentally and physically.

I don't live in the states so it's not as dramatic legally as you may imagine.


My understanding is that our local climbing gym sees most of its non-bouldering accidents from people not clipping into autobelays before they start climbing.


Unfortunately, auto belays are also pretty terrible once you’re familiar with climbing - they pull on you and make harder climbing extremely awkward.


They lower the grade by about 1 level by pulling you up, at least up to 6a/6b on the French scale. At higher levels I can imagine they also interfere with careful balance and body weight shifting, training you away from actual skills. That's why I never saw them on anything harder than maybe 7b, and even there it was like 1 or 2 routes in the whole gym.

But for easy grades and near-beginners, if you lack a good partner for whatever reason, they are great IMHO.


The pull of an autobelay is negligible, surely. The cable is a bit annoying perhaps but the real problem is that the wall is like near vertical, completely flat. Super uninspiring in my opinion.


Most climbing gyms put auto belays only on flat or slabby ‘beginner’ areas of the walls because most people using auto belays can’t do much on harder stuff - and also it’s kind of convenient to have your partner ‘take’/hold you on steep stuff sometimes.

Having uncontrolled (but slow) descents onto people’s heads probably also doesn’t help.


Ever seen a spotter help a struggling bench presser with just a couple fingers?


I mean I can lift my entire bodyweight with just a couple of fingers, but that point aside, this isn't so strange. The bench presser stalls exactly when their muscles are just short of overcoming gravity, so any extra force--even a couple of fingers--will add upwards momentum. You're not often in this kind of stall condition when climbing; it is much more about leverage and transferring force through the kinetic chain. Especially since we were discussing balance on typically lightly overhanging flat walls.


Fyi: autobelay is how most people deck and die in the gym (forgetting to clip)


Other than user error (forgetting to clip) are there any other negatives to auto-belay devices?

Climbing is unique among sports in that you have to trust a random person to keep you alive through the most common action within the sport (falling)

Given its rising popularity, the sport should be safer by default.


the madness of cutting hundreds of gears by hand, many with a 45 degree bevel, I couldn't even imagine.

madness!


> How is AI going to make its own chips and energy?

Pay naive humans to take care of those things while it has to, then disassemble the atoms in their human bodies into raw materials for robots/datacenters once that is no longer necessary


I suppose there is an equilibrium, where sites that penalize these types of crawlers will also get less traffic from people reading ai citations, so for many sites the upsides of allowing it will be greater than the downsides.


It's a crazy accomplishment really, unimaginable how safe US commercial airplanes have been in the last two decades

So sad that streak finally ended


As someone working in aviation safety, this is heartbreaking and awful to watch. The efforts of CAST and ASIAS in reducing aviation accidents have been very successful, but of course we still have so much to do.


There have been incidents that were just saved by luck (e.g. losing a door in flight). And too many near misses.


Better a near miss than a catastrophic incident.


And in such an avoidable way, too.

