Hacker News | nharada's comments

Another interesting thing here is that the gap between "burned out but just producing subpar work" and "so crispy I literally cannot work" is even wider with AI. The bar for just firing off prompts is low, but the mental effort required to know the right prompts to ask and then validate the output is much higher, so you just skip that part. You can work for months doing terrible work and then eventually the entire codebase collapses.

+1

First, I agree with most commenters that they should just offer 3 modes of visibility: "default", "high", "verbose", or whatever.

But I'm with you that this mode of working where you watch the agent work in real-time seems like it will be outdated soon. Even if we're not quite there, we've all seen how quickly these models improve. Last year I was saying Cursor was better because it allowed me to better understand every single change. I'm not really saying that anymore.


This is awesome and I'm really happy to see this progress. Landing a new chemistry in a production car THIS YEAR is some crazy velocity, especially compared to where other Na-Ion batteries are in the development cycle elsewhere. Is anyone else even close to having a car on the road with their cells?

The reason this is so exciting for me personally is for stationary energy. Because the raw materials are so abundant and have good cold weather performance, both grid and home level energy storage costs should come down significantly as this is commercialized further.


That's a massive jump, I'm curious if there's a materially different feeling in how it works or if we're starting to reach the point of benchmark saturation. If the benchmark is good then 10 points should be a big improvement in capability...


> worry that the US will fall behind the curve

Man it's already over. It's hard to imagine the US autos EVER catching up at this point, even with state support.


I'm obviously biased, and I probably have more gripes than most about Waymo as a corporate entity, but the premise this article seems to be based on is "Waymo is a zombie company that will never release a real product" or something similar?

They seem to be scaling just fine. Here in SF they're ubiquitous and most people I know use them regularly (and usually prefer them to rideshare). Sure, it's not the type of growth possible with pure software, but they're doing 500k rides/week and are looking to be doing 1MM/week by the end of the year. What scale does this business need to be for the author to consider them a real company?


I think the assumption is valid. Most of the reasoning components of the next gen (and some current gen) robotics will use VLMs to some extent. Deciding if a temporary construction sign is valid seems to fall under this use case.


But unless you are using a single, end-to-end model for the entire driving stack, that "proceed" command will never influence the accelerator pedal.

Sure, there will be a VLM for reading the signs, but the worst it'd be able to output is something like "there is a 'detour' sign at (123, 456) pointing to road #987" - and some other, likely non-LLM, mechanism will ensure that following that road is actually safe.


Not a "proceed" command, but they can influence the accelerator. I had a Dodge Ram van that would constantly decelerate on cruise control due to reading road signs. In some states, like California, the posted limit for trucks towing trailers is 55 mph while the general speed limit is 65 or 70 mph. The cruise control would detect the sign and suddenly decelerate to 55.


That's an example of things working as expected - the sign recognition system is very limited, in that it can only return road sign information. So it can _ask_ the cruise control system to change the speed, but it's up to cruise control to decide if it's safe to obey the request or not. For example, I am pretty sure it'll never raise the speed, no matter what the sign recognition system says.
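
To make that separation concrete, here is a minimal sketch in Python. Everything in it is hypothetical (`SpeedRequest`, `CruiseControl`, and the "only ever lower the speed" policy are illustrative assumptions based on the guess above, not how any real vehicle implements it). The point is just that the recognizer emits a request, and the controller that owns the set speed decides what to do with it.

```python
from dataclasses import dataclass


@dataclass
class SpeedRequest:
    """A suggestion from the sign recognizer; it carries no authority of its own."""
    limit_mph: float


class CruiseControl:
    """Hypothetical controller that owns the set speed and arbitrates requests."""

    def __init__(self, set_speed_mph: float):
        self.set_speed_mph = set_speed_mph

    def handle(self, request: SpeedRequest) -> float:
        # Assumed policy: only ever lower the target; a sign reading
        # can never raise it, and nonsensical limits (<= 0) are ignored.
        if 0 < request.limit_mph < self.set_speed_mph:
            self.set_speed_mph = request.limit_mph
        return self.set_speed_mph


cruise = CruiseControl(set_speed_mph=70)
after_truck_sign = cruise.handle(SpeedRequest(limit_mph=55))   # accepted: lower than 70
after_bogus_sign = cruise.handle(SpeedRequest(limit_mph=120))  # ignored: would raise speed
```

Under this assumed policy, the misread truck sign still causes the slowdown described upthread, but a bogus or malicious sign can't make the car go faster than the driver asked for.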


In that case working at a startup would be a thing someone would only do as a last resort, and the talent pool would consequently be extremely low quality. Sounds damaging to the scene to me.


Have you seen the tech market?

A lot of good engineers are out of work. They'll gladly take what they can get.


It definitely feels like a jump in capability. I've found that the long-term quality of the codebase doesn't take a nosedive nearly as quickly as with earlier agentic models. If anything, it holds about steady or maybe even improves if you prompt it correctly and ask for "cleanup PRs".


The general playbook in the US seems to be "do something that's bad for society but makes money, and eventually the people in charge will get enough FOMO that it'll be officially sanctioned as long as they get a cut".

