Hacker News | ej88's comments

Thanks Max! This was a really interesting article and closely matches my own experience with how these agents have been progressing.

one of the takeaways I get from reading skilled engineers' experiences with these tools is that they essentially offer leverage: the more skill someone already has, the higher their ceiling will be


The only reason they have an ad budget is that buying ads was effective. If ads were no longer effective, there would be no ad budget.

i feel similarly. suppose ai makes people more productive:

1. companies that are not doing well (slow growth, losing to competition, etc.), or that hold a monopoly and are under pressure to cut costs in the short term, are going to use the added productivity to reduce their opex

2. companies that are doing well (growing, in competitive markets) will get even more work done and still can't hire enough people

my hunch is block is not doing as well as they seem to be


obviously he's going to posture his company as growing and doing well, but clearly it's not enough for the board and shareholders, given the headcount growth from the ZIRP era

some companies are in a position to go for moonshots, and Block's haven't panned out


trimodal swe compensation (elite, big tech, everyone else) extends to the job markets too

They generalized "the market." I know a lot of out of work SWEs right now.

is there any data on whether the overall job market has been good or bad? genuinely curious. the most recent data point I've seen shows a rebound: https://www.citadelsecurities.com/news-and-insights/2026-glo...

and fwiw i don't personally know any swes struggling to find work

swe is so broad, and we're all in our own bubbles, so it's hard to get an objective read


Software development jobs are up 10%. Jobs in general are down 6%

https://x.com/perborgen/status/2025890393166917857


I collect data from "Who wants to be hired" threads. This month is one of the highest in years.

You'd think "past year" would include a full 12 months. This person has chosen a ~10 month period to hide the large drop-off in early 2025, as you can see here: https://www.citadelsecurities.com/news-and-insights/2026-glo...

They did not; you get the same date range and the same graph shape by going to FRED and pressing the "1Y" option, and the series includes the first two months of 2026, so it is 12 months: https://fred.stlouisfed.org/graph/?g=1SGzm

However, the chart settings were modified to deemphasize the earlier decline: the index date was changed. Their graph uses 2025-02-20=100; with the default of 2020-02-01=100, the chart would start at 64 and rise to 71.44.
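To make the rebasing concrete, here's a minimal sketch of how re-indexing the same data to a later base date changes where the chart appears to start (the dates and the 64/71.44 endpoints come from the comment above; the helper and series structure are my own illustration, not the actual FRED data feed):

```python
# Rebasing an index series: same underlying data, different base date.

def rebase(series: dict, base_date: str) -> dict:
    """Scale a series so the value at base_date equals 100."""
    base = series[base_date]
    return {d: round(v / base * 100, 2) for d, v in series.items()}

# Values indexed to the default base, 2020-02-01 = 100.
raw = {
    "2020-02-01": 100.0,   # original base
    "2025-02-20": 64.0,    # after the decline
    "2026-02-01": 71.44,   # partial rebound
}

default_view = rebase(raw, "2020-02-01")  # chart starts at 100, falls to 64, rises to 71.44
rebased_view = rebase(raw, "2025-02-20")  # chart starts at 100 and only rises from there
```

Same data, but the rebased one-year window shows nothing but improvement because the decline happened before the new base date.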


Job openings are down, but total jobs are up.

https://fred.stlouisfed.org/series/PAYEMS


Their graph shows a rebound to early-to-mid 2024 levels, which is promising, but that's still a relatively bad job market

i guess it depends on what you define as bad and what that threshold is

https://trueup.io/job-trend

this tracker shows continuous improvement since 2023


Sure, I assumed the status quo everyone is talking about is basically the several years before that graph starts. I still think it's relatively bad compared to that, despite the modest improvement.

What a graph of job postings doesn't show is the demand side. With all the layoffs, out-of-work college grads, people staying put in jobs they're unhappy with, etc., I'd wager that the number of applicants per opening is still historically high compared to what we've been accustomed to


> write an article dismissing ai

> usage is copy pasting code back and forth with gemini

the jokes write themselves


That was just the most recent time. I've bounced around all the LLMs, and they're all superficially amazing. But if you understand their output, it's often wrong in both subtle and catastrophic ways.

As I said, maybe I'm wrong. I hope you have fun using them.


Have you tried a coding agent such as claude code or codex?

Yes. And, again, they look amazing and make you feel like you're 10x.

But then I look at the code quality, hideous mistakes, blatant footguns, and misunderstood requirements and realise it is all a sham.

I know, I know. I'm holding it wrong. I need to use another model. I have to write a different Soul.md. I need to have true faith. Just one more pull on the slot machine, that'll fix it.


"CEO Dario Amodei predicted last March that in six months AI would be writing 90% of code, and when that didn’t happen"

I mean, a lot of developers have 90% of their code being written by AI (myself and my friends at the labs included). Obviously YMMV depending on your codebase and individual skill.

"Software engineers will at times overestimate their capabilities, as demonstrated by the METR study that found that developers believed they were 24% faster when using LLMs, when in fact coding models made them 19% slower. This, naturally, makes them quite defensive of the products they use, and whether or not they’re actually seeing improvements."

I wonder what he thinks about the new METR update that showed a net speedup as a lower bound (participants literally didn't even want to tackle tasks without AI because of how slow it would be), with the returning devs showing the greatest speedups?

"for one of Anthropic’s greatest lies: that AI can “work uninterrupted” for periods of time, leaving the reader or listener to fill in the (unsaid) gap of “...and actually create useful stuff.”"

We're probably at the beginning of the S curve for long-running tasks that create useful stuff (https://ladybird.org/posts/adopting-rust/) but it clearly needs hand-holding and a way to self-verify work.

"No amount of DarioMath about how a model “costs this much and makes this much revenue” changes the fact that profitability is when a company makes more money than it spends."

Feels like he's being dishonest here, because the economics of the labs are unique (and precarious). Each model is profitable on its own (revenue exceeds the cost to train and serve it). Labs invest in the next model to maintain their advantage; otherwise people will stop using their latest models. This probably doesn't go on in perpetuity (which is what Ed should have analyzed more). To his credit, he's right that CC subscriptions are currently being subsidized.
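A toy version of the "each model is profitable, the company is not" dynamic, with every number invented purely for illustration:

```python
# Hypothetical per-model economics for a frontier lab (all figures invented).
# Each shipped model covers its own training + serving costs, yet the company
# as a whole loses money because it is always funding the next, bigger model.

models = [
    # (name, train_cost, serve_cost, revenue) in $B, all hypothetical
    ("model_n",   1.0, 0.5, 2.0),  # shipped: 2.0 revenue vs 1.5 total cost
    ("model_n+1", 4.0, 0.0, 0.0),  # still training: pure spend this period
]

per_model_profit = {name: rev - (train + serve)
                    for name, train, serve, rev in models}
company_profit = sum(per_model_profit.values())

# per_model_profit["model_n"] is positive, but company_profit is negative
```

The dynamic only looks sustainable if each successive model's revenue eventually covers the ever-larger training bill of its successor, which is exactly the part that is unproven.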

[Insert quotes of Dario saying models will be smarter than most humans or Nobel laureates]

I mean, he's not wrong under certain definitions of "smart". They're already well above the average human in testable world knowledge, math, coding, science, etc., but obviously fall short of humans in other ways.


not enough people look at the slope, just the coords

Really interesting updates to their 2025 experiment.

Repeat devs from the original experiment went from a 0-40% slowdown to a -10% to 40% speedup, and METR estimates this as a 'lower bound'

more devs are saying they don't even want to do 50% of their work without AI, even for $50/hr

30-50% of devs decided not to submit certain tasks without AI, missing the tasks with the highest uplift

it also seems like there's a skill gap: repeat devs from the first study are more productive with ai tools than newly recruited ones with variable experience

overall it seems like devs' strong preference for using AI is actually hurting METR's ability to judge the speedup, because they refuse to do tasks without it. imo this is indirectly quite supportive of ai coding's productivity claims.
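The selection-bias point can be sketched numerically: if devs decline exactly the tasks where AI would help most rather than do them unaided, those tasks drop out of the comparison and the measured average understates the true speedup. The task values below are invented for illustration:

```python
# Hypothetical per-task speedups (time_without_AI / time_with_AI).
# Values > 1 mean AI made the dev faster on that task.
all_tasks = [0.9, 1.1, 1.3, 2.0, 3.0]

# Suppose devs refuse to submit the highest-uplift tasks without AI,
# so those tasks never enter the measured comparison.
submitted = [s for s in all_tasks if s < 2.0]

true_mean = sum(all_tasks) / len(all_tasks)      # ~1.66
measured_mean = sum(submitted) / len(submitted)  # ~1.10

# measured_mean < true_mean: the study's estimate is a lower bound
```

This is the same logic as METR's own 'lower bound' framing: the 30-50% of unsubmitted tasks are plausibly the ones with the most uplift.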


The finding of the first study was that people cannot judge their performance with these tools. So I don't think individuals' unwillingness to work without them is indicative of productivity improvements. I think it's indicative of the tools being enjoyable to use.

It was claimed to find that, but I don't think it did. It compared developers' beliefs about the average speedup across tasks, measured by asking them once at the end, against the comparative speed measured per task and then averaged. Those are two different things, and all kinds of factors could mess up developers' fuzzy recollection of the gestalt of several tasks (such as recency bias and question/study framing) that wouldn't affect it if you asked them right after each task; moreover, when the results were broken down by task type, the speedup/slowdown numbers actually matched developers' qualitative comments.
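A tiny numeric example of why the two quantities can disagree, even in sign (task times invented for illustration):

```python
# Two hypothetical tasks, times in minutes: (with AI, without AI).
tasks = [
    (10, 5),   # AI made this task 2x slower
    (2, 10),   # AI made this task 5x faster
]

# Per-task time ratios, then averaged -- how per-task measurement aggregates.
mean_ratio = sum(ai / no_ai for ai, no_ai in tasks) / len(tasks)  # (2.0 + 0.2) / 2 = 1.1

# Total wall-clock time, which is closer to what a dev recalls at the end.
total_with_ai = sum(ai for ai, _ in tasks)        # 12 minutes
total_without_ai = sum(no for _, no in tasks)     # 15 minutes

# mean_ratio > 1 says "slower on average per task",
# yet total_with_ai < total_without_ai says "faster overall".
```

So a dev who remembers "I got everything done faster" and a study reporting "average per-task slowdown" can both be describing the same data.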

There are some people participating in the study who will fire & forget instructions to Claude/Codex running in parallel worktrees, but would really struggle if they were required to work on their project without AI assistance.

So while some study participants probably are seeing an actual speedup because of the discipline with which they manage their codebase's structure & documentation, other study participants are actually getting worse at non-AI coding.

...and METR's study can't tell which is which because METR's study isn't using any sort of codebase quality metrics for grounding.


surprised nobody responded with the most straightforward, Occam's-razor explanation

they think what they're doing is actually good for society

not everyone is in the hackerspace libertarian / socialist sphere

i used to work at a place that used Persona despite it adding extra friction to signups (literally resulting in fewer paying customers, to the dismay of PMs) because it was worth it to combat fraud. there's a tradeoff in everything

