Hacker News | ryanjshaw's comments

Until a year ago I believed as the author did. Then LLMs got to the point where they sit in meetings like I do, make notes like I do, have a memory like I do, and their context window is expanding.

Only issue I saw after a month of building something complex from scratch with Opus 4.6 is poor adherence to high-level design principles and consistency. This can be solved with expert guardrails, I believe.

It won’t be long before AI employees are going to join daily standup and deliver work alongside the team with other users in the org not even realizing or caring that it’s an AI “staff member”.

It won’t be much longer after that when they will start to tech lead those same teams.


The closer you get to releasing software, the less useful LLMs become. They tend to go into loops of 'Fixed it!' without having fixed anything.

In my opinion, attempting to hold the hand of the LLM via prompts in English for the 'last mile' to production ready code runs into the fundamental problem of ambiguity of natural languages.

From my experience, those developers that believe LLMs are good enough for production are either building systems that are not critical (e.g. 80% is correct enough), or they do not have the experience to be able to detect how LLM generated code would fail in production beyond the 'happy path'.


> The closer you get to releasing software, the less useful LLMs become.

Which is _always_ the case with these things, honestly. Remember Ruby on Rails? Make a Twitter clone in half an hour by just writing some DSL! Of course, in reality Rails was _not_ a productivity revolution, and making _real_ software which had to be operated at scale and maintained, and work properly, in it wasn't much easier than it had been previously.


The number of "apps" I've had dumped on my team is ridiculous: everything from un-releasable to deployed on some random shit-cloud we haven't approved (Vercel comes up a lot). If you needed hand holding to release things or had to throw software over the fence to others to "productionise" etc then you probably don't know what you're talking about.

This is not my experience with Claude Code. It does forget big picture things but if you scope your changes well it’s fine.

I would estimate that out of every 200 lines of code that Claude Code produces, I notice at least 1 issue that would cause severe problems in production.

In my opinion these discussions should include MREs (minimal reproducible examples) in the form of prompts to ground the discussion.

For example, take this prompt and put it into Claude Code, can you see the problematic ways it is handling transactions?

---

The invoicing system is being merged into the core system that uses Postgres as its database. The core system has a table for users with columns user_id, username, creation_date . The invoicing data is available in a json file with columns user_id, invoice_id, amount, description.

The data is too big to fit in memory.

Your role is to create a Python program that creates a table for the invoices in Postgres and then inserts the data from the json file. Users will be accessing the system while the invoices are being inserted.

---
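
For anyone who wants something to compare the output against, here is a rough sketch of one way the import could be structured. It rests on assumptions that are not in the prompt: the file is newline-delimited JSON (one invoice per line), the table name `invoices`, the column types and the batch size are made up, and `sqlite3` stands in for Postgres so the snippet runs without a server (with a Postgres driver such as psycopg the placeholders would be `%s` rather than `?`). The idea it illustrates is streaming the file, committing in small batches so no single transaction spans the whole import, and making reruns after a failure harmless via the primary key.

```python
import json
import sqlite3  # stand-in for a Postgres driver (assumption, see above)
from itertools import islice
from typing import Iterable, Iterator

BATCH_SIZE = 1000  # hypothetical value; tune for the real workload


def read_invoices(path: str) -> Iterator[tuple]:
    """Yield one invoice row at a time so the file is never fully loaded."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            if line.strip():
                rec = json.loads(line)
                yield (rec["invoice_id"], rec["user_id"],
                       rec["amount"], rec["description"])


def batches(rows: Iterable, size: int) -> Iterator[list]:
    """Group an iterable into lists of at most `size` items."""
    it = iter(rows)
    while batch := list(islice(it, size)):
        yield batch


def import_invoices(conn, path: str, batch_size: int = BATCH_SIZE) -> int:
    """Create the table if needed and insert the file batch by batch.

    Returns the number of rows read from the file.
    """
    cur = conn.cursor()
    cur.execute(
        "CREATE TABLE IF NOT EXISTS invoices ("
        " invoice_id INTEGER PRIMARY KEY,"
        " user_id INTEGER NOT NULL,"
        " amount NUMERIC NOT NULL,"
        " description TEXT)"
    )
    conn.commit()

    processed = 0
    for batch in batches(read_invoices(path), batch_size):
        try:
            # Rows whose invoice_id already exists are skipped, so a rerun
            # after a crash does not fail or duplicate data.
            cur.executemany(
                "INSERT INTO invoices"
                " (invoice_id, user_id, amount, description)"
                " VALUES (?, ?, ?, ?)"
                " ON CONFLICT (invoice_id) DO NOTHING",
                batch,
            )
            conn.commit()  # one short transaction per batch
        except Exception:
            conn.rollback()  # discard only the failed batch
            raise
        processed += len(batch)
    return processed
```

Whether per-batch commits are acceptable, or whether the import has to appear atomically to users, is exactly the kind of requirement the prompt leaves open.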


And that's why you ask for a high level plan for something like that before you let the agent write any code. Then you review the plan for flaws, revise it, and prompt the system to fill out more details for each step. Repeat as necessary. Yes it's slow, but it's the best way of using this "glorified autocomplete" to ease and speed up real work.

People that have never written their own code won't know what the flaws are.

Those people can ask Claude to review the flaws for them.

Then they won't know if it's accurate or missing something.

Oh good point.

What he’s saying is split this up into multiple tasks to create the table, insert the data etc

Isn’t that the hard part? If the tasks are small enough and well defined, where’s the win over just writing the code right there and then?

Well, Claude can also refine it into smaller tasks, and that’s where you can fix the issues that would cause major problems in production.

It’s the hard part, which is why these tools are so great; writing the code was the tedious part.

You can use an LLM to generate that list of tasks.

And how does a new grad that's never actually programmed know whether that list of tasks makes sense?

Yes, but knowing how to scope your changes requires a lot of expertise.

After 2 years of using all of these tools (Claude Code, Gemini CLI, opencode with all models available) I can tell you it is a huge enabler, but you have to provide these "expert guardrails" by monitoring every single deliverable.

For someone who is able to design an end to end system by themselves these tools offer a big time saving, but they come with dangers too.

Yesterday I had a mid dev in my team proudly present a Web tool he "wrote" in python (to be run on local host) that runs kubectl in the background and presents things like versions of images running in various namespaces etc. It looked very slick, I can already imagine the product managers asking for it to be put on the network.

So what's the problem? For one, no threading whatsoever, no auth, all queries run in a single thread and on and on. A maintenance nightmare waiting to happen. That is the risk of a person who knows something, but not enough, building tools by themselves.


Yup. I’m no expert so maybe I’m completely off base, but if I were OpenAI or Anthropic I’d likely just hire 1000 highly skilled engineers across multiple disciplines, tell them to build something in their domain of expertise, then critique the model’s output, iteratively work on guardrails for a month or two until the model one-shots the problem, and package that into the new release.

That's exactly what they are doing via dataannotation.tech and other services.

Any comments on how the copyright issues are handled in corporate settings? I mean both in terms of staying clear of lawsuits and ensuring what we produce remains safe from copying.

I can take a verbal description from a meeting with five to ten people and put together something they can interact with in two weeks. That is a lot slower than Claude Code! Yet everywhere I’ve worked, this is more than fast enough.

Over two more weeks I can work with those same five to ten people (who often disagree or have different goals) and get a first draft of a feature or small, targeted product together. In those latter two weeks, writing code isn’t what takes time; working through what people think they mean versus what they are actually saying, and mediating between groups when they disagree (or mostly agree), is the work. And then, after that, we introduce a customer. Along the way I learn to become something of an expert in whatever the thing is and continue to grow the product, handing chunks of responsibility to other developers, at which point it turns into a real thing.

I work with AI tooling and leverage AI as part of products, where it makes sense. There are parts of this cycle where it is helpful and time saving, but it certainly can’t replace me. It can speed up coding in the first version but, today, I end up going back and rewriting chunks and, so far, that eats up the wins. The middle bit it clearly can’t do, and even at the end when changes are more directed it tends toward weirdly complicated solutions that aren’t really practical.


> poor adherence to high-level design principles and consistency. This can be solved with expert guardrails, I believe.

That’s a bit… handwavy…!


I've been hearing this for several years. How much longer is "it won't be long"?

I've heard the same "it won't be long" from UML and 4GL - until the industry finally gave up. Both of those are still used a lot in industry and they do well in their place, but nobody pretends they will ever be everything to everyone anymore.

And in 1990 people were complaining about the same thing [1].

[1] Why Aren’t Operating Systems Getting Faster As Fast as Hardware? https://web.stanford.edu/~ouster/cgi-bin/papers/osfaster.pdf


You can also argue that posting with a real name encourages you to reflect on your identity.

Or do both. Also post anonymously to see what kind of a person you are when masked, and compare.


Seems like there ought to be dedicated security teams monitoring for exactly this: does a key to X give users access to not-X. Even more bizarre is their VDP team not immediately understanding the severity of the issue.

And slow down the time to ship things? The shareholders wouldn't like that.

Those poor poor institutional shareholders…

They do have dedicated teams for exactly these sorts of concerns. They are also swamped with projects and so they can't review big new changes overnight. Google is very likely shipping first and asking questions later.

"Don't worry, we have Gemini looking at this very issue right now for all teams"

"I know, I'm reading along!"

That's how you slow down development to a crawl

Yeah, let's just start building a house and not wait for the architects to finish the blueprints :) They're just slowing us down with all that thinking-things-through stuff.

I don't see a problem with this. The problem with "move fast and break things" isn't the moving fast part, it's the trail of broken things that no one bothers to fix. When those broken things affect people's wallets, that's when we have problems.

That's fine. Right is better than now.

I’ve settled on the idea that my job as parent is to introduce my kid to a bunch of different things, help them process that information, but ultimately the decision of where they choose to focus their energy is up to them. I’m proud of whatever they do, as long as they try their best.

That’s a good choice. I just feel each kid has so much potential in them, but there is always something, like a shortcoming in the genes or a bad characteristic, that prevents them from achieving a lot more.

And I don’t have the time, will and experience to guide mine.


Could have been worse. Anybody remember that story where the keycard readers would randomly work? Eventually it was discovered that the log file had grown huge and was being appended to by reading the whole thing into memory over the network, appending the line, and writing the whole thing back out again. That created the random pattern, because I guess it would sometimes time out.
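
If that's really what happened, the difference in code is tiny. A rough sketch in Python (the function names and log format are made up for illustration, and I have no idea what the real system was written in): the first function does the read-everything-and-rewrite dance, whose cost grows with the size of the log on every write; the second opens the file in append mode and writes only the new line.

```python
from pathlib import Path


def append_by_rewrite(log_path: Path, line: str) -> None:
    """The pattern from the story: read all, add a line, write all back."""
    existing = log_path.read_text(encoding="utf-8") if log_path.exists() else ""
    log_path.write_text(existing + line + "\n", encoding="utf-8")


def append_in_place(log_path: Path, line: str) -> None:
    """Open in append mode so only the new line is written."""
    with log_path.open("a", encoding="utf-8") as f:
        f.write(line + "\n")
```

Both leave the same bytes on disk; only the amount of I/O per entry differs.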

I found it helped me to read it out loud in a pirate voice.


Fun fact: stereotypical "pirate speech" is actually a relic of the English West Country dialect.

Not so much a relic as it was West Country actor Robert Newton putting on an exaggerated accent in his depictions of Long John Silver and Blackbeard in several films of the 1950s. His depictions were extremely influential on later pirates in film.

Kind of ironic, using a UT in a comment about undefined behavior.


The link you posted says:

> It's against our Community Standards to maintain more than one personal account.

> Bear in mind that a personal profile is for non-commercial use and represents an individual person.

Really unclear to me what a business is supposed to do. Not a particularly useful link.


Set up a business page from a personal account. That has always been Facebook's stance on this issue. Like, for a decade or more.

And before someone says "it's a bad policy..." yeah it is. But I don't think one gets to complain about how AI ruined their business when the whole business practice is built upon violating their platform's policy.


The style of writing and strange segues suggest mental illness; the blog description seems to confirm it.


It is incredible how much energy is being put into justifying why this is his own fault.

I guess this is the only way people with high salaries or wealth in the US can find peace with themselves - maybe that's the mental illness?


What's incredible to me are comments, like yours, that are saying that somehow the system failed this person, just based on this blog post. When in fact:

1. He was given food and shelter, which he declined - most of his comments about the food are that it's too sugary.

2. He makes it sound like he was offered more permanent shelter in Feb of 2025, which he also declined.

To be clear, I'm not making a judgment about this person - and, for that matter, the comment you replied to didn't seem to be making a judgment either, just stating a reasonable conclusion that the author suffered from mental illness.

So I'd like to know what additional resources you think would have changed this person's circumstances?


> I'm not making a judgment about this person [...] just stating a reasonable conclusion that the author suffered from mental illness.

You're not making a judgment, but are somehow able to diagnose mental illness from a blog post? Wild.

It's far easier to conclude that the system has failed this person, as it habitually fails millions of people in the US, in these same circumstances. Are they all mentally ill?

And even if mental health is an issue, does that mean that they are somehow less worthy of assistance? That it's OK for a human being to live under a bridge?

The level of self-righteousness and lack of empathy in your comments is baffling.


> You're not making a judgment, but are somehow able to diagnose mental illness from a blog post? Wild.

The author literally talks about his "psychosis" that led him to his predicament, so no, I don't think it's a judgment, just the ability to read.

> And even if mental health is an issue, does that mean that they are somehow less worthy of assistance?

I have no idea where you seem to get this idea. My point is that, at least in this author's case, he has received assistance. He discusses 2 separate instances in this post where he declined resources and support - he is living under a bridge because he specifically rejected the shelter that was offered him.

Could the process of getting people support be better? Of course, but his experience dealing with the pains of bureaucracy doesn't seem much different than the bureaucratic slights we all, rich and poor, have to deal with when trying to get assistance from government. FWIW I especially liked his "I identify as a woman" comment in order to get a shower.

> The level of self-righteousness and lack of empathy in your comments is baffling.

The only self-righteousness I see on display is your belief that everyone else is so uncaring because we don't necessarily think the government should force this person into a shelter. I'm not judging the author at all, but I'm certainly judging you.


Do you genuinely think that people with mental issues ought to live in shelters as a dignified existence?

I think I can extend my predicament about mental illness to you also.


> Currently constructing the Sanctuary of the Silent Star while unpacking a 6-month journey through psychosis, homelessness, and the systems that govern us. One story at a time.

I'm not saying it's his own fault. I'm also not too happy when people point to mental illness. But this is his blog description, where he himself mentions that he is unpacking a 6-month journey through psychosis.

6 months of psychosis means you're mentally ill with psychosis as a symptom.


Doesn’t change anything. In the civilized world, people who are mentally ill are taken care of by society, not thrown in the streets.


He wasn't "thrown into the streets". He was offered food and shelter, and he declined it.


He was respecting the laws of California. What he did was like a speeding ticket.

I do agree that if he was truly in need he would have stolen an iPhone or some designer purses.

How privileged to steal sleeping bags and food.


> I needed a guide. I stole Don Quixote from the library.

That wasn't any kind of necessity. He lost me there.

