
Author here:

For what it's worth, I do actually use the tools, albeit very intentionally and sparingly.

I see quite a few workflows and tasks where they can add value, mostly outside the hot path of actual code generation, but still quite enticing. So much so, in fact, that I'm working on my own local agentic tool backed by some self-hosted Ollama models. I like to think that I am at least somewhat in the know on the capabilities and failure points of the latest LLM tooling.
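
To give a flavor of what that looks like, here is a minimal sketch of one round-trip to a self-hosted Ollama model, assuming the stock Ollama HTTP API on localhost:11434 and a model that has already been pulled; the model name and prompt are placeholders, not my actual setup:

    # Minimal sketch: one round-trip to a self-hosted Ollama model.
    # Assumes the default Ollama HTTP API on http://localhost:11434
    # and a pulled model; names and prompt are placeholders.
    import json
    import urllib.request

    def ask_local_model(prompt: str, model: str = "llama3") -> str:
        payload = json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            "stream": False,  # one complete response instead of chunks
        }).encode("utf-8")
        req = urllib.request.Request(
            "http://localhost:11434/api/chat",
            data=payload,
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)["message"]["content"]

    print(ask_local_model("Draft a commit message for this diff: ..."))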

That, however, doesn't change my thoughts on trying to ascertain whether code submitted to me deserves a full in-depth review or whether I can maybe cut a few corners here and there.



> That, however, doesn't change my thoughts on trying to ascertain whether code submitted to me deserves a full in-depth review or whether I can maybe cut a few corners here and there.

How would you even know? Seriously, if I use ChatGPT to generate a one-off function for a feature I'm working on that searches all classes for one that inherits a specific interface and attribute, are you saying you'd be able to spot the difference?
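
To make that concrete, here is a hypothetical Python analogue of that kind of throwaway reflection helper: scan a module for classes that implement a given interface (an ABC here) and define a given attribute. The module, interface, and attribute names are all illustrative:

    # Hypothetical sketch: find the classes in a module that both
    # implement a given interface and define a given attribute.
    # Module, interface, and attribute names are illustrative only.
    import inspect
    from types import ModuleType

    def find_implementations(module: ModuleType, interface: type,
                             attr_name: str) -> list[type]:
        matches = []
        for _, cls in inspect.getmembers(module, inspect.isclass):
            if (issubclass(cls, interface)
                    and cls is not interface
                    and hasattr(cls, attr_name)):
                matches.append(cls)
        return matches

    # e.g. find_implementations(plugins, PluginBase, "handles_upload")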

And what does it even matter if it works?

What if I use Bolt to generate a quick screen for a PoC? Or use Claude to create a CSS print preview of a 30-page Medicare form? Or to convert a component's styles from MUI to Tailwind? What if all these things are correct?

This whole "OSS repos will ban LLM-generated code" thing is a bit absurd.

> For what it's worth, I do actually use the tools, albeit very intentionally and sparingly.

How sparingly? Enough to see how it's constantly improving?


> How would you even know? Seriously, if I use ChatGPT to generate a one-off function for a feature I'm working on that searches all classes for one that inherits a specific interface and attribute, are you saying you'd be able to spot the difference?

I don't know; that's the problem. Because I can't know, I now have to do full in-depth reviews no matter what. That is the "judging" I talk about, tongue in cheek, in the blog.

> How sparingly? Enough to see how it's constantly improving?

Nearly daily. To be honest, I have not noticed much improvement year over year in how they fail. They still break in the exact same dumb ways now as they did before. Sure, they might reliably generate syntactically correct code now, and it might even work. But they still consistently fail to grok the underlying reasons things exist.

But I am writing my own versions of these agentic systems to use for some rote menial stuff.


So you weren't doing in-depth reviews before? Are these people you know? And now you just don't trust them because they include a tool in their workflow?



