Hacker News | threetonesun's comments

It's also a standard right handed strat, which seems like an oversight for a guy famous for playing with a right handed strat flipped upside down.

I am 100% sure that AI with guardrails will become the dominant kind of model as these tools become more widely adopted, and the bigger issue you should be concerned with is whether you can even tell what those guardrails are.

You can't, and that is the danger. These tools are one of many ways to drive "right-think" at scale, against the user's knowledge and wishes.

Seeing a Substack email collection box where you have to agree to whatever its terms are to subscribe, with a skip-to-content link of "No, I'm a coward," is... an experience. I'll take your word that he's an excellent writer; if there's an RSS feed, maybe I'll subscribe.

Oh, I just edited it with developer tools to "No thank you, and I'm brave" so that clicking it wouldn't turn me into a coward

Most Substacks have an RSS feed (I'm not sure if one can disable it or not); in this case: https://samkriss.substack.com/feed
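
For anyone who wants to script that: Substack feeds are ordinary RSS 2.0, so the standard library is enough to pull titles out. A minimal sketch, parsing an inline sample so it runs offline; the `https://<name>.substack.com/feed` URL pattern is the only Substack-specific assumption, and the sample titles are invented:

```python
import xml.etree.ElementTree as ET

# Inline RSS 2.0 sample standing in for a fetched Substack feed
# (for a live feed, fetch https://<name>.substack.com/feed first).
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0"><channel>
  <title>Example Blog</title>
  <item><title>First post</title><link>https://example.com/p/first</link></item>
  <item><title>Second post</title><link>https://example.com/p/second</link></item>
</channel></rss>"""

def feed_items(xml_text):
    """Return (title, link) pairs for each <item> in an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    return [(item.findtext("title"), item.findtext("link"))
            for item in root.iter("item")]

print(feed_items(SAMPLE_RSS))
```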

This is painful to me on three levels: 1. Real estate costs have gone up so much that it's prohibitively expensive to do something this grand. 2. Advertising is now a race to the bottom, where showing car ads on websites has near-zero cost and captures all the return compared to something novel like this. 3. It's impossible to find a car like a 90s Miata these days, because manual transmissions are almost dead and every car has had to get heavier to carry enough safety features to survive being T-boned by a Cybertruck.

Agree on the rest, but thankfully for #3 a modern base ND Miata with the 1.5 is pretty close in weight to an NA, thanks to a lot of weight-saving work by Mazda.

I mainly use Siri for cooking timers. I really enjoyed the brief period when it started flipping 50 minutes and 15 minutes, and then it went back to normal, for some reason, but not before I started using times like 14 minutes and 59 seconds or 51 minutes to make it think just a little harder.


My Siri just forgets to confirm the timer. So I go find my phone, swearing at it, only to figure out it did set a timer, but for the wrong time. It simply didn't tell me.

Yeah, if you have these tools in place to validate its changes, you can quickly iterate with it to the right results. But think through how it's making UI changes and it quickly becomes obvious why it can make absolutely wrong and terrible guesses about the implementation details: it can't _see_ what it's doing, or interact with it; it's just pattern-matching other implementations it's seen.


Yeah, the next breakthrough for Codex or Claude Code would be to actually use/test the app like a real human would during the development process.


Here's a document produced by Claude Code using my Showboat testing tool this morning to help explore SeaweedFS (a local S3 clone) - it includes trying things out with curl and getting screenshots from Chrome using my Rodney tool: https://github.com/simonw/research/blob/main/seaweedfs-testi...


I rarely see LLMs generate code that is less readable than the rest of the codebase it's been created for. I've seen humans who are short on time or economic incentive produce some truly unreadable code.

Of more concern to me is that when it's unleashed on the ephemera of coding (Jira tickets, bug reports, update logs) it generates so much noise you need another AI to summarize it for you.


The main coding agent failure modes I've seen:

- Proliferation of utils/helpers when there are already ones defined in the codebase. Particularly a problem for larger codebases

- Tests with bad mocks and bail-outs due to missing things in the agent's runtime environment ("I see that X isn't available, let me just stub around that...")

- Overly defensive off-happy-path handling, returning null or the semantic "empty" response when the correct behavior is to throw an exception that will be properly handled somewhere up the call chain

- Locally optimal design choices with very little "thought" given to ownership or separation of concerns

All of these can pretty quickly turn into a maintainability problem if you aren't keeping a close eye on things. But broadly I agree that line-per-line frontier LLM code is generally better than what humans write and miles better than what a stressed-out human developer with a short deadline usually produces.
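
The off-happy-path point is easiest to see side by side. A hypothetical sketch in Python (all names invented for illustration): the first version swallows the problem the way agents tend to, the second raises and lets some caller up the chain handle it.

```python
# Hypothetical illustration of the "overly defensive" failure mode;
# none of these names come from a real codebase.
users = {"alice": {"email": "alice@example.com"}}

def get_user_defensive(name):
    # Agent-style: return a semantic "empty" value on the unhappy path.
    # A typo'd name now silently yields None, and every caller needs a
    # None-check to avoid an AttributeError far from the real bug.
    return users.get(name)

def get_user(name):
    # Throwing version: the failure is loud, carries context, and can be
    # handled once, wherever up the call chain it actually makes sense.
    try:
        return users[name]
    except KeyError:
        raise LookupError(f"unknown user: {name!r}") from None
```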


Oh god, the bad mocks are the worst. Try adding instructions not to make mocks and it creates "placeholders", ask it to not create mocks or placeholders and it creates "stubs". Drives me mad...

To add to this list:

- Duplicate functions when you've asked for a slight change in functionality (e.g. write_to_database and write_to_database_with_cache), never actually updating all the calls to the old function, so you end up with a split codebase.

- In a similar vein, the backup code path of "else: do a stupid static default" instead of erroring, which would be much more helpful for debugging.

- Strong desires to follow architecture choices it was trained on, regardless of instruction. It might have been trained on some presumably high quality, large and enterprise-y codebases, but I'm just trying to write a short little throwaway program which doesn't need the complexity. KISS seems anathema to coding agents.


I'm sort of happy to see all these things I run into listed out as issues people have so I know it's not just me experiencing and being bothered by these behaviors.


All of these bother me, but the null/default-value returns drive me insane. It makes the code more verbose and difficult to follow, and in many cases makes the code force its way through problems that should be making it stop. Please, LLM, please just throw an exception!


I keep Inbox Zero, mostly, using this system: if I haven't read it, how important could it have been? Ctrl+A, Del gets you to zero.


Someone probably briefly thought they brought Skynet online via AI powered drones.


Well, it was a heritage thing. The original remakes with the center speedo and plenty of physical buttons were fun. The digital circle thing is an abomination.

