There’s only one solution to this problem at this point: make AI significantly less affordable and accessible. Raise the prices of the Pro / Plus / Max / Ultra tiers and introduce time limits (like screen time), especially for minors, once LLMs can detect age better. This would be a win-win: (a) people would be forced to go back to the “old ways” of doing whatever it is that AI was doing for them, and (b) we won’t need as many data centers as the AI companies are projecting today.
Coding/programming as a skill differentiator is most likely "dead" - software DEVELOPMENT will indeed live on, but it will need a higher degree of well-roundedness and ownership (which also means leaner SRE/DevOps/PM/QA functions).
My message to the CTO of Honeycomb.io (who apparently wrote this post): please avoid getting philosophical and controversial to gin up curiosity about your AI platform. If you want to highlight the benefits of your platform then do so earnestly and objectively. Please don't mask marketing with an excoriation of a profession that has never been well-defined (or has always been defined to fit into an organization's political landscape for the most part). And you guys (like every other SRE/Ops platform) capitalized on that structural divide and deservedly got rich by selling licenses to these teams. I don't think you can come in now with this holier-than-thou best practice messaging just because platforms like yours have zero moat in this post-CC/Codex world.
At scale, I don't see a net negative in AI merging "shit by itself", provided the developer (or the agent) ensures sufficient e2e, integration, and unit test coverage prior to every merge, and in return I get my team cranking out features at 10x speed.
The reality is that probably 99.9999% of code bases on this earth pre-date LLMs (though that share might drop soon, who knows), and organizing them so that coding agents can produce consistent results from sprint to sprint will require a lot of plumbing work from all dev teams. That will include refactoring, documentation improvements, building consensus on architectures, and of course reshaping the testing landscape. So SWEs will have a lot of dirty work to do before we reach the aforementioned "scale".
However, a lot of platforms are being built from the ground up today, in a post-CC (Claude Code) era. Those should be ready to hit that scale today.
Yup! Software engineers aren't going to be out of work anytime soon, but I'm acting more like a CTO or VPE with a team of agents now, rather than just a single dev with a smart intern.
I am not in the tech field anymore, and I use exclusively free models and CLIs. They are mostly of Chinese origin. I call them my little software sweatshop.
I hate this paradigm because it pits me against my tools as if we're adversaries. The tools are prone to rewriting or even deleting the tests, so we have to write other tools to sandbox agents from each other and check each other's work, and I just don't see a way to get deterministically good results over just building shit myself. It comes down to needing high trust in my tools to feel confident in what we're shipping.
The key is that at the end of the day productivity is king, which is a polite term for cutting head count and/or delivering at a ridiculously higher velocity.
You can always get deterministically good results at your own pace. But most likely you won't achieve that at the speed and scale of a coding agent running in 4-5 worktrees, 24/7, without food or toilet breaks, especially if the latter mostly achieves the product/business goals at an "OK" quality (in which case you will perhaps be measured by how well you can steer these agents to elevate that quality beyond "OK" without sacrificing too much scale).
I think the opposite will happen - leadership will forgo this attitude of "reverse course on the first outage".
Teams will figure out how to mitigate such situations in the future without sacrificing the potential upside of "fully autonomous code changes made on production systems" (e.g. invest more in a production-like env for test coverage).
Software engineering purists have to get out of some of these religious beliefs.
> Software engineering purists have to get out of some of these religious beliefs
To me, the Claude superfans like yourself are the religious ones, the way you run around proffering unsubstantiated claims like this and believe in / anthropomorphize these tools way too much. Is it because "Anthropic" is an abbreviation of "Anthropomorphic"?
I would have been in the skeptics' camp 3-4 months ago. Opus 4.5 and GPT-5.2 have changed my mind. I'm not talking about mere code completion. I am talking about these models AND the corresponding agents playing a really, really capable software engineer + tester + SRE/Ops role.
The caveat is that, as things stand today, we have to be fairly good at steering them in the right direction. It is exhausting to do it the right way.
I agree the latest gen of models, Opus 4.5 and Gemini 3, are more capable. 5.2 is OpenAI squeezing as much as they can out of 4, because they haven't had a successful pre-training run since Ilya left.
I disagree that they are really, really capable engineers et al. They have moments where they shine like one. They also have moments where they perform worse than a new grad/hire. That is not what a really, really capable engineer looks like. I don't see this fundamentally changing, even with all the improvements we are seeing. The problem is lower level and more core than anything adding more layers on top can resolve; the layers only address it as best they can.
In my own anecdotal experience, Claude Code found a bug in production faster than I could. I was the author of said code, which was written by hand 4 years ago. The GP's claim perhaps is not all that unsubstantiated. My role is moving more towards QA/PM nowadays.
For sure. Not hard fails, but bad fixes. It confidently thought it fixed a bug, but it really didn't. I could only tell because I tried reproducing the bug before/after (it was fairly complex). Ultimately I believe it wasn't given sufficient context. It has certainly failed to do what I asked in round 1 and round 2, but eventually got it right (a rendering issue for a barcode designer).
These incidents have become less and less frequent over the last year; switching to Opus reduced the failure rate. Same thing for code reviews. Most of the output is fluff, but it does give useful feedback if the instructions are good. For example, I asked for a blind code review of a PR ("Review this PR"), and it gave some generic commentary. I made the prompt more specific ("Follow the API changes across modules and see the impact"), and it found a serious bug.
The number of times I've had to give up in frustration has been going down over the last year. So I tend to believe a swarm of agents could do a decent job of autonomous development/maintenance over the next few years.
Even lesser agents are incredibly good and incredibly fast at using tools to inspect the system & come up with ideas for things to check, and then checking them. I absolutely agree: we will 100% give the agents far more power. A browser, a debugger for the server that works with that browser instance, a database tool, an OpenTelemetry tool.
The teams are going to figure out how to mitigate bad deploys by using even more AI & giving it even better information gathering.
It's simple -- the more high-minded and snobbish the developer class is (while extracting the highest salaries in the world), and the longer it maintains this unreal amount of gatekeeping, the more the non-developer community (especially those at the leadership level) will revel in the prospect of eliminating developers from the value chain.
I think you're onto something. Replace "developers" with "doctors" in that statement and you've described healthcare in the mid-1900s. Replace it with "masons" and you've described medieval times. There is always a specialized class.
I can't wait for indie developers to build super-agents that commoditize providers like Honeycomb.io and more importantly clone all their features and offer them up for free as OSS.
Sounds like you don't know what a nightmare of version compat and bespokeness ops/observability is. This is going to be one of the harder things for LLMs to do, because everyone is running on some snowflake held together with duct tape.
Fair point - my statement is more about stealing market share for simpler integrations by undercutting them on price.
And I don't want to trivialize the reality of enterprise platforms where bespoke connectors rule. I have dealt with migrations of business-critical platforms, where managing version compatibility and ensuring none of the integrations regressed was par for the course. I am not even saying that makes me qualified to replicate Honeycomb.io. But I do think someone with a deep technical background in building observability platforms, armed with Claude Code or Codex plus the right set of MCPs and all the necessary tooling, should be able to build a clone of Honeycomb.io.
Maybe it won't be a fast turnaround like a typical vibe-coded project, but even if it is a month-long project to get to just 60% feature parity, these vendors will have to sit up and pay attention.
As you immediately trivialize something it seems you know very little about.
MCPs are outdated, btw; attaching a bunch of MCPs to your messages pollutes the context. If you don't do this, you can build agents on gemini-3-flash that are better than Copilot/Codex. Claude Code is probably the leader here, but it's still definitely not capable of what you think it is.
I assume then that you are retired or not a programmer, since you are wishing for the last bastions of companies that pay programmers to melt with the ice sheets, leaving a desert of no paid coding work.
Just to be clear - the hook is deterministic, but the subagent running with an MCP server loaded is not. For medium/large PRs, it can run out of context window, or just forget what it is trying to do, get lazy, and say "Everything is good, ready to merge!" when in fact tests are failing or there are still unaddressed PR comments.
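To make "deterministic" concrete, here's roughly the shape of that half of the setup - a Python sketch, not my exact hook. The point is that the verdict comes from an exit code, never from agent prose:

```python
# Sketch of a deterministic merge gate (illustrative only).
# Assumes pytest is on PATH; swap in your own test runner.
import subprocess
import sys

def tests_actually_pass() -> bool:
    """Run the suite directly instead of trusting the subagent's summary."""
    result = subprocess.run(["pytest", "-q"])
    return result.returncode == 0

if __name__ == "__main__":
    if not tests_actually_pass():
        print("Blocking merge: tests are failing, whatever the agent claimed.")
        sys.exit(1)
```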
Sure, but that MCP still missed actionable comments that were marked as "Out of Scope" or "Outside the PR" - and this approach doesn't incur the context-window loss of having another MCP instantiated, either. Anyway, give gtg a competitive look against the MCP - you should be able to see the difference.
I have dystonia, which often stiffens my arms in a way that makes it impossible for me to type on a keyboard. Speech-to-text apps like SuperWhisper have proven very helpful for me in such situations. I am hoping to get a similar experience out of "Handy" (very apt naming, from my perspective).
I do, however, wonder if there is a way all these speech-to-text tools can get to the next level. The generated text should not just be a verbatim copy of what I said; depending on the context, it should elaborate. For example, if my cursor is actively inside an editor/IDE with some code, my coding-related verbal prompts should actually generate the right/desired code in that IDE.
Perhaps this is a bit of combining speech-to-text with computer use.
I made something called `ultraplan`. It's a CLI tool that records multi-modal context (audio transcription via local Whisper, screenshots, clipboard content, etc.) into a timeline that AI agents like Claude Code can consume.
I have a Claude skill, `/record`, that runs the CLI and starts a new recording. I debug, research, etc., then say "finito" (or choose your own stopword). It outputs a markdown file with your transcribed speech interleaved with screenshots and any text you copied. You can say other keywords like "marco" and it will take a screenshot hands-free.
When the session ends, Claude reads the timeline (e.g. looks at the screenshots) and gets to work.
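If you're curious what the loop looks like, here's a minimal Python sketch - illustrative only; it assumes the `openai-whisper` package and macOS's `screencapture`, every name in it is made up, and the real tool is structured differently:

```python
# Rough sketch of a stopword-driven recording loop in the spirit of
# `ultraplan` (not the actual implementation).
import datetime
import subprocess

import whisper  # pip install openai-whisper

MODEL = whisper.load_model("base")
STOPWORD = "finito"   # ends the session
SNAP_WORD = "marco"   # takes a hands-free screenshot

def transcribe_chunk(wav_path: str) -> str:
    """Run local Whisper on one short recorded audio chunk."""
    return MODEL.transcribe(wav_path)["text"].strip()

def record_timeline(chunks: list[str], out_md: str = "timeline.md") -> None:
    """Interleave transcribed speech and screenshots into one markdown file."""
    with open(out_md, "w") as md:
        for wav in chunks:
            text = transcribe_chunk(wav)
            stamp = datetime.datetime.now().isoformat(timespec="seconds")
            if SNAP_WORD in text.lower():
                shot = f"shot-{stamp.replace(':', '-')}.png"
                subprocess.run(["screencapture", "-x", shot], check=True)
                md.write(f"![screenshot]({shot})\n\n")
            md.write(f"**{stamp}** {text}\n\n")
            if STOPWORD in text.lower():
                break  # session over; the agent reads timeline.md next
```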
I can clean it up and push it to GitHub if anyone would get use out of it.
I totally agree with you, and largely what you’re describing is one of the reasons I made Handy open source. I really want to see something like this, and to see someone experiment with making it happen. I did hear of some people playing with small local models (moondream, qwen) to get more context from the computer itself.
I initially had a ton of keyboard shortcuts in Handy for myself when I had a broken finger and was in a cast. It let me play with the simplest form of this contextual thing, as shortcuts could effectively be mapped to certain apps with very clear use cases.
What you said is possible by feeding the output of speech-to-text tools into an LLM. You can prompt the LLM to make sense of what you're trying to achieve and create sets of actions. With a CLI it’s trivial: you can have your verbal command translated into working shell commands. With a GUI it’s slightly more complicated, because the LLM agent needs to know what you see on the screen, etc.
That CLI bit I mentioned earlier is already possible. For instance, on macOS there’s an app called MacWhisper that can send dictation output to an OpenAI‑compatible endpoint.
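As a rough illustration of the glue (not MacWhisper's actual plumbing - the endpoint URL, model name, and prompt below are all placeholders), it's only a few lines:

```python
# Sketch: turn dictated text into a shell command via an
# OpenAI-compatible endpoint. All names/URLs here are placeholders.
import subprocess

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")

def dictation_to_command(spoken: str) -> str:
    """Ask the LLM to turn a spoken request into a single shell command."""
    resp = client.chat.completions.create(
        model="local-model",  # placeholder model name
        messages=[
            {"role": "system",
             "content": "Translate the user's request into one POSIX shell "
                        "command. Reply with the command only, no prose."},
            {"role": "user", "content": spoken},
        ],
    )
    return resp.choices[0].message.content.strip()

cmd = dictation_to_command("show the ten largest files in this directory")
print(f"About to run: {cmd}")
if input("ok? [y/N] ").lower() == "y":  # always confirm before executing
    subprocess.run(cmd, shell=True)
```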
I was just thinking about building something like this - looks like you beat me to the punch. I will have to try it out.
I'm curious whether you can give commands just as easily as wording you want cleaned up. I could see a model getting confused between editing the dictated input into text to be inserted and responding to it as a command. Sorry if that's unclear; it might be better if I just try it.