The real insight buried in here is "build what programmers love and everyone will follow." If every user has an agent that can write code against your product, your API docs become your actual product. That's a massive shift.
I'm very much looking forward to this shift. It is SO MUCH more pro-consumer than the existing SaaS model. Right now every app feels like a walled garden, with broken UX, constant redesigns, enormous amounts of telemetry and user manipulation. It feels like every time I ask for programmatic access to SaaS tools in order to simplify a workflow, I get stuck in endless meetings with product managers trying to "understand my use case", even for products explicitly marketed to programmers.
Using agents that interact with APIs represents people being able to own their user experience more. Why not craft a frontend that behaves exactly the the way YOU want it to, tailor made for YOUR work, abstracting the set of products you are using and focusing only on the actual relevant bits of the work you are doing? Maybe a downside might be that there is more explicit metering of use in these products instead of the per-user licensing that is common today. But the upside is there is so much less scope for engagement-hacking, dark patterns, useless upselling, and so on.
> Right now every app feels like a walled garden, with broken UX, constant redesigns, enormous amounts of telemetry and user manipulation
OK, but: that's an economic situation.
> so much less scope for engagement-hacking, dark patterns, useless upselling, and so on.
Right, so there's less profit in it.
To me it seems this will make the market more adversarial, not less. Increasing amounts of effort will be expended to prevent LLMs interacting with your software or web pages. Or in some cases exploit the user's agentic LLM to make a bad decision on their behalf.
the "exploit the user's agentic LLM" angle is underappreciated imo. we already see prompt injection attacks in the wild -- hidden text on web pages that tells the agent to do things the user didn't ask for. now scale that to every e-commerce site, every SaaS onboarding flow, every comparison page.
it's basically SEO all over again but worse, because the attack surface is the user's own decision-making proxy. at least with google you could see the search results and decide yourself. when your agent just picks a vendor for you based on what it "found," the incentive to manipulate that process is enormous.
we're going to need something like a trust layer between agents and the services they interact with. otherwise it's just an arms race between agent-facing dark patterns and whatever defenses the model providers build in.
Maybe. Or maybe services will switch to charging per API call or whatever instead of monthly or per-seat. Who can predict the future?
I mean, services _could_ make it harder to use LLMs to interact with them, but if agents are popular enough they might see customers start to revolt over it.
This extends further than most people realize. If agents are the primary consumers of your product surface, then the entire discoverability layer shifts too. Right now Google indexes your marketing page -- soon the question is whether Claude or GPT can even find and correctly describe what your product does when a user asks.
We're already seeing this with search. Ask an LLM "what tools do X" and the answer depends heavily on structured data, citation patterns, and how well your docs/content map to the LLM's training. Companies with great API docs but zero presence in the training data just won't exist to these agents.
So it's not just "API docs = product" -- it's more like "machine-legible presence = existence." Which is a weird new SEO-like discipline that barely has a name yet.
The "start over in an hour" philosophy is underrated. I've been running my own infrastructure for years and the single most empowering thing isn't the setup, it's the peace of mind that you can just nuke it and spin up somewhere else.
Knowing that, I started looking at every SaaS subscription very differently.
At the lower or easier end, there’s your standard containerisation tools like Docker Compose or the Podman equivalents. Just move your compose files and zip the mount folders and you can move stuff easily enough.
Middle ground you’ve got stuff like Ansible for if you want to install things without containers, but still want it to be scripted. I don’t use these much since they feel like the worst of both worlds.
Higher end in terms of effort is using something like NixOS, where you get basically Terraform for everything in your distro.
The benchmarks are cool and all but 1M context on an Opus-class model is the real headline here imo. Has anyone actually pushed it to the limit yet? Long context has historically been one of those "works great in the demo" situations.
Boris Cherny, creator of Claude Code, posted about how he used Claude a month ago. He’s got half a dozen Opus sessions on the burners constantly. So yes, I expect it’s unmetered.
Opus 4.5 starts being lazy and stupid at around the 50% context mark in my opinion, which makes me skeptical that this 1M context mode can produce good output. But I'll probably try it out and see
Has a "N million context window" spec ever been meaningful? Very old, very terrible, models "supported" 1M context window, but would lose track after two small paragraphs of context into a conversation (looking at you early Gemini).
Umm, Sonnet 4.5 has a 1m context window option if you are using it through the api, and it works pretty well. I tend not to reach for it much these days because I prefer Opus 4.5 so much that I don't mind the added pain of clearing context, but it's perfectly usable. I'm very excited I'll get this from Opus now too.
If you're getting on along with 4.5, then that suggests you didn't actually need the large context window, for your use. If that's true, what's the clear tell that it's working well? Am I misunderstanding?
Did they solve the "lost in the middle" problem? Proof will be in the pudding, I suppose. But that number alone isn't all that meaningful for many (most?) practical uses. Claude 4.5 often starts reverting bug fixes ~50k tokens back, which isn't a context window length problem.
Things fall apart much sooner than the context window length for all of my use cases (which are more reasoning related). What is a good use case? Do those use cases require strong verification to combat the "lost in the middle" problems?
Living in the EU, I'm skeptical any of this happens. Our leaders have been pretty reluctant to push back on anything so far and most of these assets are private anyway.
Hi Troy, just wanted to let you know that I just sent you an email! :)
Also, just to be sure, I sent it to on-board.ai domain as well, as that seemed like the correct website (onboard.ai just showed "for sale" page). Might help some others too.
Google login also seems to be having issues, multiple people reported to me that the login isn’t working and they’ve been logged out of their Google accounts.
Yes, I tried logging in today in two distinct Google accounts on separate Chrome profiles and it would sign me out in about ~ 5 seconds after logging in. And the login process was very sluggish.
> Tens of thousands of people each year receive a series of shots to prevent rabies after a possible exposure. It normally costs between $1,200 and $6,800. Not in this case.
Most Americans don't realize how bad they have it. They've grown accustomed to being punched and slapped around and so won't rebel. They'll keep giving all of what little and shrinking of what they have to legitimized criminals.
We do have Internet, but we've gotten used to being told that the Internet lies to us. We've been repeatedly told that people wait months for treatment in the UK, and that Canadians are streaming over the border to get health care in America.
We read horror stories like this one, but say "Whew, glad that won't happen to me." We imagine that because of capitalism, if our insurance company screws us over, we'll change to the next one -- freedom we wouldn't have if we had a national health care.
It never seems to occur to us that all of the private insurers have a capitalism-driven goal of maximizing profits, and national insurers don't.
reply