>Context transfer between [sub]agents is also poor
That's the main point of sub-agents, as far as I can tell. They get their own context, so it's much cheaper: you divide the work into chunks and let a sub-agent handle each one. That actually ties in nicely with the article's earlier emphasis on careful context management.
>Before you reach for a frontier model, ask yourself: does this actually need a trillion-parameter model?
>Most tasks don't. This repo helps you figure out which ones.
About a year ago I was testing Gemini 2.5 Pro and Gemini 2.5 Flash for agentic coding. I found they could both do the same task, but Gemini Pro was way slower and more expensive.
This blew my mind because I'd previously been obsessed with "best/smartest model", and suddenly realized what I actually wanted was "fastest/dumbest/cheapest model that can handle my task!"
I haven't tested it extensively, but when I used it with Claude Code it was reasonably fast (though actual Claude was way faster); when I tried to use the API directly, it was super slow.
My guess is that they're filtering the traffic and prioritizing certain types of it. With my own script, I ran into a rate limit after 7 requests!
Claude Code spends most of its time poking around the files. By default it has no knowledge of the project (no file index, etc.), unless they've changed that recently.
When I was using it a lot, I created a startup hook that just dumped a file listing into the context, or the actual full code on very small repos.
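Roughly the shape of it (a sketch, not my actual hook; the details of registering a SessionStart hook in Claude Code's settings are from memory and may have changed, and the skip list and threshold are arbitrary):

    #!/usr/bin/env python3
    # Startup-hook sketch: register as a SessionStart hook in Claude Code's
    # settings; what it prints to stdout should land in the session context.
    import os

    SKIP = {".git", "node_modules", "__pycache__", ".venv"}
    FULL_DUMP_LIMIT = 25  # below this many files, dump full contents

    paths = []
    for root, dirs, files in os.walk("."):
        dirs[:] = [d for d in dirs if d not in SKIP]
        paths.extend(os.path.join(root, f) for f in files)
    paths.sort()

    if len(paths) <= FULL_DUMP_LIMIT:
        # Very small repo: hand over the actual code up front.
        for p in paths:
            print(f"===== {p} =====")
            try:
                with open(p, encoding="utf-8", errors="replace") as fh:
                    print(fh.read())
            except OSError:
                pass
    else:
        # Otherwise just the listing, so the agent doesn't have to poke around.
        print("\n".join(paths))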
I also got some gains from using a custom edit tool I made which can edit multiple chunks in multiple files simultaneously. It was about 3x faster. I had some edge cases where it broke though, so I ended up disabling it.
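The core idea is simple to sketch, though (this isn't my actual tool, just the shape of it): take a batch of exact-match replacements spanning several files and apply them in one call, so you pay one tool round-trip instead of one per edit.

    # Batched multi-file edit: apply many exact-match replacements in one call.
    from dataclasses import dataclass

    @dataclass
    class Edit:
        path: str
        old: str  # exact text to find (must occur exactly once)
        new: str  # replacement text

    def apply_edits(edits: list[Edit]) -> None:
        # Group by file so each file is read and written only once.
        by_file: dict[str, list[Edit]] = {}
        for e in edits:
            by_file.setdefault(e.path, []).append(e)
        for path, file_edits in by_file.items():
            with open(path, encoding="utf-8") as fh:
                text = fh.read()
            for e in file_edits:
                # Requiring exactly one match is where tools like this get
                # brittle: whitespace drift or duplicated snippets break it.
                if text.count(e.old) != 1:
                    raise ValueError(f"{path}: need exactly one match for {e.old!r}")
                text = text.replace(e.old, e.new)
            with open(path, "w", encoding="utf-8") as fh:
                fh.write(text)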
What's the difference between resetting a container and resetting a VPS?
On my local machine I run it under its own user, so I can access its files but it can't access mine. I'm not a security expert though, so I'd love to hear whether that's actually solid.
On my $3 VPS it has root, because that's the whole point (it's my sysadmin). If it blows the box up, I wanna say "I'm down $3", but it's not even that, since I can just restore it from a backup.
This is basically identical to the ChatGPT/GPT-3 situation ;) You know OpenAI themselves keep saying "we still don't understand why ChatGPT is so popular... GPT was already available via API for years!"
ChatGPT is quite different from GPT-3. Using GPT-3 directly to have a nice dialogue simply doesn't work for most purposes. Making it usable for a broad audience took quite some effort, including RLHF, which was not a trivial extension.
I have a separate removable SSD I can boot from to work with Claude in a dedicated environment. It's nice being able to offload environment setup and whatnot to the agent. That environment has WiFi credentials for an isolated LAN, and I'm much more permissive with Claude on that system. I even automatically allow WebSearch, but not WebFetch (a much larger injection surface). It still can't do anything requiring sudo.
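For anyone who wants to replicate that split, the allow/deny lists go in Claude Code's settings file; something like this (schema from memory, so check the current docs):

    {
      "permissions": {
        "allow": ["WebSearch"],
        "deny": ["WebFetch"]
      }
    }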
They are not. Many people are doing this; I don't think there's enough data to say "most," but there are at least anecdotal discussions of people buying Mac minis for the purpose. I know someone who's running it on a spare Mac mini (but it has Internet access and some credentials, so...).