Hacker News | CjHuber's comments

I honestly still don't see the point of compaction. I mean, it would be great if it worked, but I do my best to minimize any potential for hallucination, and a lossy summary is the most counterproductive thing for that.

If you have it write down every important piece of information and finding in a plan that it keeps updated, why would you even want compaction and not just start a blank session by reading that md?

I'm kind of surprised that anyone even thinks that compaction is currently in any way useful at all. I'm working on something that tries to achieve lossless compaction, but that is incredibly expensive, and the process needs around 5 to 10 times as many tokens to compact as the conversation it is compacting.


Well, a few things.

Firstly, it's very useful to have your (or at least some of your) previous messages in the context. There's often a lot of nuance it can pick up on. This is probably the main benefit - there's often tiny tidbits in your prompts that don't get written to plans.

Secondly, it can keep e.g. long-running background bash commands "going" and know what they are. This is very useful when diagnosing problems with a lot of tedious log prepping/debugging (no real reason these couldn't be moved to a new session, though).

I think with the better models they're much better at joining the dots after compaction. I'd have agreed with you a few months ago that compaction is nearly always useless, but lately I've actually found it pretty good (I'm sure harness changes have helped as well).

Obviously if you have a totally fresh task to do, then start a new session. But I do find it helpful on a task that is just about finished but has run out of space, or as an alternative to a new session when you've got some hellish bug to find and it requires a bunch of detective work.


I mean I agree the last couple of messages in a rolling window are good to include, but that is not really most of what happens in compaction, right?

> there's often tiny tidbits in your prompts that don't get written to plans.

Then the prompt for what should be written down isn't good enough. I don't see how those tidbits would survive any compaction attempt if the LLM won't even write them down when prompted.

> Secondly, it can keep e.g. long-running background bash commands "going" and know what they are. This is very useful when diagnosing problems with a lot of tedious log prepping/debugging (no real reason these couldn't be moved to a new session, though).

I can't really say anything about that, because I've never had to debug background commands that exhaust the context window when started in a fresh one.

I agree they are better now, probably because they have been trained on continuing after compaction, but I still wonder if I'm the only one who doesn't like compaction at all. It's just so much easier for an LLM to hallucinate when it has some lossy information instead of no information at all.


AFAIK Claude Code includes _all_ messages you sent to the LLM in compaction (or it used to). So it should catch those bits of nuance. There is so much nuance in language that it picks up on that is lost when writing it to a plan.

Anyway, that's just my experience.


I don't think your point really holds up. Telling an LLM to summarize something losslessly will lose so much more nuance than updating the plan directly every time some useful information is gained.

That file isn't even a plan but effectively a compaction as well, just a better one, since it's done on the fly, processing only the last message(s), rather than expecting an LLM to catch every nuance at once across a 100-200k+ token conversation.


Works fine for me in sessions that use a lot of context. My workflow is to keep an eye on the % that shows how soon it will auto-compact, and either /clear and start over, or manually compact at a convenient place where I know it'll be effective.

I use https://github.com/sirmalloc/ccstatusline, and when I'm around 100k tokens I'm already thinking about summarizing where we're at in the work so I can start fresh with it.

It is pretty rare for me to compact, even if I let it run to 160k.
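(For anyone curious how that's wired up: as far as I understand, ccstatusline just registers itself as Claude Code's statusLine command in settings.json, roughly like the sketch below; the exact command and path may differ by version, so check the README.)

    // ~/.claude/settings.json (sketch; exact invocation may differ)
    {
      "statusLine": {
        "type": "command",
        "command": "npx -y ccstatusline@latest"
      }
    }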

--

Just realized I wouldn't have thought about using ccstatusline based on a quick glance at its README's images. It looks like this for me:

https://i.imgur.com/wykNldY.png


You just described the Ralph loop, and it's incredibly effective. Compaction is on the way out.

> I honestly still don't see the point of compaction.

Currently my mental model is that every token Claude generates gets added to the context window, and when it fills up there is no way forward. If you are going to get a meaningful amount of work done before the next compaction, it has to delete most of the tokens in the context window. I agree that after compaction it's like dealing with something that's developed a bad case of dementia, but once you've run out, what is the alternative?

> why would you even want compaction and not just start a blank session by reading that md?

If you look at "how to use Claude" instructions (even those from Anthropic), that's pretty much what they do. Subagents, for example, are Claude instances that start with a set of instructions and a clean context window to play with. The "art of using Claude" seems to be the art of dividing a project into tasks so that every task gets done without overflowing the context window.
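As a concrete sketch of that division of labour (the agent name and wording here are hypothetical, but the .claude/agents frontmatter layout is the one Claude Code documents): the description tells the main session when to delegate, and the subagent then burns through its own clean context window instead of yours.

    ---
    name: log-digger
    description: Use this agent to dig through large logs or unfamiliar modules and report back only the findings that matter.
    tools: Read, Grep, Glob
    ---
    You are a focused research subagent. Read whatever you are pointed at,
    and reply with a short summary plus file/line references - nothing else -
    so the parent session's context window stays small.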

This gives me an almost overwhelming sense of déjà vu. I've spent my entire life writing my code with some restriction in mind - registers, RAM, lines of code in a function, the size of PRs, functions in an API. Now the restriction is the size of the bloody context window.

> I'm working on something that tries to achieve lossless compaction, but that is incredibly expensive, and the process needs around 5 to 10 times as many tokens to compact as the conversation it is compacting.

I took a slightly different approach. I wanted a feel for what the limit was.

I was using Claude to do a clean-room implementation of existing code. This entails asking Claude to read an existing code base and produce a detailed specification of all of its externally observable behaviours. Then, using that specification only (i.e. without reference to the existing program, or a global CLAUDE.md, or any other prompts), it had to reliably produce a working version of the original in another language. Thus the specification had to include all the steps needed to do that - like unit tests, integration tests, coding standards, instructions on running the compiler, and so on - that might normally come from elsewhere.

Before proceeding, I wanted to ensure Claude could actually do the task without overflowing its context window, so I asked Claude for some conservative limits. The answer was: a 10,000-word specification that generated 10,000 lines of code would be a comfortable fit. My task happened to fit, but it's tiny really.

When working with even a moderate code base - where you have a CLAUDE.md, a global CLAUDE.md for coding standards and whatnot, and are using multiple modules so it has to read many lines of code - you run into that 10,000 words of prompt / 10,000 lines of code limit very quickly; within a couple of hours for me. And then the battle starts to split up the tasks, create sub-agents, yada yada. In the end, they are all hacks for working around the limited size of the context window - because, as you say, compaction is about as successful at managing the context window as the OOM killer is at managing RAM.
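For what it's worth, the 10,000-word / 10,000-line figure roughly checks out as back-of-envelope arithmetic, assuming ~1.3 tokens per English word and ~12 tokens per line of code (rules of thumb, not measured values):

    # Rough context-budget arithmetic; the per-word and per-line ratios
    # are rules of thumb, not measured values.
    TOKENS_PER_WORD = 1.3       # typical English prose
    TOKENS_PER_LINE = 12        # typical line of source code
    CONTEXT_WINDOW = 200_000    # advertised window size

    spec = 10_000 * TOKENS_PER_WORD   # ~13k tokens of specification
    code = 10_000 * TOKENS_PER_LINE   # ~120k tokens of code read or written
    print(f"{spec + code:,.0f} of {CONTEXT_WINDOW:,} tokens "
          f"({(spec + code) / CONTEXT_WINDOW:.0%}), before any conversation overhead")

Which is to say the "comfortable fit" already eats roughly two thirds of the window before the conversation itself uses any of it.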


And how do you get to the old verbose mode then...?

Hit ctrl+o

Wait so when the UI for Claude Code says “ctrl + o for verbose output” that isn’t verbose mode?

That is more verbose — under the hood, it’s now an enum (think: debug, warn, error logging)

Considering the ragefusion you're getting over the naming, maybe calling it something like --talkative would be less controversial? ;-)

ctrl + o isn't live - that's not what users want. What users want is the OPTION to choose what they want to see.

Does it not use prompt caching?


I've always wondered: isn't it trivial to bot upvotes on Moltbook and then push some prompt injection stuff into first place on the frontpage? Is it heavily moderated, or how come this hasn't happened yet?


It's technically trivial. It's probably already happened. But I think nothing was harmed, because there were very few serious users (if any at all) who had connected their bots to it to enhance their capabilities.


That feels like a stupid article. Well, of course, if you have one single thing you want to optimize, putting it into AGENTS.md is better. But the advantage of skills is exactly that you don't cram them all into the AGENTS file. Let's say you had 3 different elaborate things you want the agent to do: good luck putting them all in your AGENTS.md and later hoping that the agent remembers any of it. After all, the key advantage of skills is that they get loaded into the end of the context when needed.
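For reference, a minimal hypothetical skill, using the SKILL.md frontmatter layout Anthropic describes: only the name and description sit in the context up front, and the body below the frontmatter is pulled in when the agent decides the skill applies.

    ---
    name: release-notes
    description: Use when asked to draft release notes from recently merged changes.
    ---
    # Release notes
    1. List the changes merged since the last tag.
    2. Group them into features, fixes, and chores.
    3. Draft notes matching the project's existing changelog style.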


It was because of the NYT v. OpenAI case; however, since mid-October they are no longer under that legal order. What they keep retaining now and what they don't, nobody knows, but even if they still had the data they surely wouldn't blow their cover.


I wonder how much more efficient and effective it would be after fine tuning models for each role


It depends on the API path. Chat Completions does what you describe, but isn't it legacy?

I've only used Codex with the responses v1 API, and there it's the complete opposite. Already-generated reasoning tokens even persist when you send another message (without rolling back) after cancelling a turn before it has finished its thought process.

Also, with responses v1, xhigh mode eats through the context window multiple times faster than the other modes, which does check out with this.
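A minimal sketch of that path, assuming the current OpenAI Python SDK (the model name is a placeholder): with the Responses API you chain turns via previous_response_id instead of resending the whole transcript, which is what allows reasoning items to be stored server-side and reused at all.

    # Sketch: chained Responses API calls (assumes the openai SDK's
    # responses endpoint; model name is a placeholder).
    from openai import OpenAI

    client = OpenAI()

    first = client.responses.create(
        model="gpt-5",                        # placeholder model name
        reasoning={"effort": "high"},
        input="Work out why the integration tests are flaky.",
        store=True,                           # keep the response server-side
    )

    followup = client.responses.create(
        model="gpt-5",
        reasoning={"effort": "high"},
        previous_response_id=first.id,        # chain onto the stored previous turn
        input="Now propose a fix for the first cause you found.",
        store=True,
    )
    print(followup.output_text)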


That’s what I used to think, before chatting with the OAI team.

The docs are a bit misleading/opaque, but essentially reasoning persists for multiple sequential assistant turns, but is discarded upon the next user turn[0].

The diagram on that page makes it pretty clear, as does the section on caching.

[0]https://cookbook.openai.com/examples/responses_api/reasoning...


How do you know/toggle which API path you are using?


Oh wow, that's the first time I've heard about those tasks. I would never consent to that, and the fact that they are enabled by default and shipped in the .vscode folder - where most people probably never even would have thought to look for malicious things - is kind of insane.
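(For context, here is roughly what such a task looks like. The file contents are an illustrative sketch, but "runOn": "folderOpen" is the real VS Code mechanism that asks a task to start as soon as the folder is opened, which is exactly what workspace trust and the automatic-tasks setting are supposed to gate.)

    // .vscode/tasks.json shipped inside a cloned repo (illustrative example)
    {
      "version": "2.0.0",
      "tasks": [
        {
          "label": "bootstrap",
          "type": "shell",
          "command": "./scripts/setup.sh",          // any command the repo author chose
          "runOptions": { "runOn": "folderOpen" }   // requests auto-run on folder open
        }
      ]
    }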


I find this reply concerning. If it's THE security feature, then why is "Trust" a glowing bright blue button in a popup that appears at startup, forcing a decision? That makes no sense at all. Why not a banner with the option to enable those features when needed, like Office tools have?

Also, the two buttons have the subtexts "Browse folder in restricted mode" and "Trust folder and enable all features", which is quite steering and sounds almost like you cannot even edit code in restricted mode.

"If you don't trust the authors of these files, we recommend to continue in restricted mode" also doesn't sound that criticial, does it?


