This is written for the Claude 3 models (Sonnet, Haiku, and Opus). While some lessons will still be relevant today, others will not be useful or necessary on smarter, RL’d models like Sonnet 4.5.
> Note: This tutorial uses our smallest, fastest, and cheapest model, Claude 3 Haiku. Anthropic has two other models, Claude 3 Sonnet and Claude 3 Opus, which are more intelligent than Haiku, with Opus being the most intelligent.
Yes, Chapters 3 and 6 are likely less relevant now. Any others? Specifically assuming the audience is someone writing a prompt that’ll be re-used repeatedly or needs to be optimized for accuracy.
That's not just them saving it locally to like `~/.claude/conversations`? Feels weird if all conversations are uploaded to the cloud + retained forever.
Even farther off topic, but this reminds me of the time my friends and I recorded a 3 minute long wav file that ended with a quiet “this is god. Can you hear me? I’d like to talk with you,” and set it to be the error sound on a friend’s PC.
Warning: A natural response to this is to instruct Claude not to do this in the CLAUDE.md file, but you’re then polluting the context and distracting it from its primary job.
If you watch its thinking, you will see references to these instructions instead of to the task at hand.
It’s akin to telling an employee that they can never say certain words. They’re inevitably going to be worse at their job.
A tip for those who both use Claude Code and are worried about token use (which you should be if you're stuffing 400k tokens into context even if you're on 20x Max):
1. Build context for the work you're doing. Put lots of your codebase into the context window.
2. Do work, but at each logical stopping point hit double escape to rewind to the context-filled checkpoint. Rewinding to that checkpoint doesn't cost you those tokens again.
3. Tell Claude your developer finished XYZ, have it read the work into context and give high-level and low-level feedback (Claude will find more problems with your developer's work than with yours).
If you want to have multiple chats running, use /resume and pull up the same thread. Hit double escape to rewind to the point where Claude has rich context but has not started down a specific rabbit hole.
I think it is something else. If you think about it, humans often write about correcting errors made by others: refactoring code, fixing bugs, and making code more efficient. I guess it triggers other paths in the model if we write that someone else did it. It is not about pleasing us but about our constant desire to improve things.
No. I have three MCPs installed and this is the only one that doesn’t need guidance. You’ll see Claude using it for search, finding references, and such. It’s a one-line install and takes no work to maintain.
The advantage is that Claude won’t have to use the file system to find files. And it won’t have to go read files into context to find what it’s looking for. It can use its context for the parts of code that actually matter.
And I feel like my results have actually been much better with this.
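For anyone curious what a one-line install looks like: the sketch below uses Claude Code's `claude mcp add` command with a hypothetical server name and package (the parent doesn't say which MCP it is), so treat the arguments as placeholders rather than a recipe.

```sh
# Hypothetical example - "code-nav" and the npx package name are placeholders,
# not a specific product. This registers a stdio MCP server with Claude Code.
claude mcp add code-nav -- npx -y example-code-nav-mcp-server

# Verify the server shows up in Claude Code's configured MCP list.
claude mcp list
```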
In my experience, jumping back like this is risky unless you explicitly tell it you made changes; otherwise they will get clobbered, because it will update files based on the old context.
Telling it to “re-read” xyz files before starting works though.
Why do you find this better than just starting again at that point? I'm trying to understand the benefit of using this 'trick', without being able to try it as I'm away from my computer.
Couldn't you start a new context and achieve the same thing, without any of the risks of this approach?
LLMs have no "memory", so this generally gives the model something to go off of. I forgot to add that I only do this if the change I'm making is related to whatever I did yesterday.
I do this because sometimes I just manually edit code and the LLM doesn't know everything that's happened.
I also find the best way to work with "AI" is to make very small changes and commit frequently. I truly think it's a slot machine, and if it does go wild, you can lose hours of work.
Hah, I do the same when I need to manually intervene to nudge the solution in the direction I want after a few failed attempts to reconstruct my prompt to avoid some undesired path the LLM really wants to go down.
I usually tell CC (or opencode, which I've been using recently) to look up the files and find the relevant code. So I'm not attaching a huge number of files to the context. But I don't actually know whether this saves tokens or not.
The benefit is you can use your preferred editor. No need to learn a completely new piece of software that doesn't match your workflow just to get access to agentic workflows. For example, my entire workflow for the last 15+ years has been tmux+vim, and I have no desire to change that.
Quick tip when working with Claude Code and Git: When you're happy with an intermediate result, stage the changes by running `git add` (no commit). That makes it possible to always go back to the staged changes when Claude messes up. You can then just discard the unstaged changes and don't have to roll back to the latest commit.
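A minimal sketch of that flow with plain git commands, in case it's useful (assumes Git 2.23+ for `git restore`; `git checkout -- .` is the older equivalent):

```sh
# Happy with the intermediate result: stage it as a checkpoint, no commit yet.
git add -A

# ...let Claude keep working on top of the staged snapshot...

# Review what has changed since the checkpoint (working tree vs. index).
git diff

# If Claude messed up, discard the unstaged edits and fall back to the
# staged checkpoint; the staged changes themselves are untouched.
git restore .
```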
You don't say that - you instruct the LLM to read files about X, Y, and Z. Putting the context in helps the agent plan better (next step) and write correct code (final step).
If you're asking the agent to do chunks of work, this will get better results than asking it to blindly go forth and do work. Anthropic's best practices guide says as much.
If you're asking the agent to create one method that accomplishes X, this isn't useful.
You don't have to think about it, you can just go try it. It doesn't work as well (yet) for me. I'm still way better than Claude at finding an initial heading.
In my experience, Claude will criticize others more than it will criticize itself. Seems similar to how LLMs in general tend to say yes to things or call anything a good idea by default.
I find it to be an entertaining reflection of the cultural nuances embedded into training data and reinforcement learning processes.
Interesting. In my experience, it's the opposite. Claude is too sycophantic. If you tell it that it was wrong, it will just accept your word at face value. If I give a problem to both Claude and Gemini, their responses differ, and I ask Claude why Gemini has a different response, Claude will just roll over and tell me that Gemini's response was perfect and that it messed up.
This is why I was really taken by Gemini 2.0/2.5 when it first came out - it was the first model that really pushed back at you. It would even tell me, unprompted, that it wanted x additional information to continue. Sadly, as Google has neutered 2.5 over the last few months, its independent streak has also gone away, and it's only slightly more individualistic than Claude/OpenAI's models.
I would guess the training data (conversational, as opposed to coding-specific solutions) is weighted towards people finding errors in others' work more than people discussing errors in their own. If you knew there was an error in your thinking, you probably wouldn't think that way.
It gives you the benefit of the doubt if you're coding.
It also gives you the benefit of the doubt if you're looking for feedback on your developer's work. If you give it a hint of distrust - "my developer says they completed this, can you check and make sure, give them feedback....?" - Claude will look out for you.