Hacker News | manishsharan's comments

How? And please don't say "use langxxx library."

I am looking for a language- and library-agnostic pattern, like MVC for web applications, or the Gang of Four patterns but for building agents.


The whole post is about not using frameworks; all you need is the LLM API. You could do it with plain HTTP without much trouble.


When I ask for patterns, I am seeking help with recurring problems I have encountered. Context management, for example: small LLMs (ones with small context windows) break down, get confused, and forget the work they have done or the original goal.


Start by thinking about how big the context window is, and what the rules should be for purging old context.

Design patterns can't help you here. The hard part is figuring out what to do; the "how" is trivial.
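To make one such rule concrete, here's a hedged sketch of a rolling token budget: keep the system prompt, drop the oldest messages once the window fills. Everything here (`count_tokens`, the message shape) is invented for illustration; a real implementation would use the provider's own tokenizer.

```python
# Illustrative context-purging rule: trim a chat history to a token budget,
# keeping the system prompt and the newest messages that still fit.

def count_tokens(text: str) -> int:
    # Crude stand-in: roughly 4 characters per token on average.
    return max(1, len(text) // 4)

def trim_history(messages, budget: int):
    """Keep system messages plus the newest non-system messages under budget."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    kept, used = [], sum(count_tokens(m["content"]) for m in system)
    for m in reversed(rest):  # walk newest-first
        cost = count_tokens(m["content"])
        if used + cost > budget:
            break  # everything older than this gets purged
        kept.append(m)
        used += cost
    return system + list(reversed(kept))
```

The interesting decisions are exactly the ones the comment points at: what counts toward the budget, and which messages are sacred.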


That's why you want to use sub-agents that handle smaller tasks and return results to a delegating agent. That way every agent has its own very specialized context window.
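A minimal sketch of that delegation shape (`call_llm` is a placeholder for whatever chat API you use): each sub-agent gets a fresh, task-scoped message list, and only its final answer flows back up to the delegator.

```python
# Illustrative sub-agent delegation; call_llm stands in for a real API call.

def call_llm(messages):
    # Stand-in: a real implementation would hit the provider's chat endpoint.
    return f"result for: {messages[-1]['content']}"

def run_subagent(task: str) -> str:
    """Each sub-agent starts with its own small, specialized context."""
    messages = [
        {"role": "system", "content": "You do exactly one narrow task."},
        {"role": "user", "content": task},
    ]
    return call_llm(messages)  # only the final answer escapes this context

def delegate(tasks):
    """The delegating agent sees results, never the sub-agents' transcripts."""
    return {t: run_subagent(t) for t in tasks}
```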


That's one legit answer. But if you're not stuck in Claude's context model, you can do other things. One extremely stupid simple thing you can do, which is very handy when you're doing large-scale data processing (like log analysis): just don't save the bulky tool responses in your context window once the LLM has generated a real response to them.
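A sketch of that trick, with invented names: once the model has produced its real answer, collapse any earlier bulky tool results down to a short stub before the next turn.

```python
# Sketch of "don't keep bulky tool output": after the model has answered,
# truncate earlier tool results so they stop eating the context window.

def compact_tool_results(messages, keep_chars: int = 80):
    out = []
    last = len(messages) - 1
    for i, m in enumerate(messages):
        if m["role"] == "tool" and i != last and len(m["content"]) > keep_chars:
            # Replace the bulky payload with a stub; the model's own
            # answer (later in the list) already captures what mattered.
            m = {**m, "content": m["content"][:keep_chars] + "...[truncated]"}
        out.append(m)
    return out
```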

I gave my own dumb TUI agent a built-in `lobotomize` tool, which dumps a text list of everything in the context window (a short summary plus token count) and then lets it Eternal Sunshine of the Spotless Agent things out of the window. It works! The models know how to drive that tool. It'll do a series of giant-ass log queries, filling up the context window, and then you can watch as it zaps things out of the window to make space for more queries.

This is like 20 lines of code.
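A guess at what those ~20 lines might look like; every name here is hypothetical (the comment doesn't show the actual implementation). The tool has two halves: list the window's entries with token counts, and zap entries by id.

```python
# Hypothetical sketch of a `lobotomize`-style tool: the model lists context
# entries with token counts, then deletes the ones it chooses to free space.

def approx_tokens(text: str) -> int:
    return max(1, len(text) // 4)  # crude estimate, not a real tokenizer

def list_context(messages):
    """What the model sees when it calls the tool: id, summary, token count."""
    return [
        {"id": i, "summary": m["content"][:40], "tokens": approx_tokens(m["content"])}
        for i, m in enumerate(messages)
    ]

def zap(messages, ids):
    """Remove the chosen entries, leaving a tombstone so the model knows."""
    return [
        m if i not in ids else {"role": m["role"], "content": "[zapped]"}
        for i, m in enumerate(messages)
    ]
```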


Did something similar: added `summarize` and `restore` tools to maximize/minimize messages. I haven't gotten it to behave the way I want yet; hoping that some fiddling with the prompt will do it.


FYI -- I vouched for you to undead this comment. It felt like a fine comment? I don't think you are shadowbanned, but consider emailing the mods if you think you might be.


I'm not going to link my blog again, but I have a reply on this post linking to the blog post where I talk about how I built mine. Most agents fit nicely into a finite state machine or a directed acyclic graph that responds to an event loop. I do use provider SDKs to interact with models, but mostly because it saves me a lot of boilerplate. MCP clients and servers are also widely available as SDKs. The biggest thing to remember, IMO, is to keep the relationship between prompts, resources, and tools in mind: together they make up a sort of dynamic workflow engine.
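One way to read "a finite state machine that responds to an event loop" is the sketch below; the states and events are invented for illustration, not taken from the commenter's actual agent.

```python
# Illustrative agent-as-state-machine: transitions are an explicit table,
# and an event loop drives the machine until it reaches a terminal state.

PLAN, ACT, DONE = "plan", "act", "done"

def step(state, event):
    """Pure transition function: (state, event) -> next state."""
    table = {
        (PLAN, "plan_ready"): ACT,
        (ACT, "tool_ok"): PLAN,   # go plan the next action
        (ACT, "goal_met"): DONE,
    }
    return table.get((state, event), state)  # unknown events are ignored

def run(events):
    state = PLAN
    for e in events:              # the event loop
        state = step(state, e)
        if state == DONE:
            break
    return state
```

The appeal of this shape is that "what can the agent do next?" is answerable by reading one table instead of tracing prompt logic.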


Fine... say your country has several years of drought and bad harvests. What happens then? Do you trade?

Or let's say that, due to weather, your farmers cannot grow enough oranges or some other fruit, which drives up local prices. Should only the richest people in your country get to eat fruit?

Or you discover lithium deposits that your national industry cannot use. Should you let that just sit there, knowing it could make your province prosperous if traded?


You took a far too extreme interpretation, and it ended up backwards. What normal countries do is trade anyway, but with tariffs and subsidies, so that under normal circumstances local food stays competitive enough to keep farmers operating, while during a local production problem imported goods become more competitive. That buffers the population from extremes of price and availability.

Sure, you can trade, but it is a choice. Claiming that free trade is universally good is saying there is only one right choice: no barriers.


I am here to hear from folks running LLMs on the Framework Desktop (128GB). Is it usable for agentic coding?


Just started going down that route myself. For the money it performs well and runs most of the models at reasonable speeds.

1. Thermal considerations are important because of throttling for thermal protection. Apple seems best at this but $$$$. The Framework (AMD) seems a reasonable compromise (you can have almost 3 for the price of 1 Mini). Laptops will likely not perform as well. NVIDIA seems really bad at thermal/power considerations.

2. The memory model matters, and AMD's APU design is an improvement. NVIDIA GPUs were designed for graphics but were better than CPUs for AI, so they got used. Bespoke AI solutions will eventually dominate; that may or may not be NVIDIA in the future.

My primary interest is AI at the edge.


Thanks for sharing. TIL about rerankers.

Chunking strategy is a big issue. I found acceptable results by shoving large texts at Gemini Flash and having it summarize and extract chunks, instead of using whatever text splitter I tried. I use the method published by Anthropic https://www.anthropic.com/engineering/contextual-retrieval i.e. include the full-document summary along with each chunk in every embedding.

I also created a tool that enables the LLM to do vector search on its own.

I do not use LangChain or Python; I use Clojure plus the LLMs' REST APIs.
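The contextual-retrieval shape from that Anthropic post looks roughly like this (in Python for brevity, though the commenter uses Clojure); `summarize` and `embed` are stand-ins for real model and embedding calls, not actual APIs.

```python
# Sketch of contextual retrieval: each chunk is embedded together with a
# document-level summary so the chunk stays interpretable out of context.

def summarize(doc: str) -> str:
    return doc[:60]  # stand-in; a real system would ask an LLM for a summary

def embed(text: str):
    return [float(len(text))]  # stand-in for an embedding API

def contextualize_chunks(doc: str, chunks):
    summary = summarize(doc)
    return [
        # Embed summary + chunk together, but keep the raw chunk for display.
        {"text": c, "context": summary, "vector": embed(summary + "\n" + c)}
        for c in chunks
    ]
```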


I made a startup, https://tokencrush.ai/, to do just this.

I've struggled to find a target market though. Would you mind sharing what your use case is? It would really help give me some direction.


Have you measured your latency, and how sensitive are you to it?


>> Have you measured your latency, and how sensitive are you to it?

Not sensitive to latency at all. My users would rather have well researched answers than poor answers.

Also, I use batch-mode APIs for chunking; it is so much cheaper.


You mean multi-cloud strategy! You wanna know how you got here?

See, the sales team from Google flew one executive out to the NBA Finals, the Azure sales team flew another executive out to the NFL Super Bowl, and the AWS team flew yet another executive out to the Wimbledon finals. And that's how you end up with a multi-cloud strategy.


In this particular case, it was resume-oriented architecture (ROAr!) The original team really wanted to use all the hottest new tech. The management was actually rather unhappy, so the job was to pare that down to something more reliable.


Eh, businesses want to stay resilient to a single vendor going down. My least favorite interview question this past year was about multi-cloud, because IMHO it just isn't worth it: the increased complexity, the trying to match like-for-like services across different clouds that aren't always really the same, and then the ongoing costs of chaos-monkeying and testing that it all actually works, especially in the face of a partial outage like this versus something "easy" like a complete loss of network connectivity. But that is almost certainly not what CEOs want to hear (mostly who I am dealing with here, going for VPE- or CTO-level jobs).

I couldn't care less about having more vendor dinners when I know I am promising a falsehood that is extremely expensive and likely going to cost me my job or my credibility at some point.


sticker shock / looking at alternative vendors


>> The CCP literally revoked the visas of key DeepSeek engineers. That's all we need to know.

I don't follow. Why would DeepSeek engineers need visas from the CCP?


I have a branch office in the boondocks with a limited internet connection. The branch office cannot manage an RDBMS or access cloud services. They can use a SQLite app on the LAN, and we can do reconciliation at the end of the business day.
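A hedged sketch of what that end-of-day reconciliation could look like with Python's `sqlite3`; the `sales` table and its columns are invented for the example, and a real setup would also handle deletes and conflicting edits.

```python
# Sketch of end-of-day reconciliation: upsert the branch office's SQLite rows
# into the central database, newest branch values winning on conflict.
import sqlite3

def reconcile(branch: sqlite3.Connection, central: sqlite3.Connection):
    rows = branch.execute("SELECT id, amount FROM sales").fetchall()
    central.executemany(
        "INSERT INTO sales(id, amount) VALUES (?, ?) "
        "ON CONFLICT(id) DO UPDATE SET amount = excluded.amount",
        rows,
    )
    central.commit()
```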


They can also run the entire application in these scenarios on the resources of a 10-year-old phone.


https://github.com/microsoft/graphrag

This is not agentic, but it gave pretty good results when I did a PoC.


Most developers can't do much work without an IDE and Chrome + Google.

Would you say that their work has no value?


This is probably the only place I can properly say "Programmers should be brought up with vim and man pages", so I'll say it here.

Anyway, IDEs don't try to offload the thinking for you; they're more like an abacus. You still need to work in one for a while and learn the workflow before it's more efficient than a text editor plus docs.

Chrome is a trickier aspect, because the reality is that a lot of modern docs completely suck. So you rely less on official documentation and more on how others have navigated an IDE, and on whether those options work for you. I'd rather we write proper documentation than offload it to a black box that may or may not understand what it's spouting at you, though.


This is exactly my experience. We wanted to modernize a Java codebase by removing Java JNDI global variables. This is a simple though tedious task. We tried Claude Code and Gemini; both sets of results were hilarious.


LLMs are awful at tedious tasks, usually because those tasks involve massive context.

You will have much more success if you can compartmentalize and use new LLM instances as often as possible.
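One reading of "compartmentalize and use new LLM instances": drive a tedious refactor file by file, each call starting from a fresh context. `call_llm` and the prompt wording are placeholders, not any real API.

```python
# Sketch of compartmentalizing a tedious refactor: one fresh LLM context per
# file, so no run drags along the accumulated context of the others.

def call_llm(messages):
    return messages[-1]["content"].upper()  # stand-in for a real model call

def refactor_file(path: str, source: str) -> str:
    messages = [  # fresh context: only this file, no prior history
        {"role": "system", "content": "Remove JNDI globals; change nothing else."},
        {"role": "user", "content": source},
    ]
    return call_llm(messages)

def refactor_repo(files: dict) -> dict:
    """Each file is an independent, small-context job."""
    return {path: refactor_file(path, src) for path, src in files.items()}
```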

