
I'm running it on a ROG Flow Z13 (Strix Halo, 128GB) and getting 50 tok/s on the 20B model and 12 tok/s on the 120B model. I'd say it's pretty usable.

Excellent! I have a Framework Desktop with 128GB on preorder—really looking forward to getting it.

Great to see more local AI tools supporting MCP! I recently added MCP support to recurse.chat as well. When running locally (llama.cpp and Ollama), tool calling still needs to catch up with the well-known providers in terms of accuracy and parallel tool calls, but it's starting to get pretty usable.


Hey! We're building Cactus (https://github.com/cactus-compute), effectively Ollama for smartphones.

I'd love to learn more about your MCP implementation. Wanna chat?


It's a protocol that doesn't dictate how you call the tool. You can use an in-memory transport without needing to spin up a server. Your tool can just be a function, but with the flexibility of serving it to other clients.


Are there any examples of that? All the documentation I saw seemed to be about building an MCP server, with very little about connecting an existing inference infrastructure to local functions.


For TypeScript you can refer to https://github.com/modelcontextprotocol/typescript-sdk/blob/...

There isn't much documentation available right now, but you can ask a coding agent, e.g. Claude Code, to generate an example.
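If it helps, here's a minimal sketch using the SDK's in-memory transport (the "add" tool and the client/server names are made up for illustration):

    import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
    import { Client } from "@modelcontextprotocol/sdk/client/index.js";
    import { InMemoryTransport } from "@modelcontextprotocol/sdk/inMemory.js";
    import { z } from "zod";

    // A "server" whose tool is just a local function -- no process, no port.
    const server = new McpServer({ name: "local-tools", version: "1.0.0" });
    server.tool("add", { a: z.number(), b: z.number() }, async ({ a, b }) => ({
      content: [{ type: "text", text: String(a + b) }],
    }));

    // Wire client and server together over a linked in-memory transport pair.
    const [clientTransport, serverTransport] = InMemoryTransport.createLinkedPair();
    const client = new Client({ name: "local-host", version: "1.0.0" });
    await Promise.all([
      server.connect(serverTransport),
      client.connect(clientTransport),
    ]);

    // The host app can now hand these tools to its local model's tool-calling loop.
    console.log(await client.listTools());
    console.log(await client.callTool({ name: "add", arguments: { a: 2, b: 3 } }));

The same McpServer instance can later be served over stdio or HTTP to other clients without touching the tool code.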


recurse.chat + M2 Max Mac


I recently discovered ToolHive, which is pretty handy too: https://github.com/stacklok/toolhive


If you're on a Mac, give https://recurse.chat/ a try. It's as simple as downloading a model and starting to chat. Just added support for the new multimodal capabilities in llama.cpp.


Actually, this is a good way to find product ideas. I gave Grok a query to find posts, similar to this one, about what people want. It then ran multiple searches on X, including embedding search, and suggested that people want things like Tamagotchi and ICQ back.


I feel like these are all great examples of things people think they want. Making a post about it is one thing; actually buying or using the product is another. I think the majority of nostalgic people would quickly remember why they don't actually want it in their adult life.


I see this a lot in vintage computing. What we want is the feelings we had back then, the context, the growing possibilities, youth, the 90s, whatever. What we get is a semi-working physical object that we can never quite fix enough to relive those experiences. But we keep acquiring and fixing and tinkering anyway hoping this time will be different while our hearts become torn between past and present.


Yeah, this is not even faster horses. It's horses that can count, like Clever Hans.


It seems this may not be necessary, since llama.cpp already integrates Jinja via a C++ implementation (minja).



The fact that there's no alternative implementation of SQLite also seems to have played a part in preventing the standardization of WebSQL.

https://www.w3.org/TR/webdatabase/

"The specification reached an impasse: all interested implementors have used the same SQL backend (Sqlite), but we need multiple independent implementations to proceed along a standardisation path."


I was completely unaware of that! How old is that document? I should reach out.


That effort died about 14 years ago.


damn =(


an opportunity for all of us to celebrate the astonishing power of necromancy


Indeed! This sort of thing is a problem. It's the same with Internet protocols: you need at least two independent implementations to reach Standard status.

