We forked excalidraw a while ago to allow running it without Firebase as a backend. It can already be self-hosted. It needs some love, but it's a good starting point:
In Ollama, how do you set a larger context, and how do you figure out what settings to use? I've yet to find a good guide. I'm also not quite sure how to work out what those settings should be for each model.
There's context length, but then, how does that relate to input length and output length? Should I just make the numbers match? 32k is 32k? Any pointers?
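For what it's worth, here's a minimal sketch of how this is usually wired up, assuming a stock Ollama server on its default port; the model name and the 32k value are placeholders, so check the model card for the context window the model was actually trained with. num_ctx is the whole window, so input and output have to fit inside it together, with num_predict capping the output portion:

    # Sketch: raising Ollama's context window per request. Assumptions:
    # local server on the default port, placeholder model name.
    # The same setting can be baked in via a Modelfile line:
    #     PARAMETER num_ctx 32768
    import requests

    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "llama3.1",        # placeholder: any model you've pulled
            "prompt": "Summarize the following...",
            "stream": False,
            "options": {
                "num_ctx": 32768,       # total context window (input + output share it)
                "num_predict": 4096,    # cap on generated tokens within that window
            },
        },
        timeout=600,
    )
    print(resp.json()["response"])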
Ollama breaks for me: if I manually set the context higher, the next API call from the client resets it back.
And Ollama keeps unloading the model from memory every 4 minutes.
LM Studio with MLX on Mac performs perfectly, and I can keep the model in RAM indefinitely.
Ollama's keep-alive is broken, since a new REST API call resets it afterwards. I'm surprised it's this glitchy with longer-running calls and custom context lengths.
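For anyone hitting the same thing, a hedged workaround sketch based on Ollama's documented keep_alive parameter (-1 means never unload; the OLLAMA_KEEP_ALIVE environment variable sets the same default server-wide). The catch, per the bug above, is that a later request omitting keep_alive resets the timer, so every client has to send it:

    # Sketch: pinning a model in memory with Ollama's keep_alive parameter.
    # An empty prompt just loads the model without generating anything.
    # Any later call that omits keep_alive can reset this (the glitch above).
    import requests

    requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "llama3.1",   # placeholder model name
            "prompt": "",          # empty prompt = load only, no generation
            "keep_alive": -1,      # -1 = never unload; "10m", "24h" also valid
        },
    )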
> In a Discord voice chat on March 13th 2025, Michel Becker revealed that a documentary film explaining the solution would be shown in cinemas around France on May 2nd 2025. He hoped broadcasts in other countries would follow.
https://www.linkedin.com/posts/christian-kroll_bigtech-europ...
I've no clue why they don't have a blog post or anything up yet.