These look quite incredible. I work on a llama.cpp GUI wrapper, and it's quite surprising to see how well Microsoft's Phi-4 releases set the family apart as the only real competition below ~7B. It'll probably take a year for the FOSS community to implement and digest it completely (it can do multimodal! TTS! STT! Conversation!)
> it'll probably take a year for the FOSS community to implement and digest it completely
The local community seems to have converged on a few wrappers: Open WebUI (general-purpose), LM Studio (proprietary), and SillyTavern (for role-playing). Now that llama.cpp has an OpenAI-compatible server (llama-server), there are a lot more options to choose from.
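For anyone who hasn't tried it: because llama-server speaks the OpenAI chat-completions protocol, a wrapper can talk to it with plain HTTP. A minimal sketch, assuming a server started locally on the default port 8080 (the `build_chat_request` helper and the `"local"` model name are illustrative; llama-server serves whatever model it was launched with):

```python
# Minimal client for llama.cpp's llama-server via its OpenAI-compatible API.
# Assumes a server launched with something like:
#   llama-server -m model.gguf --port 8080
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # llama-server's default port


def build_chat_request(messages, temperature=0.7):
    """Build the (url, payload) pair for a chat completion call."""
    payload = {
        "model": "local",  # largely cosmetic: the server uses its loaded model
        "messages": messages,
        "temperature": temperature,
    }
    return f"{BASE_URL}/v1/chat/completions", payload


def chat(messages):
    """Send a chat request and return the assistant's reply text."""
    url, payload = build_chat_request(messages)
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


if __name__ == "__main__":
    print(chat([{"role": "user", "content": "Hello!"}]))
```

The upshot is that any client built against the OpenAI API (including most of the wrappers above) works against a local model by just swapping the base URL.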
I've noticed there really aren't many active FOSS wrappers these days - most have either been abandoned or aren't getting releases at the frequency we saw when the OpenAI API first launched. So it would be awesome if you could share your wrapper with us at some point.
I think OP means that the FOSS ecosystem hasn't digested many of Phi-4-multimodal's modalities, such as audio input (STT) and audio output (TTS); image input also isn't well supported in many FOSS projects.
AFAIK, Phi-4-multimodal doesn't support TTS, but I understand OP's point.
The recent Qwen release is an excellent example of a model provider collaborating with the local community (inference engine developers, model quantizers, and so on). It would be nice if this collaboration extended to wrapper developers as well, so that end-users can enjoy a great UX from day one of any model release.
I've been happier with LibreChat than Open WebUI, mostly because I wasn't a fan of the `pipelines` stuff in Open WebUI and its lack of MCP support (probably has changed now?). But then I don't love how LibreChat wants to push its (expensive) code runner service.
grep "As seen above, Phi-4-mini-reasoning with 3.8B parameters outperforms models of over twice its size."
re: reasoning plus, "Phi-4-reasoning-plus builds upon Phi-4-reasoning capabilities, further trained with reinforcement learning to utilize more inference-time compute, using 1.5x more tokens than Phi-4-reasoning, to deliver higher accuracy.", presumably also 14B