
API-wise, it looks very similar to the OpenAI Python SDK, but not quite the same. I was hoping I could swap out one client for the other. Can anyone confirm they're intentionally using an incompatible interface?
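For context, the kind of drop-in swap being asked about would look roughly like this with the OpenAI Python SDK. This is only a sketch: the base URL and model name are assumptions, not an endpoint Ollama actually documents here, which is exactly why the question comes up.

    # Hypothetical drop-in swap: point the existing OpenAI client at a
    # local server instead of api.openai.com. The base_url and model
    # name are illustrative assumptions, not a confirmed Ollama endpoint.
    from openai import OpenAI

    client = OpenAI(
        base_url="http://localhost:11434/v1",  # assumed local endpoint
        api_key="not-needed-locally",
    )

    response = client.chat.completions.create(
        model="llama2",  # illustrative model name
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(response.choices[0].message.content)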


There is an issue for this: [1]. I think it's more of a priority issue.

[1] https://github.com/ollama/ollama/issues/305


Same question here. Ollama is fantastic as it makes it very easy to run models locally, but if you already have a lot of code that processes OpenAI API responses (with retries, streaming, async, caching, etc.), it would be nice to be able to simply switch the API client to Ollama, without having to maintain a whole other branch of code that handles Ollama API responses. One way to do an easy switch is using the litellm library as a go-between, but it's not ideal.
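For reference, a minimal sketch of the litellm go-between mentioned above. The model name and local port are assumptions; check the litellm docs for the exact provider prefix your setup needs.

    # Sketch: litellm as a translation layer, keeping the OpenAI-style
    # messages/response shape while routing to a local Ollama model.
    # "ollama/llama2" and api_base are illustrative assumptions.
    from litellm import completion

    response = completion(
        model="ollama/llama2",
        messages=[{"role": "user", "content": "Summarize this in one line."}],
        api_base="http://localhost:11434",
    )
    print(response.choices[0].message.content)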

For an OpenAI-compatible API, my current favorite method is to spin up models using oobabooga TGW. Your OpenAI API code then works seamlessly by simply switching out the api_base to the ooba endpoint. Regarding chat formatting, even ooba's Mistral formatting has issues [1], so I am doing my own formatting in Langroid using HuggingFace's tokenizer.apply_chat_template [2].

[1] https://github.com/oobabooga/text-generation-webui/issues/53...

[2] https://github.com/langroid/langroid/blob/main/langroid/lang...

Related question: I assume ollama auto-detects and applies the right chat formatting template for a model?
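For anyone curious, building the prompt from the model's own chat template looks roughly like this with HuggingFace transformers; the model name here is just an example.

    # Sketch: let the tokenizer's bundled chat template do the formatting
    # instead of hand-rolling instruction markers. Model name is illustrative.
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")

    messages = [
        {"role": "user", "content": "What is the capital of France?"},
        {"role": "assistant", "content": "Paris."},
        {"role": "user", "content": "And of Germany?"},
    ]

    # Returns the fully formatted prompt string (e.g. with [INST] ... [/INST]
    # markers for Mistral), ready to send to whatever server hosts the model.
    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    print(prompt)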


I've built exactly this, if you want to give it a try: https://github.com/lhenault/simpleAI



