Why does the ollama engine have to change to support new models? Every time a new model comes out, ollama has to be upgraded.
Because of things like this: https://github.com/ggml-org/llama.cpp/issues/12637
Where "supporting" a model doesn't mean what you think it means for llama.cpp.
Between that and the long saga of vision models having only partial support (a CLI tool and no llama-server support, which they only fixed very recently), the fact of the matter is that ollama is now moving faster than llama.cpp and implementing what people want first.
And it will finally shut down all the people who kept copy-pasting the same criticism of ollama: "it's just a llama.cpp wrapper, why are you not using cpp instead?"
There's also some interpersonal conflict in llama.cpp that's hampering other bug fixes https://github.com/ikawrakow/ik_llama.cpp/pull/400
What the hell is going on there? It’s utterly bizarre to see devs discussing granting each other licences to work on the same code for an open source project. How on earth did they end up there?
There seems to be some bad blood between ikawrakow and ggerganov: https://github.com/ikawrakow/ik_llama.cpp/discussions/316
My guess is that there's money involved. Maybe a spat between an ex-employee and their ex-employer?
Now it’s just a wrapper around hosted APIs.
I went with my own wrapper around llama.cpp and stable-diffusion.cpp, with the option of prompting a hosted model if I don't like the local result. The local output makes a good starting point for the hosted model to improve on.
It also obfuscates any requests sent to the hosted model, because why give them insight into my use case when I just want to double-check the local AI's algorithmic choices? The ground-truth relationships that function and variable names imply are my little secret.
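For anyone curious what that obfuscation step could look like, here is a minimal sketch in Python. It is not the actual wrapper; `obfuscate` and `deobfuscate` are hypothetical names, and the regex approach is an assumption chosen for brevity. It swaps every identifier for a neutral placeholder before the snippet goes out, then maps the placeholders back in whatever text comes back.

```python
import keyword
import re

# Matches identifier-like words; number literals such as 0x1F are left alone.
IDENT = re.compile(r"\b[A-Za-z_][A-Za-z0-9_]*\b")


def obfuscate(source, keep=frozenset()):
    """Replace identifiers with placeholders v0, v1, ... in order of first use.

    Python keywords and anything listed in `keep` are left untouched.
    Caveat: this is purely lexical, so words inside string literals and
    comments are renamed too.
    """
    mapping = {}  # original name -> placeholder

    def rename(match):
        name = match.group(0)
        if keyword.iskeyword(name) or name in keep:
            return name
        if name not in mapping:
            mapping[name] = f"v{len(mapping)}"
        return mapping[name]

    return IDENT.sub(rename, source), mapping


def deobfuscate(text, mapping):
    """Map placeholders in `text` back to the original names."""
    reverse = {placeholder: name for name, placeholder in mapping.items()}
    return IDENT.sub(lambda m: reverse.get(m.group(0), m.group(0)), text)


if __name__ == "__main__":
    src = "def monthly_churn(users, cancelled):\n    return cancelled / users\n"
    hidden, names = obfuscate(src)
    print(hidden)                      # what the hosted model would see
    print(deobfuscate(hidden, names))  # names restored locally
```

The structure of the algorithm survives, so the hosted model can still comment on it, but names like `monthly_churn` never leave the machine. A real version would want a proper tokenizer for the target language rather than a regex.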