tommica 19 hours ago

Side tangent: why is Ollama frowned upon by some people? I've never really gotten any explanation beyond "you should run llama.cpp yourself".

lhl 19 hours ago

There's some discussion here: https://www.reddit.com/r/LocalLLaMA/comments/1jzocoo/finally...

Ollama appears to not properly credit llama.cpp: https://github.com/ollama/ollama/issues/3185 - this is a long-standing issue that hasn't been addressed.

This seems to have leaked into other projects where even when llama.cpp is being used directly, it's being credited to Ollama: https://github.com/ggml-org/llama.cpp/pull/12896

Ollama doesn't contribute upstream (that's fine, they're not obligated to), but it's a bit weird that one of the devs claimed to have when that's not really the case: https://www.reddit.com/r/LocalLLaMA/comments/1k4m3az/here_is... - that being said, they do maintain their own fork, so anyone could cherry-pick stuff from it if they wanted to: https://github.com/ollama/ollama/commits/main/llama/llama.cp...

tommica 12 hours ago

Thanks for the good explanation!

diggan 13 hours ago

Besides the "culture"/licensing/FOSS issues already mentioned, I just wanted to be able to reuse model weights across various applications, but Ollama decided to ship their own way of storing things on disk, plus their own registry. I'm guessing it's because they eventually want to monetize this somehow, maybe "private" weights hosted on their registry or something. I don't get why they thought splitting files into "blobs" made sense for LLM weights; it seems they wanted to reduce duplication (à la Docker), but in practice it just makes things more complicated for no real gain.

The end result for users like me, though, is having to duplicate 30GB+ files just because I want to use the weights both in Ollama and in the rest of the ecosystem. So instead I use everything else, which largely works the same way, and skip Ollama.
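
For reference, Ollama's on-disk layout looks roughly like this (paths from memory, the digests are placeholders):

  ~/.ollama/models/
    manifests/registry.ollama.ai/library/<model>/<tag>   <- JSON manifest listing the layer digests
    blobs/sha256-<digest>                                 <- the GGUF weights, named by content hash
    blobs/sha256-<digest>                                 <- smaller layers (template, params, license)

So other tools that expect a plain .gguf file end up needing their own copy of the weights.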

tommica 12 hours ago

That is an interesting perspective, I did not know about that at all!

speedgoose 18 hours ago

To me, Ollama is a bit like the Docker of LLMs. The user experience is inspired by Docker, and the Modelfile syntax is inspired by the Dockerfile syntax. [0]
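
A minimal Modelfile looks roughly like this (a sketch from memory; see [0] for the exact syntax):

  FROM llama3
  PARAMETER temperature 0.7
  SYSTEM """You are a concise assistant."""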

In the early days of Docker, we had the Docker vs LXC debate. At the time, Docker was mostly a wrapper over LXC, and people dismissed the great user-experience improvements Docker brought.

I agree, however, that the long-standing lack of acknowledgement of llama.cpp has been problematic. They do acknowledge the project now.

[0]: https://github.com/ollama/ollama/blob/main/docs/modelfile.md

wirybeige 10 hours ago

They refuse to work with the community. There's also the open question of how they are going to monetize, given that they are a VC-backed company.

Why shouldn't I go with llama.cpp, lmstudio, or ramalama (containers/RH)? At least I know what I'm getting with each one.

Ramalama actually contributes quite a bit back to llama.cpp/whisper.cpp (and probably more projects), while delivering a solution that works better for me.

https://github.com/ollama/ollama/pull/9650
https://github.com/ollama/ollama/pull/5059

octocop 18 hours ago

For me it's because ollama is just a front-end for llama.cpp, but the ollama folks rarely acknowledge that.

gavmor 19 hours ago

Here's a recent thread on Ollama hate from r/localLLaMa: https://www.reddit.com/r/LocalLLaMA/comments/1kg20mu/so_why_...

kergonath 9 hours ago

r/localLLaMa is very useful, but also very susceptible to groupthink and more or less astroturfed hype trains and mood swings. This drama needs to be taken in context; there is a lot of emotion and not much reason.

bearjaws 11 hours ago

Anyone who has been around for 10 years can smell the Embrace, Extend, Extinguish model from 100 miles away.

They are plainly going to capture the market and then switch to some "enterprise license" that lets them charge money, on the back of other people's work.

buyucu 17 hours ago

I abandoned Ollama because Ollama does not support Vulkan: https://news.ycombinator.com/item?id=42886680

You have to support Vulkan if you care about consumer hardware. Ollama devs clearly don't.

nicman23 19 hours ago

llama.cpp was just faster and had more features, that's all.

cwillu 19 hours ago

llama.cpp is the thing doing all the heavy lifting; ollama is just a library wrapper.

It'd be like HandBrake pretending they implemented all the video-processing work themselves, when it depends on ffmpeg's libraries for all of that.

diggan 13 hours ago

> ollama is just a library wrapper.

Was.

This submission is literally about them moving away from being just a wrapper around llama.cpp :)

buyucu 11 hours ago

No, they are not. The submission uses ggml, which is llama.cpp.

diggan 10 hours ago

I think you misunderstand how these pieces fit together. llama.cpp is a library that ships with a CLI and some other tooling, ggml is a library, and Ollama has "runners" (like an "execution engine"). Previously, Ollama used llama.cpp (which uses ggml) as its only runner. Eventually, Ollama made their own runner (which also uses ggml) for new models (starting with gemma3, maybe?), while still using llama.cpp for the rest (last time I checked, at least).

ggml != llama.cpp, but llama.cpp and Ollama are both using ggml as a library.
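
Roughly, the way I understand the layering (simplified):

  ggml        <- tensor library, the shared foundation
  llama.cpp   <- built on ggml, ships the CLI/server
  Ollama      <- old runner wraps llama.cpp, newer runner uses ggml directly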

cwillu 8 hours ago

“The llama.cpp project is the main playground for developing new features for the ggml library” --https://github.com/ggml-org/llama.cpp

“Some of the development is currently happening in the llama.cpp and whisper.cpp repos” --https://github.com/ggml-org/ggml

diggan 3 hours ago

Yeah, those both make sense. ggml was split out from llama.cpp once they realized it could be useful elsewhere, so while llama.cpp is the "main playground", ggml is still used by other projects (llama.cpp included). That doesn't suddenly mean llama.cpp is the same as ggml; not sure why you'd believe that.