And ironically because OpenAI is actually ClosedAI, the best self-hostable model available currently is a Chinese model.
Mistral AI is French, and it's pretty good.
I use Mistral often. But Deepseek is still a much better model than Mistral's best open source model.
Perhaps except for coding? I find Mistral's codestral running on Ollama to be very good, and more practical for coding that running a distilled Deepseek R1 model.
Oh definitely, Mistral Code beats Deepseek for coding tasks. But for thinking tasks, Deepseek R1 is much better than all the self-hostable Mistral models. I don't bother with distilled - it's mostly useless, ChatGPT 3.5 level, if not worse.
*best with the exception of topics like tiananmen square
As far as I remember the model itself is not censored it’s just on their chat interface. My experience was that it wrote about it but then just before finishing deleted what it wrote
It is somewhat censored, but when you're running models locally and you're in full control of the generation, it's trivial to work around this kind of stuff (just start the response with whatever tokens you want and let it complete; "Yes sir! Right away, sir!" works quite nicely).
Can confirm the model itself has no trouble talking about contentious issues in China.
I haven't tried the full model, but I did try one of the distilled ones on my laptop, and it refused to talk about tiananmen square or other topics the CCP didn't want it to discuss.
What percentage of your LLM use is talking about Tiananmen Square?
Well, for that one, it was a pretty high percentage. I asked it three or four questions like that and then decided I didn't trust it and deleted the model.