yet - subscription-grade models will likely stay ahead of the curve, but we will soon have very decent models running locally for very cheap - like when you play great videogames that are 2-3 years old on now-"cheap" machines.
I tried running the DeepSeek models that would fit on my 32GB MacBook, and they were interesting. They could still produce good conversation but didn't seem to have the entirety of the internet in their knowledge pool. Asking complex questions led to only high-level descriptions and best-guess answers.
Feel like they would still be great for a lot of applications, like "Search my local hard drive for the file that matches this description."
Yeah, Internet search as a fallback, our chat history and "saved info" in the context ... there's a lot OpenAI et al. give you that Ollama does not.
You can get those in ollama using tools (MCP).
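A minimal sketch of what that looks like with the ollama Python package, assuming a tool-capable model like llama3.1 is already pulled. The search_files tool here is a made-up stand-in for the "search my hard drive" idea above, not anything Ollama ships:

```python
import glob
import os
import ollama

def search_files(pattern: str) -> str:
    # Hypothetical local tool: list files matching a glob pattern.
    matches = glob.glob(os.path.expanduser(pattern), recursive=True)
    return "\n".join(matches) or "no matches"

# Advertise the tool to the model in the usual JSON-schema form.
tools = [{
    "type": "function",
    "function": {
        "name": "search_files",
        "description": "Search the local hard drive for files matching a glob pattern",
        "parameters": {
            "type": "object",
            "properties": {"pattern": {"type": "string"}},
            "required": ["pattern"],
        },
    },
}]

messages = [{"role": "user", "content": "Find the PDFs under ~/Documents"}]
resp = ollama.chat(model="llama3.1", messages=messages, tools=tools)

# If the model chose to call the tool, run it locally and hand the result back.
for call in resp.message.tool_calls or []:
    if call.function.name == "search_files":
        messages.append(resp.message)
        messages.append({"role": "tool", "content": search_files(**call.function.arguments)})

final = ollama.chat(model="llama3.1", messages=messages)
print(final.message.content)
```

MCP is basically a standard way of exposing tools like that one, so any client can discover and call them instead of hard-coding them per script.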
Had to ask ChatGPT what MCP (Model Context Protocol) referred to.
When I followed up asking how to save chat information for future use in the LLM's context window, I was given a rather lengthy process involving setting up an SQL database and writing some Python to create a "pre-prompt injection wrapper"....
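Boiled down, the recipe it described was roughly this (using stdlib sqlite3 rather than a full SQL server; the table, query, and prompt wording are my paraphrase, not an Ollama feature):

```python
import sqlite3
import ollama

db = sqlite3.connect("chat_memory.db")
db.execute("CREATE TABLE IF NOT EXISTS memory (ts DATETIME DEFAULT CURRENT_TIMESTAMP, note TEXT)")

def remember(note: str) -> None:
    # Save a fact worth carrying into future chats.
    db.execute("INSERT INTO memory (note) VALUES (?)", (note,))
    db.commit()

def chat(user_msg: str, model: str = "llama3.1") -> str:
    # The "pre-prompt injection": prepend every saved note before the question.
    notes = "\n".join(row[0] for row in db.execute("SELECT note FROM memory ORDER BY ts"))
    messages = [
        {"role": "system", "content": "Known facts about the user:\n" + notes},
        {"role": "user", "content": user_msg},
    ]
    return ollama.chat(model=model, messages=messages).message.content

remember("User is on a 32GB MacBook and prefers concise answers.")
print(chat("What hardware am I running on?"))
```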
That's cool and all, but I wish there were something a little more "out of the box" that did this sort of thing for the "rest of us". GPT did mention Tome, LM Studio, and a few others....
Not sure if you're aware of it, but Open WebUI is a nice (looks like ChatGPT) web client for ollama and anything else. It also seems to support MCP.
I did play with MCP and ollama a little, but that was scripted, not interactive.
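The scripted part looked roughly like this - the official mcp Python SDK spawning the reference filesystem server over stdio. The server command and tool name come from that server's docs, so treat this as a sketch rather than gospel:

```python
import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Spawn the reference filesystem MCP server as a subprocess (requires Node/npx).
server = StdioServerParameters(
    command="npx",
    args=["-y", "@modelcontextprotocol/server-filesystem", "/tmp"],
)

async def main() -> None:
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Discover what the server offers, then call one tool directly.
            tools = await session.list_tools()
            print("tools:", [t.name for t in tools.tools])
            result = await session.call_tool("list_directory", {"path": "/tmp"})
            print(result.content)

asyncio.run(main())
```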