> Many, many people suggested that there must be some special case in gpt-3.5-turbo-instruct that recognizes chess notation and calls out to an external chess engine.
Not that I think there's anything inherently unreasonable about an LLM understanding chess, but I think the author missed a variant hypothesis here:
What if that specific model, when it recognizes chess notation, is trained to silently "tag out" for another, more specialized LLM, that is specifically trained on a majority-chess dataset? (Or — perhaps even more likely — the model is trained to recognize the need to activate a chess-playing LoRA adapter?)
It would still be an LLM, so things like "changing how you prompt it changes how it plays" would still make sense. Yet it would be one that has spent a lot more time modelling chess than other things, and never ran into anything that distracted it enough to catastrophically forget how chess works (i.e. to reallocate some of the latent-space vocabulary on certain layers from modelling chess, to things that matter more to the training function.)
And I could certainly see "playing chess" as a good proving ground for testing the ability of OpenAI's backend to recognize the need to "loop in" a LoRA in the inference of a response. It's something LLM base models suck at; but it's also something you intuitively could train an LLM to do (to at least a proficient-ish level, as seen here) if you had a model focus on just learning that.
Thus, "ability of our [framework-mediated] model to play chess" is easy to keep an eye on, long-term, as a proxy metric for "how well our LoRA-activation system is working", without needing to worry that your next generation of base models might suddenly invalidate the metric by getting good at playing chess without any "help." (At least not any time soon.)
> but I think the author missed a variant hypothesis here:
> What if that specific model, when it recognizes chess notation, is trained to silently "tag out" for another, more specialized LLM, that is specifically trained on a majority-chess dataset? (Or — perhaps even more likely — the model is trained to recognize the need to activate a chess-playing LoRA adapter?)
Pretty sure your variant hypothesis is sufficiently covered by the author's writing.
So strange that people are so attached to conspiracy theories in this instance. Why would OpenAI or anyone go through all the trouble? The proposals outlined in the article make far more sense and track well with established research (namely that applying RLHF to a "text-only" model tends to wreak havoc on said model).
I don't disagree that the variant hypothesis is wrong; I just disagree that it was one covered by their writing.
The author was talking about "calling out to a chess engine" — and then explained why that'd be fragile and why everyone in the company would know if they did something like that.
The variant hypothesis is just that gpt-3.5-turbo was a testbed for automatic domain-specific LoRA adaptation, which wouldn't be fragile and which could be something that everyone in the company could know they did, without having any knowledge of one of their "test LoRAs" for that R&D project happening to involve a chess training data corpus.
The actual argument against the variant hypothesis is something like "why wouldn't we see automatic LoRA adaptation support in newer GPTs, then?" — but that isn't an argument the author makes.