> but I think the author missed a variant hypothesis here:
> What if that specific model, when it recognizes chess notation, is trained to silently "tag out" for another, more specialized LLM, that is specifically trained on a majority-chess dataset? (Or — perhaps even more likely — the model is trained to recognize the need to activate a chess-playing LoRA adapter?)
Pretty sure your variant hypothesis is sufficiently covered by the author's writing.
So strange that people are so attached to conspiracy theories in this instance. Why would OpenAI, or anyone, go to all that trouble? The proposals outlined in the article make far more sense and track well with established research (namely that applying RLHF to a "text-only" model tends to wreak havoc on said model).
I don't disagree that the variant hypothesis is probably wrong; I just disagree that it was covered by the author's writing.
The author was talking about "calling out to a chess engine" — and then explained why that'd be fragile and why everyone in the company would know if they did something like that.
The variant hypothesis is just that gpt-3.5-turbo was a testbed for automatic domain-specific LoRA adaptation. That wouldn't be fragile, and it's something everyone in the company could know about without anyone realizing that one of the "test LoRAs" for that R&D project happened to be trained on a chess corpus.
The actual argument against the variant hypothesis is something like "why wouldn't we see automatic LoRA adaptation support in newer GPTs, then?" — but that isn't an argument the author makes.
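To make the mechanism concrete, here's a minimal sketch of what automatic adapter routing could look like. Everything in it is hypothetical: the SAN-detection regex, the adapter registry, and the `activate_adapter` stub are illustrations of the idea, not anything OpenAI has described.

```python
import re

# Hypothetical registry mapping a detected domain to a LoRA adapter name.
# Purely illustrative; nothing here reflects any real serving stack.
ADAPTER_REGISTRY = {
    "chess": "lora-chess-v1",
    "default": None,  # no adapter: serve the base model as-is
}

# Rough pattern for SAN chess moves ("e4", "Nf3", "O-O", "Qxe7", ...).
SAN_MOVE = re.compile(
    r"\b(?:O-O(?:-O)?|[KQRBN]?[a-h]?[1-8]?x?[a-h][1-8](?:=[QRBN])?)\b"
)

def detect_domain(prompt: str) -> str:
    """Classify the prompt; several SAN-looking tokens suggest a chess game."""
    if len(SAN_MOVE.findall(prompt)) >= 4:
        return "chess"
    return "default"

def activate_adapter(adapter_name):
    """Stub standing in for whatever would swap LoRA weights into the serving model."""
    if adapter_name is None:
        print("serving with the base model")
    else:
        print(f"serving with LoRA adapter: {adapter_name}")

def route(prompt: str) -> None:
    activate_adapter(ADAPTER_REGISTRY[detect_domain(prompt)])

route("1. e4 e5 2. Nf3 Nc6 3. Bb5 a6")    # -> serving with LoRA adapter: lora-chess-v1
route("Summarize this paragraph for me")  # -> serving with the base model
```

The point of the sketch is just that this kind of routing wouldn't be "fragile" the way shelling out to a chess engine would be: the base model still generates every token, and a missed detection only means you get the un-adapted model.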