> I invite anyone making that claim to find me an "advanced amateur" (as the article says of the LLM's level) chess player who ever makes an illegal move. Anyone familiar with chess can confirm that it doesn't really happen.
This is somewhat imprecise (or inaccurate).
A quick search on YouTube for "GM illegal moves" indicates that GMs have made illegal moves often enough for there to be compilations.
e.g. https://www.youtube.com/watch?v=m5WVJu154F0 -- The Vidit vs Hikaru one is perhaps the most striking, where Vidit uses his king to attack Hikaru's king.
A bunch of these are just improper procedure: several players hit the clock before choosing a promotion piece, and one touches a piece that cannot be moved. Even the ones that aren't procedural look like rational chess moves; the player just fails to notice a detail of the board state (with the possible exception of Vidit's very funny king attack, which might actually have been clock manipulation to give him more time to think with 0:01 on the clock).
Whereas the LLM makes "moves" that clearly indicate no ability to play chess: moving pieces to squares well outside their legal moveset, moving pieces that aren't on the board, etc.
Can a blind man sculpt?
What if he makes mistakes that a seeing person would never make?
Does that mean that the blind man is not capable of sculpting at all?
> Whereas the LLM makes "moves" that clearly indicate no ability to play chess: moving pieces to squares well outside their legal moveset, moving pieces that aren't on the board, etc.
Do you have any evidence of that? TFA doesn't talk about the nature of these errors.
Yeah like several hundred "Chess IM/GMs react to ChatGPT playing chess" videos on youtube.
Very strange, I cannot spot any specifically saying that ChatGPT cheated or played an illegal move. Can you help?
https://www.youtube.com/watch?v=iWhlrkfJrCQ He has quite a few of these.
> Yeah like several hundred "Chess IM/GMs react to ChatGPT playing chess" videos on youtube.
If I were to take that sentence literally, I would ask for at least 199 other examples, but I imagine that it was just a figure of speech. Nevertheless, if that's only one player complaining (even several times), can we really conclude that ChatGPT cannot play? Is that enough evidence, or is there something else at work?
I suppose indeed one could, if one expected an LLM to be ready to play out of the box, and that would be a fair criticism.
I really wish I hadn't replied to you.
I'm sorry if you feel that way.
I am in no way trying to judge you; rather, I'm trying to get closer to the truth in that matter, and your input is valuable, as it points out a discrepancy wrt TFA, but it should also be taken with caution, since it reports the results of only one chess player (right?). Furthermore, both in the case of TFA and this youtuber, we don't have full access to their whole experiments, so we can't reproduce the results, nor can we try to understand why there is a difference.
I might very well be mistaken though, and I am open to criticism and corrections, of course.
But clearly the author got his GPT to play orders of magnitude better than in those videos.
"Most striking" in the sense of "most obviously not ever even remotely legal," yeah.
But the most interesting and thought-provoking one in there is [1] Carlsen v Inarkiev (2017). Carlsen puts Inarkiev in check. Inarkiev, instead of making a legal move to escape check, does something else. Carlsen then replies to that move. Inarkiev challenges: Carlsen's move was illegal, because the only legal "move" at that point in the game was to flag down an arbiter and claim victory, which Carlsen didn't!
[1] - https://www.youtube.com/watch?v=m5WVJu154F0&t=7m52s
The tournament rules at the time, apparently, fully covered the situation where the game state is legal but the move is illegal. They didn't cover the situation where the game state was actually illegal to begin with. I'm not a chess person, but it sounds like the tournament rules may have been amended after this incident to clarify what should happen in this kind of situation. (And Carlsen was still declared the winner of this game, after all.)
LLM-wise, you could spin this to say that the "rational grandmaster" is as fictional as the "rational consumer": Carlsen, from an actually invalid game state, played "a move that may or may not be illegal just because it sounds kinda “chessy”," as zoky commented below that an LLM would have done. He responded to the gestalt (king in check, move the king) rather than to the details (actually this board position is impossible, I should enter a special case).
OTOH, the real explanation could be that Carlsen was just looking ahead: surely he knew that after his last move, Inarkiev's only legal moves were harmless to him (or fatalistically bad for him? Rxb7 seems like Inarkiev's correct reply, doesn't it? Again I'm not a chess person) and so he could focus elsewhere on the board. He merely happened not to double-check that Inarkiev had actually played one of the legal continuations he'd already enumerated in his head. But in a game played by the rules, he shouldn't have to double-check that — it is already guaranteed by the rules!
Anyway, that's why Carlsen v Inarkiev struck me as the most thought-provoking illegal move, from a computer programmer's perspective.
It’s exceedingly rare, though. There’s a big difference between accidentally failing to notice that a move is illegal in a complicated situation, and playing a move that may or may not be illegal just because it sounds kinda “chessy”, which is pretty much what LLMs do.
Yes, but LLM illegal moves are often not chessy at all. A chessy illegal move would be, for instance, trying to move a rook without noticing that it's pinned between your king and an attacking bishop. LLMs will often happily play Ba4 when there's no bishop on any square from which it could reach a4, or even no bishop at all. That's not chessy, that's just weird.
I have to admit it's been a while since I played ChatGPT, so maybe it has improved.
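To make the "chessy vs. just weird" distinction concrete, here is a rough sketch of how one might sort a claimed bishop move like Ba4 into buckets. Everything in it is made up for illustration: the board is a plain dict from square names to piece letters (uppercase for White), and `classify_bishop_move` is a hypothetical helper, not anything from TFA or a real chess library. It deliberately ignores pins and checks, so "pseudo-legal" only means the geometry works.

```python
FILES = "abcdefgh"


def to_xy(square):
    """Convert an algebraic square such as 'a4' to zero-based (file, rank)."""
    return FILES.index(square[0]), int(square[1]) - 1


def classify_bishop_move(board, target, bishop="B"):
    """Return a coarse label for moving a `bishop` to `target`.

    `board` maps squares ('d1') to piece letters ('B', 'k', ...).
    Pins and checks are NOT examined (illustrative simplification).
    """
    bishops = [sq for sq, piece in board.items() if piece == bishop]
    if not bishops:
        return "no bishop on the board"

    tx, ty = to_xy(target)
    # Keep only bishops that share a diagonal with the target square.
    candidates = []
    for sq in bishops:
        bx, by = to_xy(sq)
        if sq != target and abs(bx - tx) == abs(by - ty):
            candidates.append((bx, by))
    if not candidates:
        return "no bishop on a diagonal to the target"

    # A piece of the same color on the target square rules the move out.
    occupant = board.get(target)
    if occupant is not None and occupant.isupper() == bishop.isupper():
        return "target occupied by own piece"

    # Walk each diagonal and look for a piece standing in the way.
    for bx, by in candidates:
        step_x = 1 if tx > bx else -1
        step_y = 1 if ty > by else -1
        x, y = bx + step_x, by + step_y
        blocked = False
        while (x, y) != (tx, ty):
            if FILES[x] + str(y + 1) in board:
                blocked = True
                break
            x, y = x + step_x, y + step_y
        if not blocked:
            return "pseudo-legal (pins and checks not examined)"
    return "path blocked"
```

A human-style illegal move (moving a pinned piece) would land in the last, "pseudo-legal" bucket and only fail a later check test. The LLM-style Ba4 I'm describing fails in the first or second bucket, before any real chess reasoning is needed.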
The one where Caruana improperly presses his clock and then claims he did not so as not to lose, and the judges believe him, is frustrating to watch.