Item 42207408

unoti • 7 days ago

I came here to basically say the same thing. The improvements the OP saw by asking it to repeat all the moves so far gives the LLM more time and space to think. I have this hypothesis giving it more time and space to think in other ways could improve performance even more, something like showing the current board position and asking it to perform an analysis of the position, list key challenges and strengths, asking it for a list of strategies possible from here, then asking it to select a strategy amongst the listed strategies, then asking it for its move. In general, asking it to really think rather than blurt out a move. The examples would be key here.

These ideas were proven to work very well in the ReAct paper (and by extension, the CoT Chain of Thought paper). Could also extend this by asking it to do this N times and stop when we get the same answer a majority of times (this is an idea stolen from the CoT-SC paper, chain of through self-consistency).

viraptor • 7 days ago

It would be awesome if the author released a framework to play with this. I'd like to test things out, but I don't want to spend time redoing all his work from scratch.

1 reply

fragmede • 7 days ago

Just have ChatGPT write the framework

1 reply

viraptor • 4 days ago

If it takes so little time and is so trivial, you're welcome to send me a link to your generated solution.