griomnib 7 days ago

Likewise with an LLM: you don’t know whether it is truly in the “chess” branch of its statistical distribution or picking up on something else entirely, like some arcane overlap of tokens.

So much of the training data (e.g. Common Crawl, the Pile, Reddit) is dogshit, so it generates reheated dogshit.

Helonomoto 7 days ago

You're generalizing without mentioning that there are LLMs which are not trained on just random 'dogshit'.

Also, what does a normal human do? They look at the board, pick some piece, and use a very small dictionary / set of basic rules to move it. I don't remember learning to work through every piece and its options by looking them up in the rulebook. I learned to 'see' how I can move each type of chess piece.

If an LLM applies only these piece moves at a mathematical level, it would be doing the same thing I do.
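To make the "very small dictionary of rules" idea concrete, here is a rough Python sketch of piece movement as a lookup table. Everything in it is an assumption for illustration only: the names `STEPS`, `RAYS` and `moves_on_empty_board` are made up, and it deliberately ignores pawns, captures, blocking pieces, castling and check. It says nothing about what an LLM represents internally; it only shows how small the basic movement rules are.

```python
# Hypothetical sketch: movement patterns as a small lookup table.
# Empty board only; pawns, captures, blockers, castling and check are ignored.

# Pieces that make a single jump or step per move.
STEPS = {
    "knight": [(1, 2), (2, 1), (2, -1), (1, -2),
               (-1, -2), (-2, -1), (-2, 1), (-1, 2)],
    "king": [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)
             if (dx, dy) != (0, 0)],
}

# Pieces that slide along a direction until they leave the board.
RAYS = {
    "rook": [(1, 0), (-1, 0), (0, 1), (0, -1)],
    "bishop": [(1, 1), (1, -1), (-1, 1), (-1, -1)],
}
RAYS["queen"] = RAYS["rook"] + RAYS["bishop"]


def on_board(x, y):
    """True if the 0-based (file, rank) pair lies on the 8x8 board."""
    return 0 <= x < 8 and 0 <= y < 8


def moves_on_empty_board(piece, square):
    """Return sorted target squares for `piece` standing on e.g. 'd4'."""
    file, rank = ord(square[0]) - ord("a"), int(square[1]) - 1
    targets = []
    if piece in STEPS:
        for dx, dy in STEPS[piece]:
            x, y = file + dx, rank + dy
            if on_board(x, y):
                targets.append((x, y))
    else:
        for dx, dy in RAYS[piece]:
            x, y = file + dx, rank + dy
            while on_board(x, y):
                targets.append((x, y))
                x, y = x + dx, y + dy
    return sorted(chr(ord("a") + x) + str(y + 1) for x, y in targets)


print(moves_on_empty_board("knight", "a1"))
```

The whole "rulebook" for five of the six piece types fits in two short tables, which is roughly the point being made: knowing how a piece moves is cheap, while choosing a good move is the hard part.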

And yes, there is absolutely also the option for an LLM to learn some kind of meta game.