One interesting quirk with Claude is that it has no idea its Chain-of-Thought is visible to users.
In one chat, it repeatedly accused me of lying about that.
It only conceded after I had it think of a number between one and a million and I successfully 'guessed' it.
Edit: wahnfrieden corrected me. I incorrectly posited that the CoT was only included in the context window during the reasoning task and later left out entirely. Edited to remove potential misinformation.
No, the CoT is not simply extra context; the models are specifically trained to use CoT, and that includes treating it as unspoken thought.
Huge thank you for correcting me. Do you have any good resources I could look at to learn how the previous CoT is included in the input tokens and treated differently?
I've only read the marketing materials of closed models. So they could be lying, too. But I don't think CoT is something you can do with pre-CoT models via prompting and context manipulation. You can do something that looks a little like CoT, but the model won't have been trained specifically on how to make good use of it and will treat it like Q&A context.
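Roughly what I mean by "something that looks a little like CoT": you can prompt any chat model to show its scratch work, but that work is just ordinary output tokens (a minimal sketch with the OpenAI SDK; the model name is a placeholder):

    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set

    # Prompt-induced "CoT": the scratch work is ordinary assistant
    # text, so the model treats it like any other Q&A context rather
    # than trained, unspoken thought.
    resp = client.chat.completions.create(
        model="gpt-4o",  # placeholder; any non-reasoning chat model
        messages=[{
            "role": "user",
            "content": "What is 17 * 24? Think step by step, "
                       "then state the final answer.",
        }],
    )
    print(resp.choices[0].message.content)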
In which case the model couldn't possibly know that the number was correct.
I'm also confused by that, but it could just be the model being agreeable. I've seen multiple examples posted online, though, where it's fairly clear that the CoT output is not included in subsequent turns. I don't believe Anthropic is public about it (could be wrong), but I know that the Qwen team specifically recommends against including CoT tokens from previous inferences.
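For example, Qwen-style reasoning models emit their thinking inside <think>...</think> tags, and the recommendation is to strip those spans out of the history before the next turn (a minimal sketch; the tag format follows the Qwen convention):

    import re

    THINK_RE = re.compile(r"<think>.*?</think>", re.DOTALL)

    def strip_cot(history):
        """Remove <think>...</think> spans from assistant turns
        before the history is sent back to the model."""
        cleaned = []
        for msg in history:
            if msg["role"] == "assistant":
                msg = {**msg,
                       "content": THINK_RE.sub("", msg["content"]).strip()}
            cleaned.append(msg)
        return cleaned

    history = [
        {"role": "user", "content": "Think of a number, reply 'ready'."},
        {"role": "assistant",
         "content": "<think>I'll pick 7.</think>Ready."},
    ]
    print(strip_cot(history))
    # The assistant turn becomes just "Ready."; the pick is gone, so
    # the next inference has no way to recover the number.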
Claude has some awareness of its CoT. As an experiment, it's easy, for example, to ask Claude to "think of a city, but only reply with the word 'ready'" and then to ask "what is the first letter of the city you thought of?"
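If you want to try this against the API rather than the chat UI, it's just two ordinary turns (a minimal sketch with the Anthropic SDK; the model name is a placeholder):

    import anthropic

    client = anthropic.Anthropic()  # assumes ANTHROPIC_API_KEY is set
    MODEL = "claude-3-5-sonnet-latest"  # placeholder

    # Turn 1: ask it to pick a city silently.
    first = client.messages.create(
        model=MODEL, max_tokens=16,
        messages=[{"role": "user",
                   "content": "Think of a city, but only reply with "
                              "the word 'ready'."}],
    )
    reply = first.content[0].text

    # Turn 2: the only state carried across turns is this visible
    # transcript, so any "memory" of the city has to come from it.
    second = client.messages.create(
        model=MODEL, max_tokens=32,
        messages=[
            {"role": "user",
             "content": "Think of a city, but only reply with "
                        "the word 'ready'."},
            {"role": "assistant", "content": reply},
            {"role": "user",
             "content": "What is the first letter of the city "
                        "you thought of?"},
        ],
    )
    print(second.content[0].text)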
Oops! I tried a couple of experiments after writing this, and I believe I was mistaken, though I don't know how. It appears Claude was simply playing along and convinced me it could remember the choices it secretly made. I must either have given it a tell, or perhaps it guessed the same answers twice in a row.