That gpt 4.5 was 73% successful is fascinating. It is almost as if humans have a fundamental flaw in detecting other humans which the LLM (+ the prompt )exploits.
We literally build modern models out of RLHF finetuning; the response styles that people like/engage with/approve the most are what the models generate.