No. The Turing test is that they can't be picked out in conversation.
Yes.
50% means they are indistinguishable. Any deviation from 50% means the channel carries information about whether the subject is a human or an LLM. 0% is perfect correlation (humans always correctly identify humans); 100% is perfect inverse correlation (humans always think the machine is the human).
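To make the "deviation from 50% carries information" point concrete, here's a rough sketch. It models the judge as a binary symmetric channel and assumes each verdict is an independent trial; the function names are made up for illustration:

```python
from math import log2

def binary_entropy(p: float) -> float:
    """Entropy (in bits) of a coin with bias p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def identity_information(p_correct: float) -> float:
    """Bits of information a judge's verdict carries about whether
    the subject is the human or the LLM, treating the judge as a
    binary symmetric channel with per-trial accuracy p_correct.
    0.5 -> 0 bits (indistinguishable); 0.0 or 1.0 -> a full bit."""
    return 1.0 - binary_entropy(p_correct)

for p in (0.5, 0.6, 0.9, 1.0, 0.0):
    print(f"accuracy {p:.1f}: {identity_information(p):.3f} bits")
```

Both 0% and 100% give a full bit per verdict; only 50% gives zero, which is why either extreme is just as much a failure of indistinguishability.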
You can identify LLMs by asking the judge to pick the most human participant, then inverting their answer: the real human is the least human-like participant.
Maybe the point they're getting at is that LLMs are kind of too polished to read as human any longer. A bit like how software drum machines added a sliding "Humanize" parameter so that the drumming was a bit "off".
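For what it's worth, a minimal sketch of what such a "Humanize" knob does under the hood (the names and parameters here are hypothetical; real drum software typically jitters velocity as well as timing):

```python
import random

def humanize(hit_times_ms, amount_ms=15.0, seed=None):
    """Toy version of a drum machine's "Humanize" knob: nudge each
    perfectly quantized hit by a small random offset so the groove
    sounds slightly "off", the way a human player's timing is.
    amount_ms is the maximum deviation in either direction."""
    rng = random.Random(seed)
    return [t + rng.uniform(-amount_ms, amount_ms) for t in hit_times_ms]

# Perfectly quantized 16th notes at 120 BPM (125 ms apart)
grid = [i * 125.0 for i in range(8)]
print(humanize(grid, amount_ms=15.0, seed=42))
```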
ChatGPT needs to confuse "loose" and "lose" in its output, and mistake the U.S. state of Georgia for the country.
... and they are picked out in conversation, as the conversant who is supposedly "less human". TBH, that suggests a flaw either in the test or in people's presumptions about how humans behave.
> that suggests some flaw either in the test or in people's presumptions regarding how humans behave
Both. The Turing test is silly because it tests people's prejudices and presuppositions about machines rather than measuring the machines themselves objectively.
Also, people's presumptions will change quickly as we get used to LLM output, and we'll start detecting LLM speech with greater precision.