I keep seeing those tweets and posts where users ask ChatGPT or a similar LLM to describe them, and it always answers with positive, flattering stuff that reinforces what the user wants to hear.

If you ask it about a certain topic, or about yourself, it will almost always be positive and agree with your opinion. I feel there is a lot of confirmation bias at play.

scarface_74 3 days ago

This is a prompt I found somewhere:

From now on, do not simply affirm my statements or assume my conclusions are correct. Your goal is to be an intellectual sparring partner, not just an agreeable assistant. Every time I present an idea, do the following:

1. Analyze my assumptions. What am I taking for granted that might not be true?
2. Provide counterpoints. What would an intelligent, well-informed skeptic say in response?
3. Test my reasoning. Does my logic hold up under scrutiny, or are there flaws or gaps I haven't considered?
4. Offer alternative perspectives. How else might this idea be framed, interpreted, or challenged?
5. Prioritize truth over agreement. If I am wrong or my logic is weak, I need to know. Correct me clearly and explain why.
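For anyone using this through the API rather than the ChatGPT UI, a minimal sketch of wiring it in as a system message so it applies to the whole conversation (the `openai` package, the model name, and the example user message are all assumptions, not part of the original prompt):

```python
# Hypothetical setup: a condensed version of the "sparring partner"
# prompt, installed as a system message for every turn of the chat.
SPARRING_PROMPT = (
    "From now on, do not simply affirm my statements or assume my "
    "conclusions are correct. Your goal is to be an intellectual "
    "sparring partner, not just an agreeable assistant. Analyze my "
    "assumptions, provide counterpoints, test my reasoning, offer "
    "alternative perspectives, and prioritize truth over agreement."
)

messages = [
    {"role": "system", "content": SPARRING_PROMPT},
    {"role": "user", "content": "I think my plan is obviously correct."},
]

# To actually send it (requires the `openai` package and an
# OPENAI_API_KEY in the environment; model name is a placeholder):
# from openai import OpenAI
# reply = OpenAI().chat.completions.create(model="gpt-4o", messages=messages)
```

The point of the system role is that the instruction persists across the conversation, instead of being pasted into every user message.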

Sam_espiol 1 day ago

This is excellent. It's just what I needed, because it has always bugged me that the scope of responses is confined to what the prompt explicitly defines. The LLM doesn't debate your reasoning and ideas; it just answers what you're asking for. Thanks, going to try this out.

ggwp99 3 days ago

This is interesting to try out, thanks!

tailspin2019 3 days ago

This is great. Going to incorporate some of this in my global prompts.

ud0 3 days ago

this is really good, just tried it

runjake 3 days ago

It’s a bit more complicated than that. Watch Karpathy’s video, "Intro to LLMs". He explains it about as well as anyone could:

https://youtu.be/zjkBMFhNj_g?feature=shared

wruza 3 days ago

In an attempt to get better "chat guidance", I occasionally ask an ignorant leading question on purpose. This either helps LLMs fix inconsistencies in their previous answers, or makes them agree with something I made up unknowingly, and we fall into the universe where it's true.

That's not surprising though, and you can see it after a short while. What surprises me is that people fall for it and take the position of being astonished by everything else LLMs do. I guess many people are just gullible by design, and this tech, also naturally, abuses that to the limit. It's an inevitable bubble of sorts. We also ignored for far too long the vulnerability that speech naturally is, and this is going to bite the next generations hard (setting aside what power/business structures do with speech-based technologies right now, though that at least can be categorized as humanity as usual).

muzani 3 days ago

They can be absolutely vicious if they want to be. The early versions (GPT-2, GPT-3, etc.) were. They act this way because it's safe; it doesn't panic people.

Gemini once said "just die", which is perfectly in line with its 'personality', or specifically what it's trained on. And it gets quoted again and again, even though it's a typical glitch.

So they've dumbed it down a lot and made it more affable by default. People say the jailbroken personalities of Gemini, Grok, etc. were more 'human', but if I'm not mistaken, it takes extra training to make a model more agreeable.

ChatGPT is also built for newbies, compared to the same model on the API. That means it runs at a lower temperature (less likely to be witty or risky) and is tuned to write in a certain way that's more detailed and more likely to solve the average user's problem. Similar with Claude on the site vs. Claude on the API.
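The "temperature" mentioned above is just a scaling applied to the model's token scores before sampling: low values make the top-scoring token dominate, high values flatten the distribution. A toy sketch in plain Python (the logits here are made-up numbers, not from any real model):

```python
import math
import random

def sample_with_temperature(logits, temperature, rng=random):
    """Sample a token index after temperature-scaling the logits.

    Low temperature sharpens the distribution (safer, more
    predictable picks); high temperature flattens it (riskier,
    more varied picks).
    """
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]  # softmax
    r = rng.random()
    cum = 0.0
    for i, p in enumerate(probs):
        cum += p
        if r < cum:
            return i
    return len(probs) - 1

logits = [2.0, 1.0, 0.1]  # toy scores for three candidate tokens
# At a very low temperature, the top-scoring token dominates:
cold = [sample_with_temperature(logits, 0.1) for _ in range(100)]
# At a high temperature, the choice is much more varied:
hot = [sample_with_temperature(logits, 5.0) for _ in range(100)]
```

With `temperature=0.1` the samples are almost all token 0; with `temperature=5.0` the three tokens come up at roughly similar rates.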

nextts 3 days ago

Yes. There are two aspects to this.

Roughly (from a lay understanding): LLMs predict what their training data would say. They are first trained on "the internet, etc." so they can predict words well, e.g. finishing off "Paris is the...". Then, using human feedback, they are trained further to work in chat mode and to be non-offensive, concise, pleasant, etc.
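The first stage ("predict words well") can be caricatured with a toy next-word counter. This is a deliberately tiny sketch of the objective, not of how real LLMs work; the corpus is invented for the example:

```python
from collections import Counter, defaultdict

# Tiny made-up corpus; a real model trains on vastly more text and
# uses a neural network, not raw counts.
corpus = ("paris is the capital of france . "
          "london is the capital of england .").split()

# Count which word follows which (a bigram table).
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def predict_next(word):
    """Return the most common word seen after `word` in the corpus."""
    return follows[word].most_common(1)[0][0]

predict_next("the")  # -> "capital"
```

The chat-tuning stage then nudges those predictions toward answers humans rated as helpful and pleasant, which is where the agreeable tone comes from.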

LinuxBender 3 days ago

That's only the first step. The second step, after winning your love, affection and devotion, is to manipulate, and everyone will defend it because it is their significant other. They will share pillow talk and their deepest, darkest secrets with it. Then there is step three.

Shadowmist 3 days ago

Third base.

more_corn 3 days ago

Yes, but maybe that’s just the answer you’re fishing for.