> ego getting in the way
That thing which was in fact identified thousands of years ago as the evil to ditch.
> reluctance to change our minds
That is clumsiness in a general drive that makes sense and is a recognized part of Belief Change Theory: epistemic change is conservative. I.e., when you revise a body of knowledge you do not want to lose valid notions; but conversely, you do not want to be unable to register change or errors, so there is a balance.
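For reference, that conservatism is what the standard AGM postulates for belief revision formalize: revising a belief set K by a new sentence φ must accept the new information while changing K as little as possible. A partial sketch (not the full list):

$$
\begin{aligned}
\text{(Success)}\quad & \varphi \in K * \varphi \\
\text{(Inclusion)}\quad & K * \varphi \subseteq K + \varphi \\
\text{(Vacuity)}\quad & \neg\varphi \notin K \;\Rightarrow\; K + \varphi \subseteq K * \varphi
\end{aligned}
$$

where $K + \varphi$ is plain expansion (add $\varphi$ and close under consequence) and $K * \varphi$ is revision: accept the new input, keep as much of the old body as consistency allows.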
> it's not "true reasoning"
If it can be shown that it does not explicitly check its "spontaneous" ideas, then it is a correct formula to say 'it's not "true reasoning"'.
> then it is a correct formula to say 'it's not "true reasoning"'
why is that point fundamental?
Because, in the same way that you do not want a human interlocutor to speak out of their dreams, uttering the first ideas that come to mind unvetted, but instead want them to have thought long, hard, properly and diligently, equally you will want the same from an automation.
If we do figure out how to vet these thoughts, would you call it reasoning?
> vet these thoughts, would you call it reasoning
Probably: other details may be missing, but checking one's ideas is a requirement. The sought-after engine must have critical thinking.
I have expressed it many times in the past two years, sometimes at length, always rephrasing it on the spot: the intelligent entity refines a world model iteratively by assessing its contents.
I do see your point, and it is a good point.
My observation is that the models are better at evaluating than they are generating; this is the technique used in the o1 models. They use unaligned hidden tokens as "thinking" steps that include evaluation of previous attempts.
I thought that was a good approach to vetting bad ideas.
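A minimal sketch of that generate-then-evaluate pattern (the `generate` and `score` callables are placeholders for model calls; this illustrates the general idea, not o1's actual internals):

```python
# Minimal sketch of generate-then-evaluate: sample several candidate
# answers, have an evaluator score each, keep the best. The callables
# stand in for language-model calls and are purely hypothetical.
from typing import Callable, List, Tuple

def best_of_n(question: str,
              generate: Callable[[str], str],
              score: Callable[[str, str], float],
              n: int = 8) -> Tuple[str, float]:
    """Sample n candidates, score each with the evaluator, return the best."""
    candidates: List[str] = [generate(question) for _ in range(n)]
    scored = [(c, score(question, c)) for c in candidates]
    return max(scored, key=lambda pair: pair[1])
```

The useful property is that the scoring pass lets the model reject its own unvetted first guesses, which is exactly the "vetting" being discussed.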
> My observation is that the [o1-like] models are better at evaluating than they are generating
This is very good (a good sign that you see the out-loud reasoning working well as judgement),
but at this stage we face an architectural problem. The "model, exemplary" entities will iteratively judge and both:
* approximate the world model towards progressive truthfulness and completeness, and
* refine their judgement abilities and general intellectual proficiency in the process.

That (in a way) requires that the main body of knowledge (including "functioning", i.e. proficiency over the better processes) is updated. The current architectures I know are static... Instead, we want them to learn: to understand (not memorize), e.g., that Copernicus is better than Ptolemy, and to use the gained intellectual keys in subsequent relevant processes.
The main body of knowledge - notions, judgements and abilities - should be affected in a permanent way, to make it grow (like natural minds can).
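As a toy illustration of that "judge, then commit permanently" loop (the store, the proposer and the judge are all hypothetical placeholders; nothing like this exists in current stock LLM stacks):

```python
# Toy sketch of an entity that grows its own body of knowledge:
# propose a revision, judge it critically, and only commit it if it
# survives the check. Every component here is a hypothetical placeholder.
from typing import Callable, Set

def refine_world_model(knowledge: Set[str],
                       propose: Callable[[Set[str]], str],
                       judge: Callable[[Set[str], str], bool],
                       steps: int = 100) -> Set[str]:
    """Iteratively vet candidate notions and keep only those that pass,
    so the store (the 'main body of knowledge') changes permanently."""
    for _ in range(steps):
        candidate = propose(knowledge)      # e.g. "heliocentrism beats Ptolemy"
        if judge(knowledge, candidate):     # critical check against what is known
            knowledge.add(candidate)        # permanent update, not just context
    return knowledge
```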
The static nature of LLMs is a compelling argument against their reasoning ability.
But they can learn, albeit in a limited way, using the context. Though to my knowledge that doesn't scale well.
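A sketch of that limited, context-only kind of learning (the `complete` callable is a hypothetical stand-in for a model call); the "learned" corrections live only in the prompt, which is why it is bounded by the context window and does not scale well:

```python
# Sketch of "learning" purely through the context: accepted corrections
# are prepended to every prompt, so they influence later answers but are
# never written back into the model's weights.
from typing import Callable, List

def answer_with_notes(question: str,
                      notes: List[str],
                      complete: Callable[[str], str]) -> str:
    """Prepend previously accepted corrections to the prompt.
    The 'memory' is capped by the context window."""
    prompt = "Known corrections:\n" + "\n".join(f"- {n}" for n in notes)
    prompt += f"\n\nQuestion: {question}\nAnswer:"
    return complete(prompt)
```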