Item 43625827

ChuckMcM • 10 days ago

I had the funny thought that this is exactly what a sentient AI would write "stop looking here, there is nothing to see, move along." :-)

I (like vannevar apparently) didn't feel Cyc was going anywhere useful, there were ideas there, but not coherent enough to form a credible basis for even a hypothesis of how a system could be constructed that would embody them.

I was pretty impressed by McCarthy's blocks world demo, later he and a student formalized some of the rules for creating 'context'[1] for AI to operate within, I continue to think that will be crucial to solving some of the mess that LLMs create.

For example, the early failures of LLMs suggesting that you could make salad crunchy by adding rocks was a classic context failure, data from the context of 'humor' and data from the context of 'recipes' intertwined. Because existing models have no context during training, there is nothing in the model that 'tunes' the output based on context. And you get rocks in your salad.

[1] https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&d...

musicale • 10 days ago

> there remains no evidence of its general intelligence

This seems like a high bar to reach.

We all know that symbolic AI didn't scale as well as LLMs trained on huge amounts of data. However, as you note, it also tried to address many things that LLMs still don't do well.

3 replies

ChuckMcM • 10 days ago

This is exactly correct, LLMs did scale with huge data, symbolic AI did not. So why? One of the things I periodically ask people working on LLMs is "what does a 'parameter' represent? The simplistic answer is 'it's a weight in a neural net node' but that doesn't much closer. Consider something like a bloom filter where a '0' bit represents the nth bit of all hashes of strings this filter has not seen. I would be interested in reading a paper that does a good job of explaining what a parameter ends up representing in an LLM model.[1]

I suspect that McCarthy was on to something with the context thing. Organic intelligence certainly fails in creative ways without context it would not be disqualifying to have AI fail in similarly spectacular ways.

[1] I made a bit of progress on this considering it to be the permeability for progress such that the higher the weight the easier it was to 'pass thorough' this particular neuron but the cyclic nature of the graph makes a purely topological explanation pretty obtuse :-).

4 replies

thesz • 9 days ago

> LLMs did scale with huge data, symbolic AI did not.

Symbolic AI have not had a privilege to be applied or "trained" with huge data. 30 millions assertions is not a big number.

1 reply

musicale • 9 days ago

This is correct. Those 30M assertions were basically entered by hand.

joe_the_user • 9 days ago

LLMs did scale with huge data, symbolic AI did not. So why? [1]

Neural networks, not LLMs in particular, were just about the simplest thing that could scale - they scaled and everything else has been fine-tuning. Symbolic AI basically begins with existing mathematical models of reality and of human reason and indeed didn't scale.

The problem imo is: The standard way mathematical modeling works[2] is you have a triple of <data, model-of-data, math-formalism>. The math formalism characterizes what the data could be, how data diverges from reality etc. The trouble is that the math formalism really doesn't scale even if a given model scales[3]. So even if you were to start plugging numbers into some other math model and get a reality-approximation like an LLM, it would be a black box like an LLM because the meta-information would be just as opaque.

Consider the way Judea Pearl rejected confidence intervals and claimed probabilities were needed as the building blocks for approximate reasoning systems. But a look at human beings, animals or LLMs shows that things that "deal with reality" don't have and couldn't access to "real" probabilities.

I'd just offer that I believe that for a model to scale, the vast majority of it's parameters would have to be mathematically meaningless to us. And that's for the above reasons.

[1]. Really key point, imo [2]. That innclude symbolic and probabilistic model "at the end of the day" [3]. Contrast the simplicity of plugging data into a regression model versus the multitudes of approaches explaining regression and loss/error functions etc.

krackers • 10 days ago

>I would be interested in reading a paper that does a good job of explaining what a parameter ends up representing in an LLM model.

https://distill.pub/2020/circuits/ https://transformer-circuits.pub/2025/attribution-graphs/bio...

1 reply

ChuckMcM • 9 days ago

That's an interesting paper and worth reading. Not sure it has answered my question but I did learn some things from it that I had not considered.

This was the quote I resonated with :-)

"... the discoveries we highlight here only capture a small fraction of the mechanisms of the model."

It sometimes feels a bit like papers on cellular biology with DNA discussions in which descriptions of the enzymes and proteins involved are insightful but the mechanism that operates the reaction remains opaque.

YeGoblynQueenne • 9 days ago

>> This is exactly correct, LLMs did scale with huge data, symbolic AI did not. So why?

Like the rock salad you're mixing up two disparate contexts here. Symbolic AI like SAT solvers and planners is not trying to learn from data and there's no context in which it has to "scale with huge data".

Instead, what modern SAT solvers and planners do is even harder than "scaling with data" - which, after all, today means having imba hardware and using it well. SAT solving and planning can't do that: SAT is NP-complete and planning is PSPACE-complete so it really doesn't matter how much you "scale" your hardware, those are not problems you can solve by scaling, ever.

And yet, today both SAT and planning are solved problems. NP complete? Nowadays, that's a piece of cake. There are dedicated solvers for all the classical sub-categories of SAT and modern planners can solve planning problems that require sequences of thousands of actions. Hell, modern planners can even play Atari games from pixels alone, and do very well indeed [1].

So how did symbolic AI manage those feats? Not with bigger computers but precisely with the approach that the article above seems to think has failed to produce any results: heuristic search. In SAT solving, the dominant approach is an algorithm called "Conflict Driven Clause Learning", that is designed to exploit the special structure of SAT problems. In Planning and Scheduling, heuristic search was always used, but work really took off in the '90s when people realised that they could automatically estimate a heuristic cost function from the structure of a planning problem.

There are parallel and similar approaches everywhere you look at, in classical AI problems, like verification, theorem proving, etc, and that work has even produced a few Turing awards [2]. But do you hear about that work at all, when you hear about AI research? No, because it works, and so it's not AI.

But it works, it runs on normal hardware, it doesn't need "scale" and it doesn't need data. You're measuring the wrong thing with the wrong stick.

____________

[1] Planning with Pixels in (Almost) Real Time: https://arxiv.org/pdf/1801.03354 Competitive results with humans and RL. Bet you didn't know that.

[2] E.g. Pnueli for temporal logic in verification, or Clarke, Emerson and Sifakis, for model checking.

4 replies

HarHarVeryFunny • 9 days ago

I think the problem with trying to hand-create symbolic rules for AI is that things like natural language, and the real world, are messy. Even with fuzzy rules you are never going to be able to accurately capture all the context dependencies and nuances, which may anyways be dynamic. Learning from real world data is the only realistic approach, although I don't think language models are the answer either - you need a system that is continually learning and correcting it's own errors.

CYC was an interesting experiment though. Even though it might have been expected to be brittle due to the inevitable knowledge gaps/etc, it seems there was something more fundamentally wrong with the approach for it not to have been more capable. An LLM could also be regarded as an expert system of sorts (learning its own rules from the training data), but some critical differences are perhaps that the LLM's rules are as much about recognizing context for when to apply a rule as what the rule itself is doing, and the rules are generative rather than declarative - directly driving behavior rather than just deductive closure.

1 reply

YeGoblynQueenne • 9 days ago

Yeas, hand-coding rules doesn't work in the long run. But burning through the world's resources to approximate a huge dataset isn't a viable long-term solution for anything either.

joe_the_user • 9 days ago

SAT is NP-complete and planning is PSPACE-complete so it really doesn't matter how much you "scale" your hardware, those are not problems you can solve by scaling, ever.

It seems like you are not framing NP-completeness properly. An NP complete problem is simply worst case hard. Such a problem can have many solvable instances. With some distributions of randomly selected SAT problem, most instances can be quickly solvable. SAT solving contests often involve hand-constructed SATs translated from other domains and the entrants similarly add methods for these "special cases". So NP-completeness isn't a barrier to SAT-solvers scaling by itself.

1 reply

YeGoblynQueenne • 9 days ago

I generally agree with your points, my point was mainly that the concept of "scaling" as meant in machine learning doesn't have an analogy in SAT solving and other classical AI tasks. Nobody's building large data centers to solve SAT problems and data is not "the new oil" in SAT solving, and in the other classical AI disciplines I mention above. In short, those are not data-driven fields.

thesz • 9 days ago

> Symbolic AI like SAT solvers and planners is not trying to learn from data and there's no context in which it has to "scale with huge data".

Actually, they do. Conflict-Driven Clause Learning (CDCL) learns from conflicts encountered during working on the data. The space of inputs they are dealing with oftentimes is in the order of the number of atoms in Universe and that is huge.

1 reply

YeGoblynQueenne • 9 days ago

"Learning" in CDCL is a misnomer: the learning process is Resolution and it's deductive (reasoning) not inductive (learning).

2 replies

thesz • 7 days ago

You invented a new kind of learning that somewhat contradicts usual definition [1] [2].

  [1] https://www.britannica.com/dictionary/learning
  [2] https://en.wikipedia.org/wiki/Learning

"Learning" in CDCL is perfectly in line of "gaining knowledge."

joe_the_user • 9 days ago

I'm pretty sure most "industrial scale" SAT solvers involve both deduction and heuristics to decide which deductions to make and which to keep. At a certain scale, the heuristics have to be adaptive and then you have "induction".

1 reply

YeGoblynQueenne • 9 days ago

I don't agree. The derivation of new clauses by Resolution is well understood as deductive and the choice of what clauses to keep doesn't change that.

Resolution can be used inductively, and also for abduction, but that's going into the weeds a bit- it's the subject of my PhD thesis. Let me know if you're in the mood for a proper diatribe :)

2 replies

thesz • 7 days ago

Take a look at Satisfaction-Driven Clause Learning [1].

[1] https://www.cs.cmu.edu/~mheule/publications/prencode.pdf

joe_the_user • 7 days ago

I'd love a diatribe if you're still following this post.

1 reply

EarlKing • 6 days ago

As would I.

You know, this seems like yet another reason to allow HN users to direct message each other, or at least receive reply notifications. Dang, why can't we have nice things?

1 reply

YeGoblynQueenne • 3 days ago

Oh, hi guys. Sorry just saw this.

Oh gosh I gotta do some work today, so no time to write what I wanted. Maybe watch this space? I'll try to make some time later today.

musicale • 9 days ago

> No, because it works, and so it's not AI

This is an important point. Hard "AI" problems are no longer "AI" once we have good algorithms and/or heuristics to solve them.

otabdeveloper4 • 10 days ago

Well, we haven't tried symbolic AI with huge amounts of data. It's a hard problem.

(And ironically this problem is much easier now that we have LLMs to help us clean and massage textual data.)

adastra22 • 10 days ago

Such as what? What can GOFAI do well that LLMs still cannot?

4 replies

sgt101 • 10 days ago

I think logical reasoning - so reasoning about logical problems, especially those with transitive relations like two way implication. A way round this is to get them to write prolog relations and then reason over them... with prolog. This isn't a fail - it's what things like prolog do, and not what things like nns do. If I was asked to solve these problems I would write prolog too.

I think quite a lot of planning.

I think scheduling - I tried something recently and GPT4 wrote python code which worked for very naive cases but then failed at any scale.

Basically though - trusted reasoning. Where you need a precise and correct answer LLM's aren't any good. They fail in the limit. But where you need a generally decent answer they are amazing. You just can't rely on it.

Whereas GOFAI you can, because if you couldn't the community thew it out and said it was impossible!

2 replies

miki123211 • 10 days ago

This has always been the case with ML.

ML is good at fuzzy stuff, where you don't have a clear definition of a problem (what is spam? what is porn?), "I know it when I see it", or when you don't have a clear mathematical algorithm to solve the problem (think "distinguishing dogs and cats").

When you have both (think sorting arrays, adding numbers), traditional programming (and that includes Prolog and the symbolic AI stuff) is much better.

LLMs will always be much worse than traditional computer programs at adding large numbers, just as traditional computer programs will always be worse at telling whether the person in the image is wearing proper safety equipment.

For best results, you need to combine both. Use LLMs for the "fuzzy stuff", converting imprecise English or pictures into JSON, Python, Wolfram, Prolog or some other representation that a computer can understand and work with, and then use the computer to do the rest.

Let's say you're trying to calculate how much proteins there are per 100 grams of a product, you have a picture of the label, but the label only gives you proteins per serving and the serving size in imperial units. The naive way most programmers try is to ask an LLM to give them proteins per 100g, which is obviously going to fail in some cases. The correct way is to ask the LLM for whatever unit it likes, and then do the conversion on the backend.

adastra22 • 9 days ago

I guess that's a fine distinction I don't make. If the problem requires the AI to write a prolog program to solve, and it is capable of writing the necessary prolog code, then I don't see the practical or philosophical difference from taking the transitive step and saying the AI solved it. If I asked you to solve an air traffic control problem and you did so by writing prolog, no one would try to claim you weren't capable of solving it.

Agentic LLMs can solve complicated reasoning and scheduling problems, by writing special-purpose solutions (which might resemble the things we call GOFAI). It's the nature of AGI--which LLMs assuredly are--that they can solve problems by inventing specialized tools, just as we do.

1 reply

cess11 • 9 days ago

Can you show us a log from when you gave an LLM a scheduling problem or something and it decided to solve it with Prolog or Z3 or something?

1 reply

adastra22 • 9 days ago

On mobile so I’m not sure how to export a chat log, but the following prompts worked with ChatGPT:

1: I need to schedule scientific operations for a space probe, given a lot of hard instrument and schedule constraints. Please write a program to do this. Use the best tool for the job, no matter how obscure.

2: This is a high-value NASA space mission and so we only get one shot at it. We need to make absolutely sure that the solution is correct and optimal, ideally with proofs.

3: Please code me up a full example, making up appropriate input data for the purpose of illustration

I got an implementation that at first glance looks correct using the MiniZinc constraint solver. I’m sure people could quibble, but I was not trying to lead the model in any way. The second prompt was because the first generated a simple python program, and I think it was because I didn’t specify that it was a high value project that needed mission assurance at the start. A better initial prompt would’ve gotten the desired result on the first try.

musicale • 10 days ago

"Tried to address" is not the same as "can do well."

I was responding to PP, but some other (maybe obvious?) examples are logical reasoning and explainability.

As PP suggests, some of the classical symbolic ideas may be applicable or complementary to current approaches.

1 reply

musicale • 9 days ago