RussianCow 5 days ago

LLMs are still very young. We'll get there in time. I don't see how it's any different than optimizing for new CPU/GPU architectures other than the fact that the latter is now a decades-old practice.

th0ma5 5 days ago

Not to pick on you, but this is exactly the objectionable handwaving. What makes you think we'll get there? The kinds of errors these technologies make have not changed, and whatever anyone learns about making them better shifts dramatically from moment to moment, with no one really in control of that. It is different because those other things were deterministic ...

Closi 5 days ago

In comp sci it's been deterministic, but in other scientific disciplines (e.g. medicine) it's not. Also, a lot of science looks non-deterministic until it isn't: medicine is theoretically deterministic, but you have to reason about it experimentally and with probabilities. That doesn't mean novel drugs aren't technological advancements.

And while the kinds of errors haven't changed, the quantity and severity of those errors have dropped dramatically in a relatively short span of time.
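
The "reason about it experimentally and with probabilities" point can be made concrete: treat a model's failure rate like a clinical-trial outcome, estimate it from repeated samples, and attach a confidence interval. A minimal sketch (the error counts are invented for illustration, and the normal-approximation Wald interval is a simplification):

```python
import math

def error_rate_ci(errors: int, trials: int, z: float = 1.96) -> tuple[float, float, float]:
    """Point estimate and ~95% normal-approximation (Wald) interval
    for an empirically observed error rate."""
    p = errors / trials
    margin = z * math.sqrt(p * (1 - p) / trials)
    return p, max(0.0, p - margin), min(1.0, p + margin)

# Hypothetical numbers: an older model failing 40/100 prompts vs. a
# newer one failing 5/100. The intervals don't overlap, so the
# improvement is measurable even though each individual run is stochastic.
old = error_rate_ci(40, 100)
new = error_rate_ci(5, 100)
print(old)
print(new)
```

That's the sense in which a non-deterministic system can still show deterministic-looking progress: you can't predict any single output, but you can measure the distribution.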

th0ma5 4 days ago

The problem has always been that every token is suspect.
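
The "every token is suspect" claim follows from how autoregressive decoding works: each token is sampled from a probability distribution conditioned on the tokens before it, so every step is a chance to diverge. A toy sketch (the vocabulary and probabilities here are made up, not a real LLM):

```python
import random

# Toy next-token distributions: each token is drawn probabilistically,
# so any step in the sequence can go a different way.
VOCAB = {
    "The": {"cat": 0.6, "dog": 0.3, "moon": 0.1},
    "cat": {"sat": 0.7, "flew": 0.3},
    "dog": {"barked": 0.8, "melted": 0.2},
    "moon": {"landing": 0.5, "cheese": 0.5},
}

def generate(start: str, steps: int, rng: random.Random) -> list[str]:
    tokens = [start]
    for _ in range(steps):
        dist = VOCAB.get(tokens[-1])
        if dist is None:
            break
        choices, weights = zip(*dist.items())
        tokens.append(rng.choices(choices, weights=weights)[0])
    return tokens

# Two runs with different seeds can diverge at any token.
print(generate("The", 2, random.Random(1)))
print(generate("The", 2, random.Random(2)))
```

With a fixed seed the output is reproducible, but across ordinary runs no individual token is guaranteed, which is the crux of the disagreement in this thread.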

Closi 3 days ago

It's the whole answer being correct that matters, and if you compare GPT-3 with where we are today, only five years later, the progress in accuracy, knowledge and intelligence is jaw-dropping.

th0ma5 1 day ago

I have no idea what you're talking about, because they still screw up in exactly the same way as GPT-3.

girvo 5 days ago

> I don't see how it's any different than optimizing for new CPU/GPU architectures

I mean, that seems wild to say to me. Those architectures have documentation and aren't magic black boxes that we chuck inputs at and hope for the best: that's pretty much what we do with LLMs.

If that's how you optimise, I'm genuinely shocked.

swyx 5 days ago

I bet if we talked to a real low-level hardware systems/chip engineer, they'd laugh and take another shot at how we put them on a pedestal.

girvo 5 days ago

Not really, in my experience. There are still fundamental differences between designed systems and trained LLMs.