Item 43999298

"Nobody in CS is approaching the computer as an entirely black box and making up how they think or hope it works."

That is literally how we approach transformers.

Who is "we"? Lots of people (including me) know how transformers work. Just because we can't do all the math in our heads quickly enough to train a model or run inference mentally doesn't mean we don't know mechanically how they work.

1 reply

ImHereToVote • 1 day ago

We know how they are trained. We just don't know how the trained model works, since the program is emergent.

1 reply

danielmarkbruce • 22 hours ago

Lol, we also know how inference works. The fact that LLMs turned out to be surprisingly effective doesn't mean we don't know how they work. There are many fields where we know the underlying physics and it's just difficult to actually predict real-world results because there are so many calculations. What's next, you are going to tell an aerospace engineer that flight is "emergent" because we need to run simulations or experiments?

1 reply

ImHereToVote • 18 hours ago

The actual program is a black box. We have been able to dissect some details but the whole system is hard to understand. The program is grown more than developed. Understanding the concept of inference doesn't help you much.

1 reply

danielmarkbruce • 17 hours ago

"The program" is some silly abstraction you've made up. If you don't understand the underlying mathematical operations, that's fine, but many of us do. And they aren't that complicated in the grand scheme.

Every complex system is hard to understand due to the number of variables vs. human working memory.

1 reply

ImHereToVote • 5 hours ago

This is like saying that understanding water phase changes makes you competent at ice skating. You know what I'm talking about.