I agree. The "container" intuition for Monads leaves you stuck when you try to contemplate IO (or even Promises, these days), because the "bind" operator looks like it does something impossible: extract "the" `a` from the `IO a`, when you have no idea what it is. (Trust me, I spent a long time stuck at this point.) Better to think of Monad as "Applicative + join" (you need Applicative to get `pure`).
If you think of Monads in terms of `fmap` + `join :: Monad m => m (m a) -> m a`, then you don't need to imagine an "extraction" step and your intuition is correct across more instances. Understanding `join` gives you an intuition that works for all the monads I can think of, whereas the container intuition only works for `Maybe` or `Either e` (not even `[]`, even though it _is_ a container). You can define each of `>>=`/`join`/`>=>` in terms of `fmap`/`pure` plus either of the other two, and it is an illuminating exercise to do so. (That `class Monad` defines `>>=` as its method is mostly due to technical GHC reasons rather than anything mathematical.)
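For anyone who wants to check their answers to that exercise, here is one possible set of definitions. It is only a sketch: the names (`bindViaJoin`, `joinViaBind`, and so on) are made up for illustration so they don't clash with the Prelude, and there are other equally valid ways to write each one.

```haskell
import Control.Monad (join, (>=>))

-- All names below are hypothetical, chosen only to avoid clashing with
-- the real (>>=), join and (>=>).

-- (>>=) from fmap + join: map first, then flatten the extra layer.
bindViaJoin :: Monad m => m a -> (a -> m b) -> m b
bindViaJoin m f = join (fmap f m)

-- join from (>>=): bind the outer layer with the identity function.
joinViaBind :: Monad m => m (m a) -> m a
joinViaBind mm = mm >>= id

-- (>=>) from (>>=): run the first arrow, feed its result to the second.
kleisliViaBind :: Monad m => (a -> m b) -> (b -> m c) -> (a -> m c)
kleisliViaBind f g = \a -> f a >>= g

-- (>>=) from (>=>): view `m` as an arrow out of (), compose, apply to ().
bindViaKleisli :: Monad m => m a -> (a -> m b) -> m b
bindViaKleisli m f = (const m >=> f) ()

-- join from (>=>): same trick, composing with the identity arrow.
joinViaKleisli :: Monad m => m (m a) -> m a
joinViaKleisli mm = (const mm >=> id) ()
```

Loaded into GHCi, each of these should agree with the library version on whatever instance you try, for example `bindViaJoin (Just 8) (\x -> Just (x + 1))` against `Just 8 >>= \x -> Just (x + 1)`.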
I prefer the "join" approach for beginners too, but `>>=` has become so pervasive that I feel bad trying to explain it that way. Turning people loose on monad-heavy code with that approach still leaves them needing to convert their understanding into `>>=` anyhow.
One does wonder about the alternate world where that was the primary way people interacted with it.
I think you don't have to teach people to program with `join`, just that `m >>= f = join (fmap f m)`. It explains away the "get a value out" question while still teaching the most commonly used function in the interface.
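A rough sketch of why that equation dissolves the "extraction" worry, using `IO` and `[]`. The names `nested`, `echo` and `expand` are invented for this example. The point is that `fmap` only ever produces a nested value, and `join` only ever collapses two layers into one; nothing is pulled out of the `IO`.

```haskell
import Control.Monad (join)

-- fmap alone yields an action that, when run, returns another action.
-- Nothing has been read or printed at this point.
nested :: IO (IO ())
nested = fmap putStrLn getLine

-- join collapses the two layers into a single action; this is the same
-- as `getLine >>= putStrLn`, by the equation m >>= f = join (fmap f m).
echo :: IO ()
echo = join nested

-- The same two steps on lists: fmap gives a list of lists, join
-- concatenates them.
expand :: [Int] -> [Int]
expand xs = join (fmap (\x -> [x, x * 10]) xs)
```

In GHCi, `expand [1, 2]` gives `[1,10,2,20]`, the same as `[1, 2] >>= \x -> [x, x * 10]`.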