What limitations specifically are you referring to?
The packet size is a major one; the addressing scheme is another. The lack of larger packets leads to nonsense like the "freshness manager" in things like AUTOSAR's SecOC. Every subsequent CAN extension has tried to rectify both of these in different ways and inevitably failed, which leads to the next layer up the networking stack reinventing the wheel badly. Eventually you end up with UDS.
Yeah, that 64-byte frame size. In practice, I've always seen it abstracted away into a layer on top, but if you're working low-level (e.g. implementing that layer), it's a pain, because a given packet may have to be split across multiple frames.
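To make the multi-frame point concrete, here is a rough sketch of what such a layer has to do. The 2-byte header (sequence number plus a "last frame" flag) and the names `segment` and `reassemble` are made up for illustration; this is not ISO-TP or any real transport protocol, just the general idea, assuming a 64-byte frame payload.

```python
from typing import List

FRAME_SIZE = 64                          # assumed maximum frame payload, in bytes
HEADER_SIZE = 2                          # hypothetical header: [sequence number, last-frame flag]
CHUNK_SIZE = FRAME_SIZE - HEADER_SIZE    # packet bytes carried per frame


def segment(packet: bytes) -> List[bytes]:
    """Split one packet into frames of at most FRAME_SIZE bytes each."""
    # An empty packet still produces one (header-only) frame.
    chunks = [packet[i:i + CHUNK_SIZE]
              for i in range(0, len(packet), CHUNK_SIZE)] or [b""]
    if len(chunks) > 256:
        # The sequence number is a single byte in this sketch.
        raise ValueError("packet too large for a 1-byte sequence number")
    frames = []
    for seq, chunk in enumerate(chunks):
        last = 1 if seq == len(chunks) - 1 else 0
        frames.append(bytes([seq, last]) + chunk)
    return frames


def reassemble(frames: List[bytes]) -> bytes:
    """Rebuild the packet from its frames, which may arrive in any order."""
    if not frames:
        raise ValueError("no frames to reassemble")
    ordered = sorted(frames, key=lambda f: f[0])
    for expected, frame in enumerate(ordered):
        if frame[0] != expected:
            raise ValueError("missing or duplicated frame")
    if ordered[-1][1] != 1:
        raise ValueError("final frame not received")
    return b"".join(frame[HEADER_SIZE:] for frame in ordered)
```

Even this toy version needs ordering, loss detection, and a size limit, which is the bookkeeping that usually gets pushed into a layer on top (or into hardware).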
I hold a patent on the design of a hardware offload engine to hide the handling of multiple frames from a main CPU.