Some amount of buffering is inevitable in any packet-switched network, even in a pure Ethernet network with negligible packet loss/corruption, due to the nature of packet switching statistics in general and the TCP congestion control algorithm in particular [1].
The problem with these terrible CPEs is that their buffers are generally far too large, but simply shrinking them isn't trivial either: the "correct" size, as suggested by queueing theory, equals the product of bandwidth and end-to-end latency, and the latter is impossible for the router to know and varies from connection to connection!
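To make that concrete, here's a quick sketch of the bandwidth-delay-product rule of thumb. The link speed and RTT values are purely illustrative, not taken from any real CPE:

```python
# Rule-of-thumb buffer size = bandwidth * RTT (the bandwidth-delay product).
# The numbers below are illustrative only.

def bdp_bytes(bandwidth_bps: float, rtt_s: float) -> float:
    """Bytes that must be in flight to keep a link of the given speed full."""
    return bandwidth_bps / 8 * rtt_s

# A 100 Mbit/s link with 50 ms RTT wants roughly 625 kB of buffer:
print(bdp_bytes(100e6, 0.050))  # 625000.0
# The same link with 10 ms RTT only wants 125 kB, which is exactly why
# no single fixed buffer size can be "correct" for every connection:
print(bdp_bytes(100e6, 0.010))  # 125000.0
```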
The key insight of CoDel is that transient queues are good (they absorb bursts that would otherwise get packets dropped), but standing queues are bad (they absorb nothing and just add latency). I believe it's now fortunately mandatory in newer DOCSIS CPEs (though I have a lot of faith in CPE manufacturers to somehow still botch it).
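That transient-vs-standing distinction can be sketched in a few lines. This is a toy illustration of the idea, not the actual RFC 8289 algorithm (which adds a square-root drop-scheduling control law); the TARGET and INTERVAL constants mirror CoDel's suggested defaults:

```python
# Toy sketch of the CoDel idea: track each packet's time spent waiting
# in the queue (its "sojourn time"). A burst that briefly raises sojourn
# time is tolerated; only if sojourn time stays above TARGET for a full
# INTERVAL is the queue "standing", and we start dropping to signal the
# sender. NOT the full RFC 8289 algorithm.
from collections import deque

TARGET = 0.005    # 5 ms: acceptable persistent queueing delay
INTERVAL = 0.100  # 100 ms: how long delay must persist before dropping

class CoDelSketch:
    def __init__(self):
        self.q = deque()          # entries are (enqueue_time, packet)
        self.first_above = None   # when sojourn time first exceeded TARGET
        self.dropping = False

    def enqueue(self, now, pkt):
        self.q.append((now, pkt))

    def dequeue(self, now):
        while self.q:
            t, pkt = self.q.popleft()
            sojourn = now - t
            if sojourn < TARGET:
                # Delay is fine: reset state and deliver the packet.
                self.first_above = None
                self.dropping = False
                return pkt
            if self.first_above is None:
                self.first_above = now
            if now - self.first_above < INTERVAL:
                # Transient spike: tolerate it and deliver the packet.
                return pkt
            # Standing queue: drop this packet and try the next one.
            self.dropping = True
        return None
```

Note that decisions are based on how long packets actually waited, not on how many bytes are queued — which is what lets CoDel work without knowing the link's bandwidth or the connection's RTT.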
[1] In fact, negligible packet loss for reasons other than congestion is a critical assumption for older, loss-based TCP congestion control algorithms in particular!
Sure, but the real problem at the time was the lack of upstream bandwidth to handle the acknowledgements. It was less "Should I buffer or not?" and more "How much upstream bandwidth do people need to actually make use of their downstream bandwidth?"
One could tell it was the cable modem buffering quite easily by watching ping times to an external IP: when the buffer got full, pings that should have been dropped came back maybe 4, 5, or 6 seconds later.
Only by eliminating the buffer in the cable-company-provided modem entirely could TCP congestion control algorithms work correctly. UDP might suffer, sure, but I didn't really care about UDP packets back then; I cared more about browsing and downloads.