On Zen 3, I am able to use nearly the full 51.2GB/sec from a single CPU core. I have not tried using two, since I got so close to 51.2GB/sec that I assumed going higher was not possible. Off the top of my head, I got 49-50GB/sec, but I last measured a couple of years ago.
By the way, if the cores were able to load things at full speed, they would be able to use 640GB/sec each. That is 2 AVX-512 loads per cycle at 5GHz. Of course, they never are able to do this due to memory bottlenecks. Maybe Intel’s Xeon Max series with HBM can, but I would not be surprised to see an unadvertised internal bottleneck there too. That said, it is so expensive and rare that few people will ever run code on one.
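For anyone who wants to check the 640GB/sec figure, here is a rough sketch of the arithmetic. It assumes 64 bytes per AVX-512 load (512 bits), decimal gigabytes, and a function name I made up for illustration. The comparison at the end just reuses the 51.2GB/sec number quoted above.

```python
# Back-of-the-envelope peak load bandwidth for one core.
# Assumptions (not measurements): 2 loads per cycle, 64 bytes per
# AVX-512 load, 5 GHz clock, and 1 GB = 1e9 bytes.

AVX512_BYTES = 512 // 8  # one 512-bit vector register is 64 bytes


def peak_load_bandwidth_gbps(loads_per_cycle, bytes_per_load, clock_hz):
    """Return the theoretical load bandwidth in GB/sec (hypothetical helper)."""
    return loads_per_cycle * bytes_per_load * clock_hz / 1e9


core_peak = peak_load_bandwidth_gbps(2, AVX512_BYTES, 5_000_000_000)
dram_peak = 51.2  # GB/sec, the memory bandwidth figure quoted above

print(f"per-core load peak: {core_peak:.1f} GB/sec")
print(f"ratio to DRAM peak: {core_peak / dram_peak:.1f}x")
```

So under these assumptions a single core's load units could in principle consume roughly an order of magnitude more than the memory system delivers, which is why the bottleneck is always somewhere in the memory hierarchy rather than in the core.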
People have studied the Xeon Max! Spoiler: yes, it's limited to ~23GB/s per core. Even with all cores active, it can't get anywhere close to the theoretical bandwidth of the HBM. It's a pretty bad design in my opinion.
https://www.ixpug.org/images/docs/ISC23/McCalpin_SPR_BW_limi...
Its total bandwidth is still integer factors better than DDR5 SPR. I think they went for minimal investment and time to market with the SPR + HBM product rather than heavy investment to hit full bandwidth utilization, which may have made sense for Intel overall given the business context.