>It's easy to show that in practice some bytes are more common than others (because random)
I don’t follow. Wouldn’t that be (because not random)?
If you generate a billion bytes with a random byte generator and bin the resulting array into 256 bins, the histogram will not be perfectly flat. You can use that non-flatness to encode your bits more efficiently. I suspect that assigning codes to bytes directly won't work well: the bin counts are so close to each other that you'll struggle to find codes that are efficient enough. Instead, I suspect you can use the second-order difference between specific bytes as the encoded value. That has a much more pronounced distribution, heavily weighted toward small values.
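If anyone wants to check this themselves, here is a rough Python sketch of the binning experiment. It is only an illustration under my own assumptions: the helper names (`byte_histogram`, `empirical_entropy`, `adjacent_abs_differences`) are made up for this post, the sample is far smaller than a billion bytes, Python's seeded `random.Random` stands in for "a random byte generator", and the absolute difference between adjacent bytes is just one possible reading of "difference between bytes".

```python
import math
import random
from collections import Counter


def byte_histogram(data):
    """Count how often each of the 256 byte values occurs in `data`."""
    counts = Counter(data)
    return [counts.get(value, 0) for value in range(256)]


def empirical_entropy(counts):
    """Shannon entropy, in bits per symbol, of the observed frequencies."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def adjacent_abs_differences(data):
    """Absolute difference between each byte and the next (assumed reading)."""
    return [abs(a - b) for a, b in zip(data, data[1:])]


if __name__ == "__main__":
    # Assumption: a seeded PRNG and a 1,000,000-byte sample, for repeatability.
    rng = random.Random(12345)
    data = bytes(rng.getrandbits(8) for _ in range(1_000_000))

    hist = byte_histogram(data)
    print("least / most common byte count:", min(hist), max(hist))
    print("bits per byte (byte values):", empirical_entropy(hist))

    # Differences lie in 0..255, so the same 256-bin histogram applies.
    diff_hist = byte_histogram(bytes(adjacent_abs_differences(data)))
    print("bits per symbol (abs. differences):", empirical_entropy(diff_hist))
```

The entropy figure is the useful one to look at: it is the average number of bits per symbol that a code built from those observed frequencies would need, so comparing it against 8 shows how much room the non-flatness actually leaves. Note that the absolute difference drops the sign, so it does not carry enough information on its own to rebuild the original bytes.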