As a general principle, absolutely.
In practice, I wonder what size of file we're talking about before you'd see net compression on random data 50% of the time.
I have no intuition whether it's 1 GB, 1 TB, 1 PB, or beyond.
Nope. As your file gets longer, so does the amount of data needed to specify where the repeats are. Roughly, the longest repeat you can expect in n random bits is only about 2·log2(n) bits long, while the back-reference to it costs about log2(n) bits for the offset alone, so the bookkeeping eats whatever you save. See my other estimate in this thread. It fails spectacularly.
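For what it's worth, here's a quick sketch you can run yourself (plain Python, with zlib standing in for a general-purpose compressor; the sizes and trial count are arbitrary). The point isn't zlib specifically, just that on random input the output never comes out smaller, at any of these sizes:

    import os
    import zlib

    # For each size, compress several random buffers and count how often
    # the compressed form is actually smaller than the original.
    for size in (1_000, 100_000, 1_000_000):
        wins = 0
        trials = 20
        for _ in range(trials):
            data = os.urandom(size)          # random bytes: no exploitable structure
            packed = zlib.compress(data, 9)  # DEFLATE at maximum effort
            if len(packed) < len(data):
                wins += 1
        print(f"{size:>10,} bytes: compressed smaller in {wins}/{trials} trials")

Expect 0 wins across the board: the container header and checksum add a few bytes, and there are no repeats long enough to claw them back, so the "50% of the time" threshold never arrives no matter how large the file gets.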