vunderba 4 days ago

Story time. In our Data Structures and Algorithms class during my undergrad, we had a similar challenge posed to us by an associate professor who was quite proud of the fact that it had never been beaten.

The sophistication of my compression understanding at the time extended to lempel-ziv and huffman and well... that's about it. So I was like, “Okay, I’m not going to create the world’s best compression method on a whim.” So I started thinking: if I can’t compress the file, maybe I can just fetch it from somewhere else.

The first thing I tried was writing a program that searches online by uploading the file onto a public FTP. From there, I could fetch it. Simple enough, right?

Well, here’s the issue with that: there’s no guarantee I’d even have internet connectivity. That’s probably the first thing he’d block.

So, he gives us the file. I think it was something like 10 MB, and when I looked through it, the entropy was high. Like, really high. Standard compression tools like 7Z or rar wanted NOTHING to do with it.

Then I started thinking again. Maybe this guy’s running all the submissions back-to-back. He’s probably processing them in sequence because he’s got tons of entries coming in. So I thought: if I can’t download the file, maybe there's a chance it's already on the machine from somebody's previous simulation - I can find it somewhere on the hard drive. My program might not be a great decompression system, but you know what it is? A fantastic file search.

So, I threw together a program that tree searched the entire disk with a simple checksum to verify I’d found the exact right file. If it matched, I’d just copy the file into the current directory and say, “I’m done.” - though if it found the file too quickly it would continue idling for a while so as to avoid suspicion. (I was also banking on the fact that someone else’s program might’ve run before mine, leaving the file lying around somewhere for me to grab.)

Of course, I couldn’t make this look too obvious. I mean, imagine submitting a tiny C++ program claiming it could compress a 10 MB file into just a few kilobytes. He’d immediately know it was fake.

So, I padded the compressed binary, adding a bunch of random hexadecimal and binary junk to bulk it up to around 5 MB. It was still phenomenal—better than any other method on the market for this type of file—but at least it seemed somewhat plausible. I mean, maybe I’d "discovered" some revolutionary new compression algorithm or something and smashed the weissman score.

The contest went on for nearly a month, and the guy had a leaderboard tracking scores. Most people were hovering just slightly above standard compression rates—around 9 MB. Nothing groundbreaking, though.

So, it’s a Friday night. I finish the program, pack it all up, and send it off. Then I go to bed. Needless to say, I woke up with a quite an excited response from my professor.

2
3eb7988a1663 3 days ago

AND?! What was his response once you had to come clean?

Edit: I loved this (minus the missing ending). Really sucked me in with the level of detail.

alt227 3 days ago

Come on dude, great story, missing ending! Fill us in please.