johnisgood 12 hours ago

Yeah, I would have liked to see more examples and an explanation.

1
evmar 12 hours ago

I glanced at the code. It seems to have multiple possible "passes" which reduce the code in various ways, and the passes here not tagged with "C"=>1 are used in the not-c mode recommended in the post.

https://github.com/csmith-project/creduce/blob/31e855e290970...

You can see the pass implementation in files on the left.

wat10000 11 hours ago

The key thing is that the transforms it makes aren’t required to always produce a program that can actually run, or even build.

The user provides a script which determines if a program is “interesting.” A program with build-time errors should be considered “not interesting” in most cases. (And if you’re hunting an incorrectly reported error, you’ll want to make sure you only consider that error to be interesting.)

Then it yoinks out parts of your program and checks if the result is still interesting. If it’s not, that attempt is discarded and it tries something else. If the reduced program is still interesting, it will try yoinking out more stuff. Repeat until you like the result.

There doesn’t need to be any understanding of the program in order for this to work. You just need something where removing some bit of the program has a chance of producing a new program that’s still interesting. That works in most programming languages.

This process can take a long time, and it’s slower when there’s a lower probability of a removal producing an interesting program. Thus heuristics are added: don’t remove random character ranges, but work at a token level. When removing a brace, find the matching brace and remove the insides. That sort of thing. Most modern languages are similar enough to C that many of these rules are helpful for them too. And even if they don’t help, this just slows the process down, but it still works.