If a script were applied that corrected "bad math" and now the LLM could solve complex math problems that you can't one-shot throw at a calculator, what would you call it?
It's a good point.
But this math analogy is not quite appropriate: there's abstract math and there's arithmetic. A good math practitioner (LLM or human) can be bad at arithmetic, yet good at abstract reasoning. The latter doesn't (necessarily) require the former.
In chess, I don't think that you can build a good strategy if it relies on illegal moves, because tactics and strategies are tied.
If I had wings, I'd be a bird.
Applying a corrective script to weed out bad answers is also not "one-shot" solving anything, so I would call your example an elaborate guessing machine. That doesn't mean it's not useful, but that's not how a human being does maths when they understand what they're doing. In fact, you can readily program a computer to solve general maths problems correctly the first time. This is also exactly the problem with saying that LLMs can write software: a series of elaborate guesses is undeniably useful and impressive, but without a corrective guiding hand it is ultimately useless, and it does not demonstrate generalised understanding of the problem space. The dream of AI is surely that the corrective hand is unnecessary?
Then you could replace the LLM with a much cheaper RNG and let it guess until the "bad math filter" let something through.
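To make that concrete, here is a minimal sketch of the guess-and-filter loop, assuming a toy problem (adding two integers) and a filter that already knows how to recognise a correct answer. The names `passes_filter` and `guess_until_accepted` and the 0..100 guess range are made up for illustration; nothing here comes from a real system.

```python
import random


def passes_filter(candidate: int, a: int, b: int) -> bool:
    # Hypothetical "bad math filter": lets through only the correct sum.
    return candidate == a + b


def guess_until_accepted(a: int, b: int, rng: random.Random,
                         low: int = 0, high: int = 100) -> tuple[int, int]:
    # Draw uniform random guesses until the filter accepts one.
    # The guesser knows nothing about addition; the filter does all the work.
    attempts = 0
    while True:
        attempts += 1
        candidate = rng.randint(low, high)  # inclusive on both ends
        if passes_filter(candidate, a, b):
            return candidate, attempts


rng = random.Random(0)  # seeded so that runs are repeatable
answer, attempts = guess_until_accepted(17, 25, rng)
print(f"accepted {answer} after {attempts} guesses")
```

The accepted answer is always correct, but only because the filter can already tell right from wrong; the number of guesses needed grows with the size of the range being guessed over.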
I was once asked by one of the Clueless Admin types if we couldn't just "fix" various sites such that people couldn't input anything wrong. Same principle.