I find that Gemini 2.5 Pro tends to produce working but over-complicated code more often than Claude 3.7, which might be a side effect of its reasoning.
In my experience, whenever these models solve a math or logic puzzle with reasoning, they generate extremely long and convoluted chains of thought, and that convolution carries over into the solution. A human, in contrast, would arrive at a solution in two or three steps. Perhaps something similar is going on with the generated code.