benhurmarcel 4 days ago

I find that Gemini 2.5 Pro tends to produce working but over-complicated code more often than Claude 3.7.

torginus 4 days ago

Which might be a side-effect of the reasoning.

In my experience, whenever these models solve a math or logic puzzle with reasoning, they generate extremely long and convoluted chains of thought, which then show up in the solution.

In contrast, a human would come up with a solution in 2-3 steps. Perhaps something similar is going on here with the generated code.