Huh, it seems Aider added a special mode specifically for Gemini[1] some time after Google's announcement blog post with official performance numbers. I'm still not sure it makes sense to quote that new score alongside the others. In any case, Gemini's 69% is the top score even without the special mode.
[1] https://aider.chat/docs/more/edit-formats.html#diff-fenced:~...
The mode wasn't added after the announcement; Aider has had it for almost a year: https://aider.chat/HISTORY.html#aider-v0320
This benchmark has an authoritative source of results (the leaderboard), so that seems like the obvious number to use.
OK, but it was still added specifically to improve Gemini's results, and nobody else on the leaderboard uses it. Google themselves don't use it when benchmarking their own models against others; they use the regular diff mode that everyone else uses: https://blog.google/technology/google-deepmind/gemini-model-...