antirez 3 days ago

In complicated code I'm developing (Redis Vector Sets) I use both Claude 3.7 and Gemini 2.5 Pro to perform code reviews. Gemini 2.5 Pro can find things that are outside Claude's abilities: even if Gemini, as a general-purpose model, is worse overall, it's inherently more powerful at reasoning about complicated code, threading, logical errors, ...

larodi 3 days ago

Is this to say that you're writing the code manually and having the model check it for various errors, or are you also employing the model for actual code work?

Do you instruct the model to write in "your" coding style?

antirez 2 days ago

For Vector Sets, I decided to write all the code myself, and I use the models very extensively for the following three goals:

1. Design chats: they help a lot as a counterpart to detect if there are flaws in your reasoning. However, all the novel ideas in Vector Sets were consistently found by myself and not by the models; they are not there yet.

2. Writing tests. For the Python test code, I let the model write it, under very strict prompts explaining very precisely what a given test should do.

3. Code reviews: this saved me, and future users, a lot of time, I believe.
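To illustrate point 2, here is a minimal sketch of the kind of Python test one might prompt a model to write for a vector index: a brute-force exact-nearest-neighbor baseline used as ground truth. The function names and the tiny dataset are hypothetical, not from the actual Vector Sets test suite, and the real tests would drive the Redis commands rather than a local function:

```python
import math

def cosine_sim(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def brute_force_nearest(query, items):
    # Exact nearest neighbor by cosine similarity: the ground truth
    # an approximate index would be checked against in a real test.
    return max(items, key=lambda kv: cosine_sim(query, kv[1]))

def test_exact_match_is_nearest():
    items = {
        "a": [1.0, 0.0, 0.0],
        "b": [0.0, 1.0, 0.0],
        "c": [0.7, 0.7, 0.0],
    }
    name, _ = brute_force_nearest([1.0, 0.0, 0.0], items.items())
    assert name == "a"

test_exact_match_is_nearest()
```

The strict prompt would spell out exactly this structure: what the baseline computes, what invariant the assertion checks, and which commands to exercise.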

The way I used the model to write C code was to write throw-away programs to test whether certain approaches could work: benchmarks, verification programs for certain invariants, and so forth.
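To make the "verification programs for certain invariants" idea concrete, here is one plausible example, sketched in Python for brevity (antirez's actual throw-away programs were C, and this is not his code): checking that neighbor links in an HNSW-style graph stay reciprocal after many random insertions.

```python
import random

def add_link(graph, a, b):
    # Link insertion under test: neighbor lists are expected to
    # stay reciprocal (a lists b if and only if b lists a).
    graph.setdefault(a, set()).add(b)
    graph.setdefault(b, set()).add(a)

def check_reciprocal(graph):
    # The invariant check: every edge must exist in both directions.
    for node, neighbors in graph.items():
        for n in neighbors:
            assert node in graph.get(n, set()), f"{node}->{n} not reciprocal"

random.seed(42)
g = {}
for _ in range(1000):
    a, b = random.sample(range(50), 2)
    add_link(g, a, b)
check_reciprocal(g)
print("invariant holds")
```

A throw-away checker like this gets run once against the real data structure, catches the broken invariant (or doesn't), and is then discarded.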

larodi 1 day ago

Insightful

I personally tried long runs with, say, writing a plugin for QGIS, but then I found it is better to actually write some parts of the code personally, so as to remember it. Also, advancing in smaller chunks seems to result in fewer iterations.

Besides, indeed, the whole approach seems not to work so well with ingenious stuff: the model simply fails to understand without a lot of explaining.

LLM-assisted tech writing, though, seems to benefit a lot from the Cursor/Cline approach. Here, more than anywhere else, careful review is also needed.