cluckindan 3 days ago

I agree. I tried something similar: a conversion of a simple PHP library from one system to another. It was only like 500 LOC, but Gemini 2.5 completely failed around line 300, and even before that point its output contained straight-up hallucinations, half-brained additions, wrong namespaces for dependencies, badly indented code, and other PSR style violations. Worse, it also changed working code and broke it.

stavros 3 days ago

Try asking it to generate a high-level plan of how it's going to do the conversion first, then to generate function definitions for the new functions, then have it generate tests for the new functions, then actually write them, while giving it the output of the tests.

It's not like people just one-shot a whole module of code, why would LLMs?
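For anyone who wants to try the staged approach, here is a rough sketch of the loop in Python. It is only an illustration under assumptions: `ask` and `run_tests` are hypothetical callables that you would have to supply yourself (one sends a prompt to whatever model you use and returns its text, the other runs the generated tests and reports the result), and the prompt wording is made up.

```python
from typing import Callable, Tuple

# Hypothetical hooks, not part of any real library:
#   ask(prompt) -> model reply as text
#   run_tests(code, tests) -> (passed, test output)
Ask = Callable[[str], str]
RunTests = Callable[[str, str], Tuple[bool, str]]


def staged_conversion(source: str, ask: Ask, run_tests: RunTests,
                      max_rounds: int = 3) -> Tuple[str, bool]:
    """Convert `source` in stages instead of in one shot."""
    # 1. High-level plan, 2. function definitions, 3. tests, 4. implementation.
    plan = ask(f"Write a high-level plan for converting this code:\n{source}")
    stubs = ask(f"Following this plan, write only the function definitions:\n{plan}")
    tests = ask(f"Write tests for these functions:\n{stubs}")
    code = ask(f"Implement these functions so the tests pass:\n{stubs}\n{tests}")

    # 5. Feed the test output back until the tests pass or we give up.
    for _ in range(max_rounds):
        passed, output = run_tests(code, tests)
        if passed:
            return code, True
        code = ask(f"The tests failed with:\n{output}\nFix this code:\n{code}")

    passed, _ = run_tests(code, tests)
    return code, passed
```

The point of the structure is that each prompt only sees the output of the previous stage, and the model gets real test output rather than being asked to guess whether its code works.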

chrismorgan 3 days ago

> It's not like people just one-shot a whole module of code, why would LLMs?

For conversions between languages or libraries, you often do just one-shot it, writing or modifying code from start to end in order.

I remember 15 years ago taking a 10,000-line Java code base and porting it to JavaScript mostly like this, with only a few areas requiring more involved, non-sequential editing.

dietr1ch 1 day ago

I think this shows how the approach LLMs take is wrong. For us it's easy because we more or less iterate over every function with a simple "translate this" prompt, while still being careful to take notes on anything that might call for a higher-level change.

Maybe the mistake is treating LLMs as capable people rather than as a simple but optimised neuron soup tuned for text.

copperx 2 days ago

So, you didn't test it until the end? Or did you have to build it in such a way that it was partially testable?

chrismorgan 1 day ago

One of the nifty things about the target being JavaScript was that I didn’t have to finish it before I could run it: it was the sort of big library where typical code wouldn’t use most of the functionality. It was audio stuff, so there were a couple of core files that needed more careful porting (from whatever in Java to Mozilla’s Audio Data API, which was a fairly good match), and the rest was fairly routine work that could be done gradually, as I needed each piece or just when I didn’t have anything better to focus on. Honestly, one of the biggest problems was forgetting to prefix instance properties with `this.`

semi-extrinsic 3 days ago

I know many people who can and will one-shot a rewrite of 500 LOC. In my world, 500 LOC is about the length of a single function. I don't understand why we should be talking about generating a high level plan with multiple tests etc. for a single function.

And I don't think this is uncommon. As a random example from GitHub, this file is 1800 LOC and 4 functions. It implements one very specific thing that's part of a broader library. (I have no affiliation with this code.)

https://github.com/elemental/Elemental/blob/master/src/optim...

stavros 3 days ago

> I don't understand why we should be talking about generating a high level plan with multiple tests etc. for a single function.

You don't have to, you can write it by hand. I thought we were talking about how we can make computers write code, instead of humans, but it seems that we're trying to prove that LLMs aren't useful instead.

SpaceNoodled 2 days ago

No, it's simply being demonstrated that they're not as useful as some claim.

stavros 2 days ago

By saying "why do I have to use a specific technique, instead of using it naively, to get what I want"?

SpaceNoodled 2 days ago

"Why do I have to put in more work to use this tool vs. not using it?"

stavros 2 days ago

Which is exactly what I said here:

https://news.ycombinator.com/item?id=43537443

semi-extrinsic 2 days ago

If we have to break the problem into tiny pieces that can be individually tested in order for LLMs to be useful, I think it clearly limits LLM usability to a particular niche of programming.

KronisLV 2 days ago

> If we have to break the problem into tiny pieces that can be individually tested

Isn't this something that we should have been doing for decades of our own volition?

Separation of concerns, the single responsibility principle, all the talk and trends around TDD or at the very least having good test coverage, or writing code that can at least be debugged without going insane (no Heisenbugs; maybe some intermediate variables to stop on in a debugger instead of just endless chained streams, though opinions are split; at the very least code that is readable and not 3 pages per function).

Because when I see long bits of code that I have to change without breaking anything surrounding them, I don't feel confident in doing that even if it's a codebase I'm familiar with, much less trust an AI on it (at that point it might be a "Hail Mary", a last ditch effort in hoping that at least the AI can find method in the madness before I have to get my own hands dirty and make my hair more gray).

stavros 2 days ago

You don't have to, the LLM will.

SpaceNoodled 2 days ago

Only 500 lines? That's minuscule.

blensor 2 days ago

Did you paste it into the chat or did you use it with a coding agent like Cline?

I am majorly impressed with the combination VSCode + Cline + Gemini.

Today I had it duplicate an ESP32 program, converting it from UDP communication to TCP.

It first copied the file (funnily enough by writing it again instead of just doing a straight cp). Then it started to just change all the headers and declarations. Then in a third step it changed one bigger function. And in the last step it changed some smaller functions.

And it reasoned exactly that way: "Let's start with this first ... Let's now do this ...." until it was done.
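To give a feel for what a UDP-to-TCP change touches, here is a minimal sketch using Python's standard `socket` module. This is not the ESP32 program from the comment (which isn't shown), and the function names `send_udp` and `send_tcp` are made up for illustration. It only shows the general shape of the edit: the socket type in the setup changes, and the send path gains an explicit connection.

```python
import socket


def send_udp(host: str, port: int, payload: bytes) -> None:
    """Fire-and-forget datagram: no connection, no delivery guarantee."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))


def send_tcp(host: str, port: int, payload: bytes) -> None:
    """Same payload over TCP: connect first, then write to the stream."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.connect((host, port))
        # sendall keeps writing until the whole payload has been handed off.
        sock.sendall(payload)
```

The receiving side changes in a similar way: a UDP receiver just reads datagrams, while a TCP receiver has to listen, accept a connection, and cope with the fact that a stream has no message boundaries.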

ionwake 2 days ago

I’ve just moved from expensive Claude Code to Cursor and Gemini. What are your thoughts on Cursor vs Cline?

Thank you