> Amen. Seriously. They're tools. Sometimes they work wonderfully. Sometimes, not so much. But I have DEFINITELY found value. And I've been building stuff for over 15 years as well.
Yes, but these lax expectations are what I don't understand.
What other tools in software sometimes work and sometimes don't that you find remotely acceptable? Sure, all tools have bugs, but if your compiler had the same failure rate and usability issues as an LLM, you'd never use it. Yet for some reason the bar is so low for LLMs. It's insane to me how many people have drunk the hype koolaid around these tools.
> What other tools in software sometimes work and sometimes don't that you find remotely acceptable?
Other people.
Seriously, all that advice about not anthropomorphizing computers is taken way too seriously now, and it's doing a number on the industry. LLMs are not a replacement for compilers or other "classical" tools - they're a replacement for people. The whole thing that makes LLMs useful is their ability to understand what some text means - whether it's written in natural language or code. But that task is inherently unreliable because the problem itself is ill-specified; the theoretically optimal solution boils down to "be a simulated equivalent of a contemporary human", and even that wouldn't be perfectly reliable.
LLMs can trivially do tasks in programming that no "classical" tools can, tasks that defy theoretical/formal specification, because they're trained to mimic humans. Plenty of such tasks cannot be done to the standards you and many others expect of software, because they're NP-complete or even equivalent to the halting problem. LLMs look at those and go, "sure, this may be provably not solvable, but actually the user meant X, therefore the result is Y", and succeed at that reliably enough to be useful.
Like, take automated refactoring in dynamic languages. Any nontrivial ones are not doable "classically", because you can't guarantee there aren't references to the thing you're moving/renaming that are generated on the fly by eval() + string concatenation, etc. As a programmer, you may know the correct result, because you can understand the meaning and intent behind the code, the conceptual patterns underpinning its design. DAG walkers and SAT solvers don't. But LLMs do.
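A toy sketch of what I mean (Python here; the names are invented purely for illustration):

    # A static "rename symbol" on load_report can't see the call below,
    # because the reference only exists after string concatenation at runtime.

    def load_report(name):
        return f"report: {name}"

    def dispatch(kind, name):
        # "load_" + kind only becomes the identifier load_report when eval() runs,
        # so an AST-based rename of load_report silently breaks this call site.
        return eval("load_" + kind)(name)

    print(dispatch("report", "Q3"))  # -> report: Q3

A human reading this knows dispatch("report", ...) is meant to hit load_report, and an LLM usually gets that too; a purely syntactic tool can't prove it.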
People are way too quick to defend LLMs here, because this criticism is exactly on point.
In an era where an LLM can hallucinate (present you with a defect) with 100% conviction, and vibe coders can ship code of completely unknown quality with 100% conviction, the bar has by definition been set lower.
Someone with experience will still bring something more than just LLM-written code to the table, and that bar will stay where it is. The people who don't have experience won't even feel the shortcomings of AI because they won't know what it's getting wrong.
> In an era where an LLM can hallucinate (present you with a defect) with 100% conviction, (...)
I think you're trying very hard to find anything at all to criticize about LLMs and those who use them, but all you manage to come up with are outlandish, "grasping at straws" arguments.
Yes, it's conceivable that LLMs can hallucinate. How often do they, though? In my experience, not that much. In the rare cases they do, it's easy to spot, and another iteration costs you a couple of seconds to get around it.
So, what are you complaining about, actually? Are you complaining about LLMs, or just letting the world know how competent you are at using LLMs?
> Someone with experience will still bring something more than just LLM-written code to the table, and that bar will stay where it is.
Someone with experience leverages LLMs to do the drudge work, and bump up their productivity.
I'm not sure you fully grasp the implications. You're rehashing the kind of short-sighted comments that in the past produced comically clueless assertions such as "the kids don't know assembly, so how can they write good programs". In the process, you are failing to understand that the way software is written has already changed completely. The job of a developer is no longer typing code away and googling for code references. Now we can refactor and rewrite entire modules, iterate over the design, try a few alternative approaches, pit them against each other, and pick the one we prefer to post as a PR. And then go to lunch. With these tools, some of your "experienced" developers turn out to be not that great, whereas "inexperienced" ones easily outperform them. How do you deal with that?
There have always been lots of code generation tools whose output people were expected to review and fix.
Anyway, code generation tools are almost always born unreliable, then improve piecewise into almost reliable, and finally get replaced by something with a mature and robust architecture that is actually reliable. I can't imagine how LLMs could traverse this path, but I don't think it's an extraordinary idea.
My compiler doesn’t write a complete function to visualize a DataFrame based on a vague prompt. It also doesn’t revise that function as I refine the requirements. LLMs can.
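Concretely, the kind of thing I mean is roughly this (a minimal sketch, assuming pandas and matplotlib; the function name and plotting choices are just illustrative):

    import pandas as pd
    import matplotlib.pyplot as plt

    def visualize_dataframe(df: pd.DataFrame) -> None:
        """Plot a histogram for every numeric column in df."""
        numeric = df.select_dtypes(include="number")
        n = len(numeric.columns)
        if n == 0:
            print("No numeric columns to plot.")
            return
        fig, axes = plt.subplots(1, n, figsize=(4 * n, 3), squeeze=False)
        for ax, col in zip(axes[0], numeric.columns):
            ax.hist(numeric[col].dropna(), bins=20)
            ax.set_title(col)
        plt.tight_layout()
        plt.show()

    # e.g. visualize_dataframe(pd.DataFrame({"a": [1, 2, 2, 3], "b": ["x", "y", "x", "y"]}))

The point isn't that this is hard to write by hand; it's that the model produces it from a one-line description and then happily revises it when I change my mind about bins, labels, or layout.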
There's definitely hype out there, but dismissing all AI use as "koolaid" is as lazy as the Medium posts you're criticizing. It's not perfect tech, but some of us are integrating it into real production workflows and seeing tangible gains: more code shipped, less fatigue, same standards. If that's a "low bar," maybe your expectations have shifted.
> What other tools in software sometimes work and sometimes don't that you find remotely acceptable?
Searching for relevant info on the Internet can take several attempts, and occasionally I end up not finding anything useful.
My IDE's IntelliSense tries to guess which identifier I want and put it at the top of the list; sometimes it guesses wrong.
I've heard that the various package repositories will sometimes deliberately refuse to work for a while because of some nonsense called "rate limiting".
Cloud deployments can fail due to resource availability.
> Yes, but these lax expectations are what I don't understand.
It's a really, really simple concept.
If I have a crazy TypeScript error, for instance, I can throw it in and get a much better idea of what's happening. Just because that's not perfect doesn't mean it isn't helpful. Even if it works 90% of the time, it's still better than 0% of the time (which is where I was before).
It's like Google search without ads and with the ability to compose different resources together. If that's not useful to you, then I don't know what to tell you.
Hell, AI is probably -1x for me, because I refuse to give up on getting the robots to do it and just do it myself. I mean, writing code is for the monkeys, right?
Anyhoo... I find there are times where you really have to get in there and question the robot's assumptions, as it will keep making the same mistake over and over until you truly understand what it is actually trying to accomplish. A lot of the time the desired goal and its goal are different enough to cause extreme frustration, since one tends to think the robot's goal should perfectly align with the prompt. Once it fails a couple of times, the interrogation begins, since we're obviously not making any further progress.
Case in point: I have this "Operational Semantics" document, which is correct, and a PEG VM, which is tested to be correct, but when you combine the two, one of the operators was being compiled incorrectly due to the way backtracking works in the VM. After Claude's many failed attempts, we had a long discussion and finally tracked the problem down to something outside of its creative boundaries; it needed one of those "why don't you do it this way..." moments. Sure, I shouldn't have to do this, but that's the reality of the tools, and, like they say, "a good craftsman never blames his tools".