Sorry, but isn’t this a bit ridiculous? Who just allows the AI to add code without reviewing it? And who just allows that code to be merged into a main branch without reviewing the PR?
They start out talking about how scary and pernicious this is, and then it turns out to be… adding a script tag to an html file? Come on, as if you wouldn’t spot that immediately?
What I’m actually curious about now is - if I saw that, and I asked the LLM why it added the JavaScript file, what would it tell me? Would I be able to deduce the hidden instructions in the rules file?
There are people who do both all the time, commit blind and merge blind. Reasonable organizations have safeguards that try and block this, but it still happens. If something like this gets buried in a large diff and the reviewer doesn't have time, care, or etc, I can easily see it getting through.
The script tag is just a PoC of the capability. The attack vector could obviously be used to "convince" the LLM to do something much more subtle to undermine security, such as recommending code that's vulnerable to SQL injections or that uses weaker cryptographic primitives etc.
Of course, but this doesn’t undermined the OPs point of „who allows the AI to do stuff without reviewing it“. Even WITHOUT the „vulnerability“ )if we call it that), AI may always create code that may be vulnerable in some way. The vulnerability certainly increases the risk a lot and hence is a risk and also should be addressed in text files showing all characters, but AI code always needs to be reviewed - just as human code.
The point is this: vulnerable code often makes it to production, despite the best intentions of virtually all people writing and reviewing the code. If you add a malicious actor standing on the shoulder of the developers suggesting code to them, it is virtually certain that you will increase the amount of vulnerable and/or malicious code that makes it into production, statistically speaking. Sure, you have methods to catch much of these. But as long as your filters aren't 100% effective (and no one's filters are 100% effective), then the more garbage you push through them, the more garbage you'll get out.
the OPs point about who allows the AI to do stuff without reviewing it is undermined by reality in multiple ways
1. a dev may be using AI and nobody knows, and they are trusted more than AI, thus their code does not get as good a review as AI code would.
2. People review code all the time and subtle bugs creep in. It is not a defense against bugs creeping in that people review code. If it were there would be no bugs in organizations that review code.
3. people may not review or look only for a second based on it's a small ticket. They just changed dependencies!
more examples left up to reader's imagination.
Way too many "coders" now do that. I put the quotes because I automatically lose respect over any vibe coder.
This is a dystopian nightmare in the making.
At some point only a very few select people will actually understand enough programming, and they will be prosecuted by the powers that be.
Oh man don’t even go there. It does happen.
AI generated code will get to production if you don’t pay people to give a fuck about it or hire people who don’t give a fuck.
It will also go in production because this is the most efficient way to produce code today
Only if you don't examine that proposition at all.
You still have to review AI generated code, and with a higher level of attention than you do most code reviews for your peer developers. That requires someone who understands programming, software design, etc.
You still have to test the code. Even if AI generates perfect code, you still need some kind of QA shop.
Basically you're paying for the same people to do similar work to what they do now, but now you also paying for an enterprise license to your LLM provider of choice.
Sure, if you don't care about quality you can put out code really fast with LLMs. But if you do care about quality, they slow you down rather than speed you up.
It depends somewhat on how tolerant your customers are of shite.
Literally all I’ve seen is stuff that I wouldn’t ship in a million years because of the potential reputational damage to our business.
And I get told a lot by people who really have no idea what they are doing clearly that it’s actually good.