This greatly opens up the risk of footguns.
The doc [1] warns about prompt injection, but I think a more likely scenario is self-inflicted harm. For instance, you give a tool access to your brokerage account to automate trading. Even without prompt injection, there's nothing preventing the bot from making stupid trades.
> This greatly opens up the risk of footguns.
Yeah, it really does.
There are so many ways things can go wrong once you start plugging tools into an LLM, especially if those tool calls are authenticated and can take actions on your behalf.
The MCP world is speed-running this right now - see the GitHub MCP story from yesterday: https://news.ycombinator.com/item?id=44097390
I stuck a big warning in the documentation and I've been careful not to release any initial tool plugins that can cause any damage - hence my QuickJS sandbox one and SQLite plugin being read-only - but it's a dangerous space to be exploring.
(Super fun and fascinating though.)
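On the read-only SQLite point: not the plugin's actual code, but a minimal Python sketch (helper name made up) of what "read-only by construction" means - the connection itself refuses writes, so even a prompt-injected DROP TABLE fails at the database layer instead of relying on the model behaving:

    import sqlite3

    def run_readonly_query(db_path, sql):
        # Hypothetical helper, not the real plugin code.
        # mode=ro opens the file read-only; any write raises sqlite3.OperationalError.
        conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
        try:
            # Second layer: reject writes at the connection level as well.
            conn.execute("PRAGMA query_only = ON")
            return conn.execute(sql).fetchall()
        finally:
            conn.close()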
If you hook an LLM up to your brokerage account, someone is being stupid, but it ain't the bot.
You think "senior leadership/boards of directors" aren't thinking of going all in with AI to "save money" and "grow faster and cheaper"?
This is absolutely going to happen at a large scale and then we'll have "cautionary tales" and a lot of "compliance" rules.
Yes, sandboxing will be crucial. On macOS it's not that hard, but there aren't good, easy-to-use tools available for it right now. Claude Code has started using Seatbelt a bit to optimize the UX.
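For anyone curious what the Seatbelt route looks like, here's a rough Python sketch. The deny-by-default profile is purely illustrative (it is not what Claude Code uses, and real profiles usually need more allow rules depending on the binary), and sandbox-exec is deprecated by Apple but still ships with macOS:

    import subprocess

    # Illustrative deny-by-default Seatbelt profile (assumption: not any tool's
    # real profile). Allows exec and file reads, blocks writes and network.
    PROFILE = """
    (version 1)
    (deny default)
    (allow process-exec)
    (allow process-fork)
    (allow file-read*)
    (allow sysctl-read)
    """

    def run_sandboxed(cmd):
        # sandbox-exec is deprecated but still present on current macOS.
        return subprocess.run(["sandbox-exec", "-p", PROFILE, *cmd],
                              capture_output=True, text=True)

    print(run_sandboxed(["ls", "/tmp"]).stdout)

The appeal is that the restriction is enforced by the OS, so it still holds when the model decides to do something you didn't expect.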
I think the whole footgun discussion misses the point. Yes, you can shoot yourself in the foot (and probably will), but not evaluating the possibilities is also a risk. Regular people tend to underestimate the footgun potential (probably driven by fear of missing out) and technical people tend to underestimate the risk of not learning the new possibilities.
Even a year ago I let LLMs execute local commands on my laptop. I think it is somewhat risky, but nothing harmful happened. You also have to consider what you are prompting. When I prompt 'find out where I am and what the weather is going to be', it is possible that the model will execute rm -rf /, but it is very unlikely.
However, speaking of letting an LLM trade stocks without understanding how it will come to a decision... too risky for my taste ;-)
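To make that trade-off concrete, here's a hypothetical guard (made up for illustration, not from any real tool): commands the model proposes run directly only if they start with a known read-only binary, and everything else waits for an explicit human "y". It doesn't make rm -rf / impossible; it just keeps a person in the loop:

    import shlex
    import subprocess

    # Hypothetical allowlist of read-only commands; everything else is gated.
    SAFE_COMMANDS = {"ls", "date", "whoami", "uname", "pwd"}

    def run_model_command(command):
        args = shlex.split(command)
        if not args:
            return ""
        if args[0] not in SAFE_COMMANDS:
            answer = input(f"Model wants to run {command!r} - allow? [y/N] ")
            if answer.strip().lower() != "y":
                return "(refused by user)"
        result = subprocess.run(args, capture_output=True, text=True, timeout=30)
        return result.stdout or result.stderr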
Any tool can be misused
This is not misuse. This is equivalent to a drill that, in some cases, drills into the hand holding it.
You're missing the point. Most tools are deployed by humans. If they do something bad, we can blame the human for using the tool badly. And we can predict when a bad choice by the human operator will lead to a bad outcome.
Letting the LLM run the tool unsupervised is another thing entirely. We do not understand the choices the machines are making. They are unpredictable and you can't root-cause their decisions.
LLM tool use is a new thing we haven't had before, which means tool misuse is a whole new class of FUBAR waiting to happen.
But why can we not hold humans responsible in the case of LLMs? You do have to go out of your way to do all of these things with an LLM. And it is a human that does it. It is humans who give it permission to act on their behalf. We can definitely hold humans responsible. The question is: are we going to?
I think intent matters.
Let's say you are making an AI-controlled radiation therapy machine. You prompt and train and eval the system very carefully, and you are quite sure it won't overdose any patients. Well, that's not really good enough; it can still screw up. But did you do anything wrong? Not really, you followed best practices and didn't make any mistakes. The LLM just sometimes kills people. You didn't intend that at all.
I make this point because this is already how these systems work today. But instead of giving you a lethal dose of radiation, it uses slurs or promotes genocide or something else. The builders of those bots didn't intend that, and in all likelihood tried very hard to prevent it. It's not very fair to blame them.
> But did you do anything wrong? Not really, you followed best practices and didn't make any mistakes. The LLM just sometimes kills people. You didn't intend that at all.
Whoever made and/or (mis)used the LLM should be liable for it, regardless of intent.
If I shot someone I did not intend to shoot, would I get away with it? No, I would not.
You did something wrong: you used a non-deterministic, impossible-to-validate process for a critical system.
What I am trying to say is that humans absolutely can be held responsible. Do you disagree?
> But did you do anything wrong?
Yes, you had a deontological blind spot that prevented you from asking, "What are some of the high-risk consequences, and what responsibility do I have for the consequences of my actions?"
Deontology's failure case is this ethically naive version that doesn't include a maxim covering the situation where people's reliance on good intentions relieves them of the mental effort of considering the consequences, which is exactly how those consequences come to exist.
One assumption worth arguing against is that the choice is framed as happening once, before the system is deployed. That oversimplification carries an unfortunate corollary: since we can't know all the consequences up front, we're sometimes surprised by unforeseen results, and so we supposedly shouldn't hold people to the standard of accurately predicting the future.
In real life, though, the choices are rarely final. Even this one, deploying the genocide-promoting LLM, is reversible. If you didn't predict that the LLM promotes genocide, and then you find out it does in fact promote genocide, you don't throw up your hands and say, "I hadn't thought of this, but my decision is final." No, armed with the information, you feel a sense of duty to minimize the consequences by shutting it down and fixing it before deploying it again.
Furthermore, you take a systems-level approach and ask, "How could we prevent this type of error in the future?" That ends with "We will consider the consequences of our actions to a greater degree in the future," or perhaps even, "We don't have enough foresight to be making these decisions and we will stop making medical devices."
The point is that distilling the essence of the situation down to "good intentions absolve responsibility," or "one drop of involvement implies total responsibility," isn't really how people think about these things in real life. It's the spherical cow of ethics - it says more about spheres than it does about cows.
> "good intentions absolve responsibility"
Yeah, I call bullshit on that. If I accidentally shot someone, I did not intend it, yet I am still liable for it.