Item 43679074

jillesvangurp • 7 days ago

Any interpreter can run malicious code. Mostly the guidance is: don't run malicious code if you don't want it to run. The problem isn't the interpreter/tool but the entity that's using it. Because that's the thing that you should be (mis)-trusting.

The issue is twofold:

- models aren't quite trustworthy yet.

- people put a lot of trust in them anyway.

This friction always exists with security. It's not a technical problem that can or should be solved on the MCP side.

Part of the solution is indeed going to come from containerization. Give MCP agents access to what they need but not more. And part of it is going to come from some common sense and the tool UX providing better transparency into what is happening. Some of the better examples I've seen of agentic tools work like you outline.

I don't worry too much about the cost. This stuff is getting useful enough that paying a chunk of what normally would go into somebody's salary actually isn't that bad of a deal. And of course cost will come down. My main worry is actually speed. I seem to spend a lot of time waiting for these tools to do their thing. I'd love this stuff to be a bit zippier.

jeswin • 7 days ago

> Give MCP agents access to what they need but not more.

My view is that you should give them (agents) a computer, with a complete but minimal Linux installation, as a VM or a container. This has given me better results, because now it can, say, fetch information from the internet, or do whatever it wants (but still in the sandbox). Of course, depending on what you're working on, you might decide that internet access is a bad idea, or that it should just see the working copy, or that it should only be allowed certain websites.
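A minimal sketch of that kind of setup, assuming Docker is the container runtime (nothing above names one). The helper `sandbox_command`, the `debian:stable-slim` image, and the resource limits are illustrative choices, not a recommended configuration. It only builds the `docker run` argument list: the working copy is the one writable mount, and networking is off unless you opt in.

```python
from pathlib import Path


def sandbox_command(workdir: str, command: list[str],
                    image: str = "debian:stable-slim",
                    allow_network: bool = False) -> list[str]:
    """Build a `docker run` argv for a locked-down agent container.

    Hypothetical helper: the image name and limits are assumptions.
    """
    argv = [
        "docker", "run", "--rm",
        "--cap-drop", "ALL",                      # drop all Linux capabilities
        "--security-opt", "no-new-privileges",    # block setuid escalation
        "--read-only",                            # root filesystem is read-only
        "--tmpfs", "/tmp",                        # scratch space that vanishes on exit
        "--pids-limit", "256",                    # cap process count (fork bombs)
        "--memory", "2g",                         # cap RAM
        "-v", f"{Path(workdir).resolve()}:/work", # only the working copy is mounted
        "-w", "/work",
    ]
    if not allow_network:
        argv += ["--network", "none"]             # no network interfaces except loopback
    return argv + [image] + command


if __name__ == "__main__":
    # Print the command instead of running it, so this works without Docker.
    print(" ".join(sandbox_command(".", ["ls", "-la"])))
```

Passing the result to `subprocess.run` would actually start the container. Allowing only certain websites is not something these flags express; that needs a filtering proxy or firewall rules in front of the container.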

2 replies

jacobr1 • 6 days ago

If you give it access to the internet ... it can basically do anything: exfil all your code, receive malicious instructions. The blast radius (presuming it doesn't get out of your sandbox) is limited to loss of whatever you put in (source code) and theft of resources (running a coinminer, hosting phishing attacks, etc.). As you say, you can limit things to trusted websites, which helps ... but even then, if you trust, say, GitHub, anyone can host malicious instructions there. The risk tradeoff (likelihood of hitting malicious instructions vs. productivity benefit) might nevertheless be worth it ... there's not too much targeted maliciousness in the wild yet. And just a bit more guardrailing and logging can go a long way.
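A rough sketch of the "trusted websites plus logging" idea, assuming the agent's fetch tool calls a check like this before every request. The names `ALLOWED_HOSTS` and `is_allowed`, and the hosts in the set, are made up for illustration.

```python
import logging
from urllib.parse import urlsplit

log = logging.getLogger("agent.egress")

# Hypothetical allowlist; exact host match only, no wildcard subdomains.
ALLOWED_HOSTS = frozenset({"pypi.org", "files.pythonhosted.org"})


def is_allowed(url: str, allowed: frozenset = ALLOWED_HOSTS) -> bool:
    """Return True only for https URLs whose host is on the allowlist.

    Every decision is logged so there is a record of what the agent tried.
    """
    try:
        parts = urlsplit(url)
        host = (parts.hostname or "").lower()
    except ValueError:
        # Malformed URL (e.g. broken IPv6 brackets): deny rather than guess.
        log.warning("egress DENY malformed url=%r", url)
        return False
    ok = parts.scheme == "https" and host in allowed
    log.info("egress %s host=%s url=%r", "ALLOW" if ok else "DENY", host, url)
    return ok
```

Matching on the parsed hostname, not a substring, means lookalikes such as `https://pypi.org.evil.example/` or `https://pypi.org@evil.example/` are denied. It does nothing about the GitHub case above: a host-level allowlist can't tell trusted content from malicious content served by the same trusted host.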

ycombinatrix • 6 days ago

>now it can say fetch information from the internet...(but still in the sandbox)

If it is talking to the internet, it is most definitely not sandboxed.

peterlada • 7 days ago

Let me give you some contrast here:

- employees are not necessarily trustworthy

- employers place a lot of trust in them anyway

1 reply

mdaniel • 6 days ago

This argument comes up a lot, similar to the "humans lie, too" line of reasoning.

The difference in your cited case is that employees are a class of legal person that is subject to the laws of the jurisdiction in which they work, along with any legal contracts they signed as a condition of their employment. So, that's a shitload of words to say "there are consequences," which isn't true of a bunch of matrix multiplications that happen to speak English and know how to invoke RPCs.