You can always narrow down a capability (get a new capability pointing to a subdirectory or file, or remove the writing capability so it is read only) but never make it more broad.
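To make the "narrow but never broaden" rule concrete, here is a minimal sketch in Python. `DirCap`, `subdir`, `read_only` and `allows` are hypothetical names made up for illustration; in a real capability system the kernel enforces this and a process cannot simply construct a fresh capability the way Python code can.

```python
from __future__ import annotations

from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class DirCap:
    """Hypothetical capability: a directory root plus a write flag."""

    root: PurePosixPath
    writable: bool

    def subdir(self, name: str) -> DirCap:
        # Narrow to a child path; refuse anything that could climb back out.
        rel = PurePosixPath(name)
        if rel.is_absolute() or ".." in rel.parts:
            raise PermissionError(f"cannot widen capability via {name!r}")
        return DirCap(self.root / rel, self.writable)

    def read_only(self) -> DirCap:
        # Dropping the write right is always allowed; there is no inverse method.
        return DirCap(self.root, False)

    def allows(self, path: str, write: bool = False) -> bool:
        p = PurePosixPath(path)
        if ".." in p.parts:
            return False
        inside = p == self.root or self.root in p.parents
        return inside and (self.writable or not write)


home = DirCap(PurePosixPath("/home/me"), writable=True)
project = home.subdir("project").read_only()
```

Here `project` can read under /home/me/project but cannot write there, and nothing on the object turns it back into `home`.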
In a system designed for this, it will be used for everything, not just the file system. You might have capabilities related to network connections, or IPC to other processes, etc. The latter is especially attractive in microkernel-based OSes. (Speaking of which, Redox OS seems to be experimenting with this; I just saw an article today about that.)
See also https://en.wikipedia.org/wiki/Capability-based_security
Admittedly, there’s a little more friction and agent confusion sometimes with this setup, but it’s worth the benefit of having zero worries about permissions and security.
Wrong layer. You want the deletion to actually be impossible from a privilege perspective, not just made practically harder for the entity that shouldn't delete something.
Claude definitely knows how to reimplement `rm`.
I have seen the AI break out of my (admittedly flimsy) guards, by simply using safepath/../../stuff or something even more convoluted like symlinks.
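For illustration, a sketch of why a string-prefix guard is flimsy, next to a check that resolves the path first. Both function names are hypothetical, not anyone's actual guard. Even the second version is still a check inside the same process (and racy if the filesystem changes between check and use), so it is no substitute for OS-level enforcement.

```python
import os


def naive_is_safe(base: str, requested: str) -> bool:
    # Flimsy guard: only inspects the string prefix.
    return requested.startswith(base)


def resolved_is_safe(base: str, requested: str) -> bool:
    # Resolve ".." segments and symlinks first, then compare whole components.
    base_real = os.path.realpath(base)
    target_real = os.path.realpath(requested)
    return os.path.commonpath([base_real, target_real]) == base_real
```

With `base` set to a directory named safepath, `naive_is_safe` accepts safepath/../stuff because the string starts with the base, while `resolved_is_safe` rejects it, along with a symlink inside safepath that points outside.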
That would make it far less useful in general.
1) more guardrails in place
2) maybe more useful error messages that would help LLMs
3) no friction with needing to get any patches upstreamed
External tool calling should still be an option ofc, but it would be great to have utilities that are usable just like what's in the training data, with more security guarantees and more useful output that makes what's going on immediately obvious.
But those are also the most damaging actions it could take. Everything on my computer is backed up, but if Claude insults my boss, that would be worse.
Oh, I'm totally not arguing for cutting off other capabilities, I like tool use and find it to be as useful as the next person!
Just that the shell tools that will see A LOT of usage should have additional guardrails added on top of them, because it's inevitable that sooner or later any given LLM will screw up and pipe the wrong thing into the wrong command. You already hear horror stories about devs whose entire machines get wiped. Not everyone has proper backups (even though they totally should)!
And when that fails for some reason, it will happily write and execute a Python script, bypassing all those custom tools.
I feel like an integration with bubblewrap, the sandboxing tech behind Flatpak, could be useful here. Have all executed commands wrapped with a BW context to prevent and constrain access.
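A rough sketch of what wrapping a command could look like. The helper name `bwrap_argv` and the particular mount layout are my assumptions, not how any existing tool does it; the flags themselves are ordinary bwrap options. The function only builds the argument list, so nothing is executed here.

```python
def bwrap_argv(project_dir: str, command: list[str]) -> list[str]:
    """Build a bwrap command line that only lets `command` write to project_dir."""
    return [
        "bwrap",
        "--ro-bind", "/", "/",                # whole filesystem visible, read-only
        "--dev", "/dev",                      # minimal /dev
        "--proc", "/proc",                    # fresh /proc for the new PID namespace
        "--tmpfs", "/tmp",                    # private scratch space
        "--bind", project_dir, project_dir,   # the one writable location
        "--unshare-all",                      # new namespaces, including network
        "--die-with-parent",                  # do not outlive the agent process
        "--chdir", project_dir,
        *command,
    ]


argv = bwrap_argv("/home/me/project", ["rm", "-rf", "/"])
```

Passing `argv` to `subprocess.run` would run the command inside the sandbox, where the read-only bind of / means the `rm` can only touch the project directory and the private /tmp.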
> These restrictions are enforced at the OS level (Seatbelt on macOS, bubblewrap on Linux), so they apply to all subprocess commands, including tools like kubectl, terraform, and npm, not just Claude’s file tools.
They look a lot like daemons to me: they're a program that you want hanging around ready to respond, and maybe acting autonomously (cron jobs are similar). You want to be able to assign any number of permissions to them, but you don't want them to have access to root or necessarily any of your personal files.
It seems like the permissions model broadly aligns with how we already handle a lot of server software (and potentially malicious people) on unix-based OSes. It is a battle-tested approach that the agent is unlikely to be able to "hack" its way out of. I mean we're not really seeing them go out onto the Internet and research new Linux CVEs.
Have them clone their own repos in their own home directory too, and let them party.
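Something like this is what I have in mind, as a sketch only. The account name "agent" and the helper name are made up, and it assumes you have already created that user and allowed yourself to sudo to it.

```python
def as_agent_user(command: list[str], user: str = "agent") -> list[str]:
    # -u picks the target account; -H points HOME at that account's home directory.
    return ["sudo", "-u", user, "-H", "--", *command]


clone_cmd = as_agent_user(["git", "clone", "https://example.com/repo.git"])
```

Running `clone_cmd` with `subprocess.run` would do the clone as the agent account, with that account's file permissions instead of yours.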
Openclaw almost gets there! It exposes a "gateway" which sure looks like a daemon to me. But then for some reason they want it to live under your user account with all your privileges and in a subfolder of your $HOME.
The entire idea of Openclaw (i.e., the core point of what distinguishes it from agents like Claude Code) is to give it access to your personal data, so it can act as your assistant.
If you only need a coding agent, Openclaw is the completely wrong tool. (As a side note, after using it for a few weeks, I'm not convinced it's the right tool for anything, but that's a different story.)
I fiddled with transferring the saved token from my keychain to the agent user keychain but it was not straightforward.
If someone knows how to get a subscription to Claude to work on another user via command line I’d love to know about it.
"Not a security mechanism. No mount isolation, no PID namespace, no credential separation. Linux documents it as not intended for sandboxing."
Escaping it is something that does not take too much effort. If you have ptrace, you can escape without privileges.
Anyway that's beside the point, which is that it doesn't have to "be malicious" to try to overcome what look like errors on its way to accomplishing the task you asked it to do.
It works well. Git rm is still allowed.
Nowadays I only run Claude in Plan mode, so it doesn’t ask me for permissions any more.
Are you confident it would still work against sophisticated prompt injection attacks that override your "strongly worded message"?
Strongly worded signs can be great for safety (actual mechanisms preventing undesirable actions from being taken are still much better), but are essentially meaningless for security.
So it’s deterministic, based upon however the script is written.
I do my best to keep off-site backups and don't worry about what I can't control.
Yes, I'm saying it's pretty much as bad as antivirus software.
> Are you sure that you haven't made some mistake in your dev box setup that would allow a hacker to compromise it?
Different category of error: Heuristically derived deterministic protection vs. protection based on a stochastic process.
> much more annoying to recover from than an accidental rm rf.
My point is that it's a different category, not that one is on average worse than the other. You don't want your security to just stand against the median attacker.
"env": { "CLAUDE_BASH_MAINTAIN_PROJECT_WORKING_DIR": "1" },
> Working directory persists across commands. Set CLAUDE_BASH_MAINTAIN_PROJECT_WORKING_DIR=1 to reset to the project directory after each command.
It reduces one problem (getting lost), but trades it for more complex commands on average, since it has to specify the full path and/or `cd &&` most of the time.
[0] https://code.claude.com/docs/en/tools-reference#bash-tool-be...