undefined

upvote

points

by e1g15 hours ago |

upvote

by atombender14 hours ago|

[-]

OP here. Sorry if this was premature. I came across it through your earlier comment on HN, started using it (as did a colleague), and we've been impressed enough with how efficient it is that I decided it deserved a post!

I've seen sandbox policy documents for agents before, but this is the first ready-to-use app I've come across.

I've only had a couple of points of friction so far:

- Files like .gitconfig and .gitignore in the home folder aren't accessible, and can't be made accessible without granting read only access to the home folder, I think?

- Process access is limited, so I can't ask Claude to run lldb or pkill or other commands that can help me debug local processes.

More fine-grained control would be really nice.

reply

upvote

by e1g13 hours ago|

[-]

Love the feedback -

For handling global rules (like ~/.gitconfig and ~/.gitignore), I keep a local policy file that whitelists my "shared globals" paths, and I tell Safehouse to include that policy by default. I just updated the README with an example that might be useful[1]. I also enabled access to ~/.gitignore by default as it's a common enough default.

For process management, there is a blurry line about how much to allow without undermining the sandboxing concept. I just added new integrations[2] to allow more process control and lldb, but I don't know this area well. You can try cloning the repo, asking your agents to tweak the rules in the repo until your use-case works, and send a PR - I'll merge it!

Alternatively, using the "custom policy" feature above, you can selectively grant broad access to your tools (you can use log monitoring to see rejections, and then add more permisions into the policy file)

[1] https://github.com/eugene1g/agent-safehouse?tab=readme-ov-fi...

[2] https://github.com/eugene1g/agent-safehouse/pull/7

reply

upvote

by atombender13 hours ago|

[-]

That is very useful. I wasn't sure if I could supply my own override list or how I would even format one, but this solves that problem!

The process control policy, that's kind of niche and should definitely not be something agents are always allowed to do, so having a shorthand flag like you added in that pull request is the right choice.

I'm sure Anthropic and the other major players will catch up and add better sandboxing eventually, but for now, this tool has been exactly what I needed — many thanks!

I also wonder if this could have be a plugin or MCP server? I was using this plugin [1] for a bit, and it appears to use a "PreToolUse" that modifies every tool invocation. The benefit here would be that you could even change the Safehouse settings inside a session, e.g. turn process control on or off.

[1] https://mksg.lu/blog/context-mode

reply

upvote

by indeyets3 hours ago|

[-]

Doesn’t that defeat the purpose? You want to control it from outside of the sandbox, not to give agent escape hatch from sandbox

reply

upvote

by atombender3 hours ago|

[-]

This would be slash commands that the agent itself wouldn't be able to do, and which would communicate with the plugin via a side channel the agent wouldn't know about. Admittedly I don't know much about the plugin interface in Claude Code, though.

reply

upvote

by bouke2 hours ago|

[-]

I've read through the agent investigation of Codex on macOS. It looks like the default sandbox is pretty limited, however it doesn't match my experience:

- I asked the agent to change my global git username, Codex asked my permission to execute `git config --global user.name "Botje"` and after I granted permission, it was able to change this global configuration.

- I asked it to list my home directory and it was able to (this time without Codex asking for permission).

reply

upvote

by TheBengaluruGuy14 hours ago|

[-]

I'm wondering if this could be adapted for openclaw. Running it in a machine that's accessible reduces friction and enables a lot of use-cases but equally hard to control/restrict it

reply

upvote

by ai_fry_ur_brain7 hours ago|

[-]

Just dont use openclaw, you dont need it.

reply

upvote

by asabla14 hours ago|

[-]

Oh woah!

I've been trying to get microsandbox to play nicely. But this is much closer to what I actually need.

I glimpsed through the site and the script. But couldn't really see any obvious gotchas.

Any you've found so far which hasn't been documented yet?

reply

upvote

by e1g14 hours ago|

[-]

Pure TUI is solid - I’ve been running all my pets inside that cage for several weeks with no issues. Auto-updates work, session renewals work, config updates work etc.

But lately I’ve been using agents to test via browsers, and starting headless browsers from the agent is flakey. I’m working on that but it’s hard to find a secure default to run Chrome.

In the repo, I have policies for running the Claude desktop app and VSCode inside the same sandbox (so can do yolo mode there too), so there is hope for sandboxing headless Chrome as well.

reply

upvote

by asabla14 hours ago|

[-]

Yee I gotcha.

Did a migration myself last week from using playwright mcp towards playwright-cli instead. Which has been playing much nicer so far. I guess you would run into the same issues you've already mentioned about running chrome headless in one of these sandboxes.

I'll for sure keep an eye out for updates.

Kudos to the project!

reply

upvote

by e1g11 hours ago|

[-]

playwright-cli works out of the box, and I just merged support for agent-browser. If you end up testing out Safehouse, and have any issues, just create an issue on GitHub, and I'll check it out. Browser usage is definitely among my use cases.

reply

upvote

by siwatanejo10 hours ago|

[-]

It's kinda funny that I, being skeptical about coding agents and their potential dangers, was interested to give your project a go because I don't trust AI.

Yet the first thing I find in your README is that to install your tool I need to trust some random server serve me an .sh file that I will execute in my computer (not sure if with sudo... but still).

Come on man, give me a tarball :)

EDIT: PS: before someone gives me the typical "but you could have malware in that tarball too!!!", well, it's easier to inspect what's inside the tarball and compare it to the sources of the repo, maybe also take a look at the CI of the repo to see if the tarball is really generated automatically from the contents of the repo ;)

reply

upvote

by e1g10 hours ago|

[-]

Fair! You don’t actually need to install anything and can just generate a text file with the security profile for sandbox-exec. You can do that online at https://agent-safehouse.dev/policy-builder.html

Alternatively, you can feed these instructions to your LLM and have it generate you a minimal policy file and a shell wrapper https://agent-safehouse.dev/llm-instructions.txt

reply

upvote

by camkego5 hours ago|

[-]

I think if the online builder could have been the whole project, that would be neat! Truly "zero-trust", what I think many HN readers want.

Anyway, thanks for building Agent Safehouse.

reply

upvote

by e1g5 hours ago|

[-]

That’s a great idea. I think I’ll restructure the entire project to be based around a collection of community managed rules, a UI generator to build a custom text file from those rules, and an LLM skill so people can evolve their policies themselves. The Bash script will remain in the background as one implementation, but shouldn’t be the only way.

reply

upvote

by oneplane10 hours ago|

[-]

That online builder is very cool, well done!

I've been trying out similar things to help internal teams to use systems and languages like Rego (for Open Policy Agent) to have a visual and more 'a la carte' experience when starting out, so they don't have to jump straight to learning all syntax and patterns for a language they might have never seen before.

reply

upvote

by e1g4 hours ago|

[-]

Thanks, Codex helped to put that together in like 20 minutes. Try feeding your agent the idea about an interactive config builder, give it the upstream URL with your condos, and see if it can whip up something for you.

reply

upvote

by chrisweekly12 minutes ago|

[-]

condos?

reply

upvote

by dummydummy12346 hours ago|

[-]

Really like the online builder!

reply

upvote

by Quiark10 hours ago|

[-]

Usually it takes less than 5 minutes to review the shell script that downloads stuff.

reply

upvote

by aa-jv2 hours ago|

[-]

Do you review every package in your package manager for back doors/trojans - or do you rely on the social circle upstream to do this work for you?

How is this any different than running some random .sh script?

The assumption is that package-manager code is reviewed - that same assumption can be applied just as equitably to wget'ed .sh files.

tl;dr - you are reviewing everything you ever run on your system, right?

reply

upvote

by quietsegfault10 hours ago|

[-]

What’s the difference between running natively and in a container, really?

reply

upvote

by cortesoft8 hours ago|

[-]

On Linux, not much. On a Mac, quite a bit.

reply

upvote

by quietsegfault8 hours ago|

[-]

Like mostly apple services such as iMessage? I’m asking honestly, not snarky! I don’t think performance is a big factor for agentic hyjinx.

reply

upvote

by scosman5 hours ago|

[-]

Apple APIs yes. But there’s also an overhead when running containers like docker on Mac (and windows). Only Linux has near-zero overhead.

reply

upvote

by quietsegfault1 hours ago|

[-]

Right, because on Mac (and windows) you’re running a VM rather than just setting up kernel namespaces. How cpu and network intensive are these pets? Or is it more of a principle thing, which I totally understand?

I prefer containerization because it gives me a repeatable environment that I know works, where on my system things can change as the os updates and applications evolve.

But I can understand the benefit of sandboxing for sure! Thank you.

reply

upvote

by sunnybeetroot6 hours ago|

[-]

Yes, anything Apple platform development

reply

upvote

by dionian9 hours ago|

[-]

i toyed around with policy builder for a few seconds, i was really impressed. great UX

reply