So isolation is correct. Forking a sandbox gives you multiple exact duplicates of isolated environments.

When your coding agent has 10 ideas for what to do, to evaluate them correctly it needs to be able to evaluate them in isolation.

If you're building a website testing agent and it's halfway through a website, with a form half filled out, a session ongoing, etc., and it realizes it wants to test 2 things in isolation, forking is the only way.

We also envision this powering the next generation of dev cycles: "AI agent, go try these 10 things and tell me which works best." The AI forks the environment 10 times, gets 10 exact copies, tries one thing in each of them, evaluates the results, then takes the best option.
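
To make that loop concrete, here is a rough sketch of the fork, try, score, pick-the-best pattern. Everything in it is hypothetical: `Sandbox`, `fork()`, and `try_all` are stand-ins invented for illustration, and the "environment" is just an in-memory dict copied with `copy.deepcopy`, not a real VM snapshot or any actual product API.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Sandbox:
    """Hypothetical stand-in for a forkable environment; its state is just a dict."""
    state: dict = field(default_factory=dict)

    def fork(self) -> "Sandbox":
        # A real fork would snapshot the whole machine; a deep copy mimics the isolation.
        return Sandbox(copy.deepcopy(self.state))


def try_all(base, ideas, apply, score):
    """Fork once per idea, apply the idea in its own copy, return (best_idea, best_fork)."""
    results = []
    for idea in ideas:
        fork = base.fork()      # each idea gets an exact, independent copy
        apply(fork, idea)       # changes here never touch `base` or sibling forks
        results.append((score(fork), idea, fork))
    # Highest score wins; on ties the earliest idea is kept.
    _, best_idea, best_fork = max(results, key=lambda r: r[0])
    return best_idea, best_fork
```

You would call it with whatever "apply" and "score" mean for your task, e.g. `try_all(base, patches, apply_patch, run_tests)`, where those callables are yours to supply. The base environment is left untouched, so a failed experiment costs nothing.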

reply
Yep, I can see this, especially when the agent is spinning up test servers/smokes and you don't want those conflicting. How do we reconcile all the potentially different git hashes though, upstream I guess, etc.? (This might be an easy answer and I'm not super proficient with git, so forgive me.)
reply
So we recommend a branch per fork, then merge what you like.

You have to change the branch on each fork individually currently, and that's unlikely to change in the short term due to the complexity of git internals, but it's not that hard to do yourself: `git checkout -b fork-{whateverDiscriminator}`
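
If you want to script that step, something like the following would do it. This is only a sketch: the helper names `fork_branch_name` and `checkout_fork_branch` are made up for illustration, and it assumes `git` is on the PATH and that `repo_dir` is the checkout inside the fork.

```python
import re
import subprocess


def fork_branch_name(discriminator: str) -> str:
    """Build a branch name like fork-<discriminator> using only ref-safe characters."""
    # Collapse anything outside letters, digits, '_' and '-' into a single dash.
    safe = re.sub(r"[^A-Za-z0-9_-]+", "-", discriminator).strip("-")
    if not safe:
        raise ValueError("discriminator has no usable characters")
    return f"fork-{safe}"


def checkout_fork_branch(repo_dir: str, discriminator: str) -> str:
    """Run `git checkout -b fork-<discriminator>` in repo_dir and return the branch name."""
    branch = fork_branch_name(discriminator)
    # check=True raises CalledProcessError if git fails (e.g. the branch already exists).
    subprocess.run(["git", "checkout", "-b", branch], cwd=repo_dir, check=True)
    return branch
```

Run it once per fork with a different discriminator (an index, an idea name, whatever you have), then merge the branches you like as usual.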

reply
Have you considered git worktree?
reply
Agreed, the thing I'd be most interested in is the isolated execution environment you mentioned. Agents running on autopilot are powerful. Agents running unsupervised on a machine with developer permissions and certificates, where anything could influence the agent to act on an attacker's behalf, is terrifying.
reply
I recommend running the agent harness outside of the computer. The mental model I like to use is the computer is a tool the agent is using, and anything in the computer is untrusted.
reply
I would recommend not giving an agent the full run of any computing environment. Do you handle fine-grained internet access controls and credential injection like OpenShell does?
reply
I used to believe this, but I think the next generation of agents is much more autonomous and just needs a computer.

The work of a developer is open ended, so we use a computer for it. We don't try to box developers into small granular screwdrivers for each small thing.

That's what's coming to all agents: they might want to run some analysis with Python, generate a website/document in TypeScript, and store data in markdown files or in MongoDB. I expect them to get much more autonomous and, with that, to end up just needing computers like us.

reply
The problem is the agent, which should be treated as untrusted. The computer isn’t the problem.
reply
Kind of. The chat logs of the agent are trustworthy, as should be any telemetry you have on it or coming out of the VM. Its behavior should be treated as probabilistic and therefore untrustworthy.
reply