(tilde.run)
I've tried to focus more on end-user use-cases in my own product positioning, even though security is absolutely at the top of my list. This was hard to watch because it felt like it demonstrated a security feature that is really secondary to the purpose of an agent.
What would be a spin in this AI category that would excite or surprise you?
SlicerVM (est. 2022) is already ready for prime time. It's not "free as in beer", but it has pretty reasonable individual plans that include all features, and it shares its core code with actuated. (Creator of both speaking here.)
Feel free to take a look and see if it gives you a little more than the others you mentioned. If not, no problem; I realise some folks prefer free stuff.
You might need it to be open source, but the majority of the world doesn't care, just as they don't care that Windows is closed source, or that AWS is a "cloud" running somewhere else. Both are building blocks that made "the world better and more robust".
But this is too vague for me. I'm not seeing my questions answered in the landing page or FAQ either.
E.g.,... what's the pricing?
How does atomic commit really work? E.g., if one write to S3 succeeds but the update to a git repo fails?
Does this use optimistic locking or something else? What happens if I commit changes to a resource that was updated since it was imported?
Where/how is it hosted?
Atomic commits are based on snapshotting done by lakeFS under the hood. Each sandbox run produces a new atomic commit to a hidden "main" branch. Updating that branch is optimistically concurrent, with lakeFS checking for conflicts - i.e. multiple writers updating the same object.
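To make the optimistic-concurrency part concrete, here's a toy sketch of the pattern described above: a writer commits against the head it started from, and the commit only lands if no later commit touched the same objects. This is not lakeFS's real API, just an illustration of the conflict check.

```python
import threading

class VersionedBranch:
    """Toy model of optimistic commits on a hidden branch.

    Each commit records which objects it changed; a new commit is
    rejected if a commit that landed after the writer's base touched
    any of the same objects. (Illustrative only, not lakeFS's API.)
    """

    def __init__(self):
        self._lock = threading.Lock()
        self._commits = [{"id": 0, "changed": set()}]  # root commit

    def head(self):
        return self._commits[-1]["id"]

    def commit(self, based_on, changed_paths):
        with self._lock:
            # Every commit that landed after the writer's base.
            newer = self._commits[based_on + 1:]
            touched = set().union(*(c["changed"] for c in newer)) if newer else set()
            if touched & set(changed_paths):
                raise RuntimeError("conflict: same object updated concurrently")
            new_id = self.head() + 1
            self._commits.append({"id": new_id, "changed": set(changed_paths)})
            return new_id

branch = VersionedBranch()
base = branch.head()
branch.commit(base, ["data/a.csv"])        # sandbox run 1: succeeds
try:
    branch.commit(base, ["data/a.csv"])    # run 2, same base + same object
except RuntimeError as e:
    print(e)                               # conflict detected
branch.commit(base, ["data/b.csv"])        # disjoint objects: succeeds
```

The key property is that writers never block each other up front; conflicts are only detected at commit time, which is what makes concurrent sandbox runs cheap when they touch disjoint objects.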
Other than that it looks cool!
Imagine an agent dropping a directory with 1m images in it. Just figuring out what happened and what got dropped, restoring it one by one, etc. - doable, but the ergonomics are a bit lacking.
I had to create my own setup using an AWS S3 filesystem and Docker for this.
Does Tilde solve for this?
That is a single one-liner of `btrfs subvolume snapshot` in a single hook configuration file, ready to be valued at $10B as a quantum agentic versioned sandbox startup.
I have a use case that could use this if it supports handling branching and merging file systems.
From a data structure and file ergonomics perspective, think of it as similar to Unity or UE4 for drug design. We have a huge variety of assets to manage alongside their relationships to each other, and the project files are local on the user's machine (with a collaboration / sync over the network between scientists working on the same project, hence where something like this would come in for us).
Many of those files are fine with a winning-side strategy, but some of them might not be that clean. Take a protein structure defined by an `mmcif` file, for example: if we clean the file by removing hydrogen atoms and another scientist repairs a side chain in that same file, then we'd need a way to reconcile those differences.
On the agent side, our agents will generate small python scripts that manipulate the proteins, then cache and re-use those scripts as tools when possible. So preserving those scripts alongside the mutated asset and conversation history is something we've been working on.
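The "cache and re-use generated scripts as tools" idea can be sketched simply: key each generated script by a hash of its source so identical generations are reused instead of re-created. All names below (the cache directory, the helper, the example script) are hypothetical, not the commenter's actual system.

```python
import hashlib
from pathlib import Path

CACHE_DIR = Path("tool_cache")  # hypothetical cache location

def cache_tool(script_source: str) -> Path:
    """Store a generated script once, keyed by content hash."""
    CACHE_DIR.mkdir(exist_ok=True)
    key = hashlib.sha256(script_source.encode()).hexdigest()[:16]
    path = CACHE_DIR / f"tool_{key}.py"
    if not path.exists():          # reuse a previously cached script
        path.write_text(script_source)
    return path

# A made-up example of the kind of small script an agent might generate:
script = (
    "def strip_hydrogens(atoms):\n"
    "    return [a for a in atoms if a['element'] != 'H']\n"
)
first = cache_tool(script)
second = cache_tool(script)        # same content -> same cached tool
print(first == second)             # True
```

Content-hashing also pairs naturally with the versioned-filesystem angle: the cached script, the mutated asset, and the conversation history can all be committed together as one snapshot.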
Building something for the same problem but more so from the perspective of self-hostable stateful sandboxes, and not just the filesystem (see https://bhatti.sh). What sandbox solution are you using here?
If you look at https://slicervm.com you'll see he's copied our terminal animation from the top of the website. He took out a monthly subscription for one month, then cloned the majority of the UX/DX and the way the guest agent works.
Had people reach out and flag it to me and I'm like "yes there's a reason for that"..
I think this is just par for the course in an AI slop world. Nothing to stop people imitating, copying, cloning with a good prompt and partial source / detailed docs available.
If I understand correctly, what Tilde is doing is extending the sandbox concept in an operating system from the filesystem to data too.
So this is a sandbox environment someone would use for data heavy agentic workloads, is this correct?
Agents are really good at interacting with files and directories (text in, text out!). This adds a layer for those that allows managing that state in a transactional, versioned way.
This is metadata only as the objects themselves are immutable.
Before I invest my time into something like this I'll need to know what it'll end up costing in the end. Perhaps it's just that "private previews" aren't for me. Good luck!
Not sure what else we can do in this world other than tightly control outbound requests and provide enough visibility into those requests for a human|agent to try and undo changes.
Happy to hear your thoughts - where would you like to see us take this?
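"Tightly control outbound requests and provide visibility into them" can be sketched as an allowlist gate plus an audit log a human or agent can review afterwards. The hosts and policy below are made-up examples, not anyone's actual configuration.

```python
import datetime
from urllib.parse import urlparse

ALLOWED_HOSTS = {"api.github.com", "pypi.org"}  # example policy
audit_log = []                                  # visibility for review/undo

def check_egress(url: str) -> bool:
    """Allow the request only if its host is on the allowlist,
    and record every decision either way."""
    host = urlparse(url).hostname or ""
    allowed = host in ALLOWED_HOSTS
    audit_log.append({
        "time": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "host": host,
        "allowed": allowed,
    })
    return allowed

print(check_egress("https://pypi.org/simple/requests/"))   # True
print(check_egress("https://evil.example.com/exfil"))      # False
```

In a real sandbox this check would sit in a network proxy rather than application code, but the shape is the same: deny by default, allow by exception, and log everything so changes can be traced and undone.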
I'd love to learn more about how egress can be handled securely in sandboxes, and in general also ingress, as that has security impact too - as soon as you allow reading from an external system you open up a new threat vector. Curious to understand whether you have any strategy for network access?
I increasingly feel the impact of landing on the frontpage of HN is not as pronounced as it used to be. The demographic shift of HN is also noticeable; it has a lot more of a "reddit" vibe than I remember.
Now I see these things and it's more likely than not that they were spat out by an agentic tool with little to no understanding of the code, and hardly any learning or effort took place. It feels cheap and a waste of time. Why spend my time looking at something someone made in a few hours so they could pad their fake portfolio?
It's great to find real development out there, but these types of posts, e.g. "Show: random agentic tool gibberish", feel cheap and flaccid now. Nothing impressive.
GCs, blogs, and small chatrooms are the way.
Perhaps the biggest hit is to trust: now people will just jump to conclusions and say your comment is AI, and overall the presence I used to feel from before the AI days is not there.
It's no longer rewarding, and ironically I've started to engage a lot less and seek human connections offline, so perhaps there is an upside.
I also see a lot of people cutting back on Instagram and social media use. AI appears to be slowly driving people off the internet and towards analog, real human connection, but it's very subtle and too little to celebrate.
I think it was bound to happen. The open internet is like public infrastructure with no janitor. People rant on it, people lie on it, people push zealous activism on it, people send bots onto it. The amount of work it would take to effectively moderate this stuff wouldn't make it economically viable to run any site. You'd need a full time staff just to police this stuff.
Small groups are small enough to be moderated by everyone in the group. It might feel sad (it certainly feels sad to me), but I think we should realize we just happened to be on the internet in a weird moment when getting onto it required a high bar that happened to align with norms of good discussion. I'm struggling with this transition (because it's hard as an adult to find new places to socialize), but I need to wean myself off this site because it's obvious the quality has dipped too low to get much out of it.
If it's not a local sandbox, I'm not interested.
We've got enough subscription lock-in from LLMs already.
That said, using lakeFS is probably a better long-term solution, and I like this approach.
Even if some tool makes it impossible for an AI agent to delete things in a way that isn't recoverable, there are other risks such as data exfiltration that need to be managed separately.
1. have a human in the loop to approve certain changes
2. roll back changes that end up being incorrect
3. allow reviewing the timeline and history to figure out what changed and how
The problem is you start getting comfortable, and tired of your workflow getting interrupted when the agent needs more/repeated access. Gradually the permission scope increases, or you decide to take the guards off completely. At this point you have a non-deterministic black box with internet access doing things to your computer. Maybe the agent gets confused and force-pushes to git, maybe you load a malicious plugin, or an MCP connection to GitHub ingests something hostile. The internet isn't getting kinder; it's basically all-out war behind the scenes, and having your agent do online research is an attack vector. Security is layered, and sandboxing is a layer you can add to mitigate some issues and have peace of mind.
TBH I didn't look too closely at the featured product because I have my own solution already, but it sounds like a versioning filesystem is integrated, which can be really handy. Filesystem snapshots are fast and cheap compared to traditional backup/restore operations. Git is a nice layer for text files, but it's slow and not very good with binary stuff, so if you're working with images or 3D models etc., a versioning FS is really useful.
There are lots of agent use cases beyond individual coding. Maybe you're building a multi-tenant product that lets user agents do stuff and you need an undo feature. That's probably a good case for a sandbox with a versioning FS. Maybe you have an agent handling contractual transactions that can't afford an oops. LLM agents are an entirely new computing interface, so we should imagine a wide variety of use cases, some of which would likely benefit from a sandbox environment that versions data.
Version control and isolation will probably stay useful, though, more for distributed development and workflow reasons than for safety.
Glad to see more takes in this space.
What would I use it for, and why?
It reminds me of blockchain - a solution desperately looking for a problem. What problem does this solve?
Not sure how I feel about using your hosted service, while your home page is asking me for analytics data and only the CLI and SDK are open source.
I get providing a hosted service, but I don't understand how it makes things easier for agents to consume unless you're hosting an MCP server? My understanding is that an agent skill and a CLI tool are all an agent needs?