This is a really interesting idea. I wonder if something like Luau would be a good solution here - it's a typed version of Lua meant for sandboxing (built for Roblox scripting) that has a lot of guardrails on it.

https://luau.org/

by simonw 8 hours ago

Being unchanged for decades means that the training data should provide great results even for the smaller models.

by fragmede 1 hour ago

It also means there are plenty of bad examples in the training data to learn the wrong lessons from, though.

by resonious 3 hours ago

I'll add that agents (CC/Codex) very often screw up escaping/quoting in their bash scripts and waste tokens figuring out what happened. It's worse when it's a script they save and reuse, because it's often a code injection vulnerability.

by fragmede 1 hour ago

I want them to be better at it, but given how hard it is for me as a human to get it right (which is to say, I get it wrong a lot, especially handling newlines in filenames, or filenames that start with --), I find it hard to fault them too much.
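To make the two failure cases above concrete, here is a minimal Python sketch, assuming a POSIX system with `cat` on the PATH. The function names and file names are hypothetical. It shows one common way around the problem: pass the name as its own argv entry so no shell quoting is involved, and put `--` before it so a leading dash is not parsed as an option. The second function covers the case where a shell string cannot be avoided.

```python
import os
import shlex
import subprocess
import tempfile


def cat_file(directory: str, name: str) -> str:
    """Read a file with `cat`, passing the name as a single argv entry.

    No shell is involved, so spaces and newlines need no quoting, and the
    `--` marker stops option parsing so a name like `--help` is an operand.
    """
    result = subprocess.run(
        ["cat", "--", name],
        cwd=directory,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def cat_file_via_shell(directory: str, name: str) -> str:
    """Same operation through a shell string, for when one is unavoidable.

    shlex.quote turns the name into one shell word; `--` is still needed.
    """
    command = "cat -- " + shlex.quote(name)
    result = subprocess.run(
        command,
        shell=True,
        cwd=directory,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Hypothetical awkward names: one looks like an option, one has a newline.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("--help", "new\nline.txt"):
            with open(os.path.join(tmp, name), "w") as handle:
                handle.write("contents of " + repr(name))
            print(cat_file(tmp, name))
            print(cat_file_via_shell(tmp, name))
```

The argv-list form is usually the easier one to get right, since there is no quoting step to forget.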

by JohnMakin 7 hours ago

They use bash in ways a human never would, and it seems very intuitive for them.

by Spivak 5 hours ago

If you present most LLMs with a run_python tool, they won't realize they can access a standard Linux userspace with it, even if that's explicitly spelled out. But give them spiritually the same tool under the name run_shell and they'll use it correctly.

Gotta work with what's in the training data I suppose.
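As a rough illustration of why the two tools are "spiritually the same", here is a sketch in Python. Both `run_shell` and `run_python` are hypothetical tool implementations, not any real agent's API, and the sketch assumes a POSIX system with `sh` and `echo` available. The point is only that a Python tool can reach the same userspace programs through `subprocess`.

```python
import subprocess
import sys


def run_shell(command: str) -> str:
    """Hypothetical agent tool: run a command line with sh."""
    done = subprocess.run(
        ["sh", "-c", command], capture_output=True, text=True, check=True
    )
    return done.stdout


def run_python(code: str) -> str:
    """Hypothetical agent tool: run a snippet in a fresh Python interpreter."""
    done = subprocess.run(
        [sys.executable, "-c", code], capture_output=True, text=True, check=True
    )
    return done.stdout


# The same userspace program, reached through either tool.
VIA_SHELL = "echo hello"
VIA_PYTHON = (
    "import subprocess\n"
    "done = subprocess.run(['echo', 'hello'], capture_output=True, text=True)\n"
    "print(done.stdout, end='')"
)

if __name__ == "__main__":
    print(run_shell(VIA_SHELL))
    print(run_python(VIA_PYTHON))
```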

by 0x457 4 hours ago

There are a lot of shell scripts out there holding this world together.

by wild_egg 8 hours ago

Agents really do not care at all how "nice" a language is. You only need to be picky about the language if a human is going to be working with the code. I get the impression that is not the use case here, though.

by bigbadfeline 4 hours ago

> Agents really do not care at all how "nice" a language is.

People do care.

> You only need to be picky with language if a human is going to be working with the code.

Sooner or later humans will have to work with the code - if only for their own self-preservation.

> I get the impression that is not the use case here though

If that's not the use case, there's no legitimate use case at all.

by fragmede 40 minutes ago

> Sooner or later humans will have to work with the code

We want that to be true, but it's starting to look like it might not be.

by Bolwin 7 hours ago

I've had LLMs write some pretty complex PowerShell on the fly. It's still a shell language, but a lot nicer.

Ideally it would be something like Nushell, but they don't know that as well.

by inetknght 8 hours ago

Bash is ubiquitous and is not going away any time soon. Nothing is stopping you from doing the same thing with your favorite language.

by andrewingram 7 hours ago

just-bash comes with Python installed, so in a way that's what this has done. I've used this for some prototypes with AI tools (via bash-tool), can't really productionise it in our current setup, but it worked very well and was undeniably pretty cool.

by Leynos 3 hours ago

Codex has a JS REPL built in now. And pydantic have a minimal version of Python called Monty.

by sheept 8 hours ago

I feel like Deno would be perfect for this, because it already has a permissions model enforced by the runtime.

by tosh 7 hours ago

At least for me, Codex seems to write way more Python than bash for general-purpose stuff.

by jauntywundrkind 4 hours ago

Agreed! Preferring Python for scripting purposes is very notable Codex behavior.

I keep telling myself to write a good zx skill or agents.md. I really like zx's ergonomics, and its output when it shells out is friendly.

Top comments are Lua. I respect it, and those look like neat tools. But please, it's not what I want to look at. It would be interesting to see how Lua fares for scripting purposes, though; I haven't done enough I/O in it to know what that would look like. Does it assume some uv wrapper too?

by pbowyer 4 hours ago

I came across a coding harness using Lua as its control plane yesterday: https://github.com/hsaliak/std_slop/blob/main/docs/lua_integ...

> std::slop is a persistent, SQLite-driven C++ CLI agent. It remembers your work through per-session ledgers, providing long-term recall, structured state management. std::slop features built-in Git integration. It's goal is to be an agent for which the context and its use fully transparent and configurable.

by westurner 3 hours ago

TIL about Monty. A number of people have tried to sandbox Python using Python in user space, but ultimately they've all concluded that you can't sandbox Python with Python.
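One well-known reason in-process sandboxing fails can be sketched in a few lines of Python. This is an illustration of the general problem only, not a description of how Monty or any other project works, and `run_restricted` is a hypothetical name. Removing the builtins blocks direct calls such as `open`, but object introspection still reaches every loaded class.

```python
def run_restricted(source: str) -> dict:
    """Naive in-process 'sandbox': execute code with no builtins available."""
    scope = {"__builtins__": {}}
    exec(source, scope)
    return scope


# The untrusted code never names a builtin, yet it walks from a tuple
# literal up to `object` and lists every class loaded in the interpreter.
UNTRUSTED = "classes = ().__class__.__base__.__subclasses__()"

if __name__ == "__main__":
    try:
        run_restricted("open('/etc/passwd')")
    except NameError as error:
        print("direct builtin access is blocked:", error)

    leaked = run_restricted(UNTRUSTED)["classes"]
    print("classes still reachable:", len(leaked))
```

From that list of classes an attacker can usually find a path back to file or process access, which is why the isolation boundary tends to be moved outside the interpreter.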

Virtual machines are a better workload isolation boundary than containers, which in turn are a better workload isolation boundary than bubblewrap or a WASM runtime.

eWASM has costed opcodes; https://news.ycombinator.com/item?id=46825763

From "Show HN: CSL-Core – Formally Verified Neuro-Symbolic Safety Engine for AI" (2026) https://news.ycombinator.com/item?id=46963924 :

> Should a (formally verified) policy engine run within the same WASM runtime, or should it be enforced by the WASM runtime, or by the VM or Container that the WASM runtime runs within?

> "Show HN: Amla Sandbox – WASM bash shell sandbox for AI agents" (2026) https://news.ycombinator.com/item?id=46825026 re: eWASM and costed opcodes for agent efficiency

> How do these userspace policies compare to MAC and DAC implementations like SELinux AVC, AppArmor, Systemd SyscallFilter, and seccomp with containers for example?

> [ containers/bubblewrap#sandboxing , cloudflare/workerd, wasmtime-mte, ]

"Microsandbox: Virtual Machines that feel and perform like containers" https://news.ycombinator.com/item?id=44137501

microsandbox/microsandbox: https://github.com/microsandbox/microsandbox :

> opensource self-hosted sandboxes for ai agents
