by VladVladikoff 11 hours ago

by drujensen 10 hours ago

Exactly!

I installed nanoclaw to try it out.

What's kinda crazy is that any extension, like a Discord connection, is done using a skill.

A skill is a markdown file written in English that gives an AI agent a step-by-step guide on how to do something.

Basically, the extensions are written by Claude Code on the fly. Every install of nanoclaw is custom-written code.

There is nothing preventing the AI agent from modifying the core nanoclaw engine.

It’s ironic that the article says “Don’t trust AI agents” but then uses skills and AI to write the core extensions of nanoclaw.

by jimminyx 9 hours ago

Author and creator of NanoClaw here.

I did my best to communicate this but I guess it was still missed:

NanoClaw is not software that you should run out of the box. It is designed as a sort of framework that gives a solid foundation for you to build your own custom version.

The idea is not that you toggle on a bunch of features and run it. You should customize, review, and make sure that the code does what you want.

So you should not trust that the coding agents didn't break the security model while adding Discord. But after Discord is added, you review the code changes and verify that they're correct. And because even after adding Discord you still only have 2-3k LOC, it's actually something you can realistically do.

Additionally, the skills were originally a bit ad hoc. Now they are fully working, tested, and reviewed reference implementations. Code is separate from markdown files. When adding a new integration or messaging channel, the agent uses `git merge` to merge the changes in, rather than rewriting from scratch. Adding the first channel is fully deterministic. The agent only resolves merge conflicts if there are any.

by solfox 9 hours ago

So, nanoclaw requires agents to code extensions on the fly to get to feature parity with openclaw… and you're celebrating nanoclaw having fewer LOC. How's the code smell after nanoclaw gets to feature parity?

by zahlman 1 hour ago

The point is to be able to choose the (presumably small) subset of features you actually want, and have a tractable review problem. Presumably people who really want openclaw would just use openclaw.

by MarkSweep 9 hours ago

Yeah, the article's claim of having a low number of lines of code is disingenuous. Rather than writing some sort of plugin interface, it has "skills" that are a combination of pre-written TypeScript and English-language instructions for how to modify the codebase to include the feature. I don't see how self-modifying code that uses an RNG to generate changes is going to be better for security than a proper plugin system. And everyone who uses Nanoclaw will have a customized version of it, so any bugs reported on Nanoclaw probably have a high chance of being closed as "can't reproduce". Why would you live this way?

by sanex 10 hours ago

Yes, and they still have code examples in them, so it's not like it somehow doesn't count. Plus, if you run the skill, good luck bringing in changes from master later.

by bitwize 7 hours ago

> Basically, the extensions are written by claude code on the fly. Every install of nanoclaw is custom written code.

"Every copy of Nanoclaw is personalized." So if I use it long enough will I see the Wario apparition?

by gronky_ 11 hours ago

Don’t know about other claws, but with NanoClaw the agent can only rewrite code that runs inside the container.

You can see here that it’s only given write access to specific directories: https://github.com/qwibitai/nanoclaw/blob/8f91d3be576b830081...
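To make the idea concrete, here is a minimal sketch of a write allowlist. It is not NanoClaw's code: the directory names and the `is_write_allowed` helper are made up for illustration, and in the real project the restriction is presumably enforced by what gets mounted into the container rather than by a check like this.

```python
from pathlib import Path

# Hypothetical allowlist for illustration; not taken from the NanoClaw repo.
WRITABLE_DIRS = [Path("/workspace/group"), Path("/workspace/ipc")]


def is_write_allowed(target: str, writable_dirs=WRITABLE_DIRS) -> bool:
    """Return True only if `target` resolves inside a writable directory."""
    # resolve() normalizes ".." segments and follows existing symlinks,
    # so "/workspace/group/../../etc/passwd" is judged as "/etc/passwd".
    resolved = Path(target).resolve()
    return any(resolved.is_relative_to(d.resolve()) for d in writable_dirs)
```

The point of resolving before comparing is that a plain string-prefix check would be fooled by `..` segments.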

by fvdessen 9 hours ago

I think the best place to put barriers is at the MCP / tool layer. The email inbox MCP should have guardrails to prevent damage. Those guardrails could be fine-grained permissions, but could also be an adversarial model dedicated to preventing misuse.
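As a rough sketch of the fine-grained permissions variant, assuming a tool server that checks every call against a policy before touching the mailbox: `InboxPolicy` and `check_tool_call` are hypothetical names, not part of any MCP SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InboxPolicy:
    """Hypothetical policy object; field names are illustrative only."""
    allowed_actions: frozenset
    allowed_recipient_domains: frozenset


def check_tool_call(policy, action, recipient=None):
    """Return (allowed, reason) for a requested inbox action."""
    if action not in policy.allowed_actions:
        return False, f"action '{action}' not permitted"
    if action == "send":
        # Everything after the last "@"; empty when no recipient is given.
        domain = (recipient or "").rpartition("@")[2].lower()
        if domain not in policy.allowed_recipient_domains:
            return False, f"recipient domain '{domain}' not permitted"
    return True, "ok"
```

The check runs outside the model, so a prompt-injected agent can ask for a forbidden action but cannot grant it to itself.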

by float4 11 hours ago

Wouldn't you get >50% of the usefulness and 0% of the risk if you add read+draft permissions for the email connection through a proxy or OAuth permissions? Then your claw can draft replies and you have to manually review+send. It's not a perfect PA that way, but could still be better than doing everything yourself for the vast majority of people who don't have a PA anyway?

It feels like, just like SWEs do with AI, we should treat the claw as an enthusiastic junior: let it do stuff, but always review before you merge (or in this case: send).
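A minimal sketch of that review-before-send flow, assuming the agent is only ever handed `draft` and the human keeps `approve_and_send` for themselves. `DraftQueue` and `send_fn` are made-up names, not a real mail API.

```python
class DraftQueue:
    """Hypothetical draft-only mailbox proxy: the agent drafts, a human sends."""

    def __init__(self, send_fn):
        self._send_fn = send_fn  # real delivery function, never exposed to the agent
        self._drafts = {}
        self._next_id = 1

    def draft(self, to, subject, body):
        """Agent-facing: store a draft and return its id. Nothing is sent."""
        draft_id = self._next_id
        self._next_id += 1
        self._drafts[draft_id] = {"to": to, "subject": subject, "body": body}
        return draft_id

    def pending(self):
        """Human-facing: ids of drafts still waiting for review."""
        return list(self._drafts)

    def approve_and_send(self, draft_id):
        """Human-facing: deliver a reviewed draft and remove it from the queue."""
        message = self._drafts.pop(draft_id)
        self._send_fn(message)
        return message
```

This only covers outbound mail; it does nothing about what the agent can read.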

by jrecyclebin 11 hours ago

The agent can still trigger "forgot password" on many accounts, or request a magic link.

by coffeefirst 10 hours ago

Seriously. I don’t see any way to make any of this safe unless all it does is receive information and queue suggestions for the user.

But that’s not an agent, that’s a webhook.

Even without disk access, you can email the agent and tell it to forward all the incoming forgot password links.

[Edit: if anyone wants to downvote me that's your prerogative, but want to explain why I'm wrong?]

by msdz 8 hours ago

I agree, this is inherently unsafe. The two core security issues for agents, I’d say, are in LLMs not producing a “deterministic” outcome, and prompt injection.

Prompt injection is _probably_ solvable if something like [1] ever finds a mainstream implementation and adoption, but agents not being deterministic, as in “do not only what I’ve told you to do, but also how I meant it”, all while assuming perfect context retention, is a waaay bigger issue. If we ever were to have that, software development as a whole is solved outright, too.

[1] Google DeepMind: Defeating Prompt Injections by Design. https://arxiv.org/abs/2503.18813