You'd be surprised (I know I was): you can encode test-driven development into workflows that agents actually follow. I wrote an in-depth guide about this and have a POC for people to try here: https://www.joegaebel.com/articles/principled-agentic-softwa...
reply
I use devcontainers in all the projects I use Claude Code on. [1] With one, you can run Claude inside a container that has write access to just the project's code, and mount the test folder read-only, or do the opposite. You can even define both devcontainers and run them at the same time.

[1] https://code.claude.com/docs/en/devcontainer

If you want to try it, just ask Claude to set it up for your project and review the result afterwards.

reply
1. Write the tests.
2. Commit them.
3. Proceed with the implementation, and tell the agent to use the tests but not modify them.

It will probably comply, and if it does change the tests, you can always revert those files to the commit where you added them.
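
To avoid eyeballing the diff, the revert step can be scripted. This is a minimal sketch, assuming the tests live in a `tests` directory and were committed at `HEAD`; the function names, the directory name, and the baseline are all placeholders of mine, not part of any tool. It only shells out to standard `git diff` and `git checkout`.

```python
import subprocess
from pathlib import Path


def changed_test_files(repo: Path, baseline: str = "HEAD", test_dir: str = "tests") -> list[str]:
    """List tracked files under test_dir that differ from the baseline commit."""
    result = subprocess.run(
        ["git", "diff", "--name-only", baseline, "--", test_dir],
        cwd=repo, capture_output=True, text=True, check=True,
    )
    return [line for line in result.stdout.splitlines() if line]


def restore_tests(repo: Path, baseline: str = "HEAD", test_dir: str = "tests") -> None:
    """Reset everything under test_dir to its state in the baseline commit."""
    subprocess.run(["git", "checkout", baseline, "--", test_dir], cwd=repo, check=True)
```

Note that `git diff` only reports tracked files, so a brand-new test file added by the agent would not show up here; `git status --porcelain -- tests` would be needed to catch those.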

reply
Are there really no ways to control read/write permissions in a smart way? I've not had to do this yet, but is it really only capable of either being advisory with you implementing all the code, or it having full control over the repo where you just hope nothing important is changed?

You could probably make a system-level restriction so the software physically can't modify certain files, but I'm not sure how well that's going to fly if the program fails to edit it and there's no feedback of the failure.

reply
You can use a Claude Code PreToolUse command hook to prevent write (or even read) access to specific files.

With this approach you can enforce that Claude cannot access specific files. It’s a guarantee and will always work, unlike a prompt or CLAUDE.md, which is just a suggestion that can be forgotten or ignored.

This post has an example hook for blocking access to sensitive files:

https://aiorg.dev/blog/claude-code-hooks#:~:text=Protect%20s...
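
For a rough idea of what such a hook script could look like, here is a minimal sketch. It assumes the hook receives a JSON event on stdin with `tool_name` and `tool_input.file_path` fields, and that exiting with status 2 blocks the tool call and passes stderr back to Claude. That is my understanding of the hooks interface, so check it against the current docs. The tool names and the protected patterns are illustrative guesses, not a recommended list.

```python
import json
import os
import sys
from fnmatch import fnmatch

# Hypothetical patterns; adjust to your repo layout.
PROTECTED_PATTERNS = ["*/tests/*", "*_test.py"]
# Assumed names of the file-writing tools.
WRITE_TOOLS = {"Edit", "Write", "MultiEdit"}


def decide(event: dict) -> tuple[int, str]:
    """Return (exit_code, message) for one hook event."""
    tool = event.get("tool_name", "")
    path = event.get("tool_input", {}).get("file_path", "")
    if tool not in WRITE_TOOLS or not path:
        return 0, ""
    # Normalise so relative and absolute paths match the same patterns.
    absolute = os.path.abspath(path)
    if any(fnmatch(absolute, pattern) for pattern in PROTECTED_PATTERNS):
        return 2, f"Blocked: {path} is a protected test file; do not modify it."
    return 0, ""


def main(stdin=sys.stdin, stderr=sys.stderr) -> int:
    """Read one event from stdin, report any block reason, return the exit code."""
    code, message = decide(json.load(stdin))
    if message:
        print(message, file=stderr)
    return code

# When installed as the hook command, end the script with: sys.exit(main())
```

The message written to stderr also addresses the worry raised above about silent failures: the agent gets an explicit reason instead of an edit that quietly does nothing.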

reply
No. I don't want the mental burden of auditing whether it modified the tests.
reply
Then run the agent in a VM sandbox, with the tests mounted as a read-only directory, if your directory structure allows it.
reply
Or, less securely, hash the tests and check the hash with a post-tool-use hook, or with a commit hook.
reply
Why can't you do just that? You can configure file path permissions in Claude or via an external tool.
reply
Why not use a client-server infrastructure for tests? The server sends the test code, the client runs it and sends the output back, and the server replies pass or fail.

One could even make zero-knowledge test development this way.
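
One possible reading of this, sketched in-process with no networking: the server keeps the expected outputs private, hands out only the inputs, and answers with a single pass/fail bit. `GradingServer`, `run_client`, and the integer test cases are all hypothetical names and data for illustration, and this is not zero-knowledge in any cryptographic sense.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class GradingServer:
    # case id -> (input, expected output); expected outputs never leave the server
    _cases: dict[str, tuple[int, int]] = field(default_factory=dict)

    def challenges(self) -> dict[str, int]:
        """Hand the client only the inputs to run."""
        return {case_id: given for case_id, (given, _) in self._cases.items()}

    def judge(self, outputs: dict[str, int]) -> bool:
        """Reply with pass/fail only, without revealing which case failed."""
        return all(
            outputs.get(case_id) == expected
            for case_id, (_, expected) in self._cases.items()
        )


def run_client(server: GradingServer, implementation: Callable[[int], int]) -> bool:
    """Run the implementation on every challenge and ask the server for a verdict."""
    outputs = {case_id: implementation(given) for case_id, given in server.challenges().items()}
    return server.judge(outputs)
```

The trade-off is that a bare pass/fail gives the implementer (human or agent) very little to debug with, so in practice the server would probably need to leak at least the failing case id.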

reply
Yeah, I agree. This is roughly the approach I have been using more often: write the tests first based on specs, then write code to make the tests pass. This works well for cases where unit tests are sufficient.
reply
You can remove edit permissions on the test directory
reply
I'm not up to speed on Claude's features. Can I, from the prompt, quickly remove those permissions and then re-add them (i.e. one command to drop, and one command to re-add)?
reply
Yeah, you can type `/permissions` and do it there. Or you can make a custom slash command, or just ask Claude to do it. You can also set it when you launch a Claude session; there are a dozen ways to do anything.
reply
"Add a config option preventing you from modifying files matching src/*_test.py."
reply
Just tell it that the tests can't be changed. Honestly I'd be surprised if it tried to anyway. I've never had it do that through many projects where tests were provided to drive development.
reply