undefined

points

[-]

You'd be surprised - I know I was - you can encode Test-Driven development into workflows that agents actually follow. I wrote an in-depth guide about this and have a POC for people to try over here: https://www.joegaebel.com/articles/principled-agentic-softwa...

by simlevesque5 hours ago|

prev|

[-]

I use devcontainers in all the projects I use claude code on. [1] With it you can have claude running inside a container with just the project's code in write access and also mount a test folder with just read permissions, or do the opposite. You can even have both devcontainers and run them at the same time.

[1] https://code.claude.com/docs/en/devcontainer

If you want to try it just ask Claude to set it up for your project and review it after.

by comradesmith5 hours ago|

prev|

[-]

1. Make tests 2. Commit them 3. Proceed with implementation and tell agent to use the tests but not modify them

It will probably comply, and at least if it does change the tests you can always revert those files to where you committed them

by tavavex5 hours ago|

parent|

[-]

Are there really no ways to control read/write permissions in a smart way? I've not had to do this yet, but is it really only capable of either being advisory with you implementing all the code, or it having full control over the repo where you just hope nothing important is changed?

You could probably make a system-level restriction so the software physically can't modify certain files, but I'm not sure how well that's going to fly if the program fails to edit it and there's no feedback of the failure.

by mgrassotti5 hours ago|

parent|

[-]

You can use a Claude PreToolUse command hook to prevent write (or even read) access to specific files.

With this approach you can enforce that Claude cannot access to specific files. It’s a guarantee and will always work, unlike a prompt or Claude.md which is just a suggestion that can be forgotten or ignored.

This post has an example hook for blocking access to sensitive files:

https://aiorg.dev/blog/claude-code-hooks#:~:text=Protect%20s...

by BeetleB5 hours ago|

parent|

prev|

[-]

No. I don't want the mental burden of auditing whether it modified the tests.

by vitro5 hours ago|

parent|

[-]

Then, run the agent vm-sandboxed, with tests mounted as a read-only directory, if your directory structure allows it.

by jsw974 hours ago|

parent|

[-]

Or, less securely, hash the tests and check the hash with a hook, post tool use. Or a commit hook.

by paxys5 hours ago|

prev|

[-]

Why can't you do just that? You can configure file path permissions in Claude or via an external tool.

by pfortuny5 hours ago|

prev|

[-]

Why not use a client-server infrastructure for tests? The server sends the test code, the client runs the code, sends the output to the server and this replies pass/not pass.

One could even make zero-knowledge test development this way.

by 4 hours ago|

prev|

[-]

deleted

by aray075 hours ago|

prev|

[-]

yeah i agree - this is somewhat the approach I have been using more of. Write the tests first based on specs and then write code to make the tests pass. This works well for cases where unit tests are sufficient.

by SatvikBeri5 hours ago|

prev|

[-]

You can remove edit permissions on the test directory

by BeetleB5 hours ago|

parent|

[-]

I'm not up to speed on Claude's features. Can I, from the prompt, quickly remove those permissions and then re-add them (i.e. one command to drop, and one command to re-add)?

by SatvikBeri4 hours ago|

parent|

[-]

Yeah, you can type `/permisssions` and do it there. Or you can make a custom slash command, or just ask Claude to do it. You can also set it when you launch a claude session, there are a dozen ways to do anything.

by kubb5 hours ago|

prev|

[-]

"Add a config option preventing you from modifying files matching src/*_test.py."

by dboreham5 hours ago|

prev|

[-]

Just tell it that the tests can't be changed. Honestly I'd be surprised if it tried to anyway. I've never had it do that through many projects where tests were provided to drive development.