
I do quite a lot of what this post describes in a reasonably large project. Here's what works for me:

- write Gherkin feature files for new features; update them for enhancements; don't touch them for refactors. Label your PRs with these nouns (feature, enhancement, refactor).

- use pre-push hooks for type checks, linting, unit tests, and other quick, scriptable validations.

- make a VitePress subsite in your repo and have the agents maintain it: document important principles, architecture, etc.

- make a cli command which lists all pages along with the yaml frontmatter description so agents can choose what to read without blowing up the context window.

- use ddd and monorepo - write your logic in headless layers, and compose layers into apps. agents navigate layers very successfully.

- use zod (or your language equivalent) and contract-first API development; this is my favourite bit tbh, I use orpc

- make a single skill called "code" which describes the lifecycle: open a worktree, set up .env to guarantee no conflicts with other agents (choose unused ports etc.; docker is good here), write or update the feature file (this is where you negotiate the spec), implement, validate (e.g. using playwright mcp), run pre-push checks, push and wait for review, tear down and fast-forward main

- testcontainers is great for ensuring multiple agents can run tests that don't conflict
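
To make the "list all pages with their frontmatter description" idea concrete, here is a minimal sketch. It is not the commenter's actual CLI: the `docs` directory name and the function names are assumptions, it is written in Python for brevity (the same idea ports directly to a Node script), and it only handles a single-line `description:` value rather than full YAML.

```python
import sys
from pathlib import Path


def read_description(text: str) -> str:
    """Return the `description` value from a leading frontmatter block, or ''."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return ""  # page has no frontmatter
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of the frontmatter block
        key, sep, value = line.partition(":")
        if sep and key.strip() == "description":
            return value.strip().strip("\"'")
    return ""


def list_pages(docs_root: Path) -> list[tuple[str, str]]:
    """Return (relative path, description) for every Markdown page under docs_root."""
    pages = []
    for path in sorted(docs_root.rglob("*.md")):
        rel = path.relative_to(docs_root).as_posix()
        pages.append((rel, read_description(path.read_text(encoding="utf-8"))))
    return pages


if __name__ == "__main__":
    # Hypothetical default: the docs site lives in ./docs
    root = Path(sys.argv[1] if len(sys.argv) > 1 else "docs")
    for rel, description in list_pages(root):
        print(f"{rel}\t{description or '(no description)'}")
```

An agent can run this once, read the one-line descriptions, and open only the pages that look relevant.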

Seriously I only have one skill that's it. Everything else is in the docs. I'm feeling very productive like this, in a "making good software" sense not a LoC sense.
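
For the ".env with unused ports" step of that lifecycle, a sketch along these lines may help. The variable names (`WEB_PORT`, `DB_PORT`) and the function names are hypothetical, and there is an inherent race: a port that is free when checked can be taken before the service binds it, so treat this as a best-effort allocation rather than a guarantee.

```python
import socket
from pathlib import Path


def free_port() -> int:
    """Ask the OS for a TCP port that is currently unused on localhost."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))  # port 0 lets the OS choose
        return s.getsockname()[1]


def write_env(worktree: Path, names: list[str]) -> dict[str, int]:
    """Write one NAME=port line per service into <worktree>/.env and return the mapping."""
    ports: dict[str, int] = {}
    for name in names:
        port = free_port()
        while port in ports.values():  # never hand out the same port twice
            port = free_port()
        ports[name] = port
    lines = [f"{name}={port}" for name, port in ports.items()]
    (worktree / ".env").write_text("\n".join(lines) + "\n", encoding="utf-8")
    return ports
```

Calling `write_env(Path("."), ["WEB_PORT", "DB_PORT"])` from inside a fresh worktree gives each agent its own port set.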

by nullbio 14 hours ago

Can you share your skill please?

by pramodbiligiri 12 hours ago

I agree with many of the points made by nimonian above (esp the one starting with 'make a single skill called "code" which describes the lifecycle'), based on my limited experience with these things.

I'm building a skill + CLI tool along those lines (for solo devs not corporates). Here is what my "lifecycle" type skill looks like right now: https://github.com/bitkentech/shipsmooth/blob/releases/dist/... (warning, heavily work in progress). You can see a demo here: https://shipsmooth.net/

I was not happy with the default code quality generated by Claude Code. So I've been adding some skill-file rules to address that, and so far happy with the results: https://github.com/bitkentech/shipsmooth/tree/main/skills/ex.... There was a similar one on HN yesterday called opencodereview: https://news.ycombinator.com/item?id=48406358

There are many such workflows out there! Matt Pocock gave a good talk about how he approaches it: https://www.youtube.com/watch?v=-QFHIoCo-Ko

by rednb 13 hours ago

That's a big ask. This kind of harness usually contains plenty of proprietary insights about their business. And also, nowadays, a good harness is a major competitive advantage.

by nullbio 11 hours ago

Good thing I wasn't asking you.

Also, a skill is not a harness.

by rednb 7 hours ago

Your hostile tone is unfortunate, especially since my post was actually friendly. I was just trying to point out why the OP very likely won't give you what you're asking for, so you're not left confused if he ends up ghosting you.

Many people use the term harness to refer to the agent coding software (e.g. Opencode, Claude Code...). I use the term more broadly to refer to the environment (set of skills, system prompts, constraints, memory, hooks, etc.). What the OP is referring to is not just one giant skill. It's usually a comprehensive ecosystem of skills, bespoke tools to make certain agent tasks deterministic (e.g. localization), and so on.

I've seen someone post GitHub repos in this thread. These can be very useful, especially if you use the same tech stack, but you won't reach the level of productivity reported by successful teams unless you invest substantial time in building your own harness. The way to do that is progressively: start with something simple that addresses the need you have on day 1. Then turn recurring prompts into skills, turn recurring coding patterns and coding style recommendations into guidelines, and turn repetitive tasks, for which the LLM tends to build a Python script that it occasionally gets wrong, into a deterministic tool documented in a skill, etc.

And after a couple of days, weeks, and months, you'll have a very dependable harness giving you optimal productivity, without needing to invest weeks of work upfront or take the fun out of agent-assisted coding.
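
As an illustration of turning a repetitive task into a deterministic tool, here is a sketch of a localization check of the kind mentioned above. The layout is an assumption (one JSON file per locale, e.g. `en.json` and `fr.json`, in one directory), and the function names are hypothetical; it reports keys present in the base locale but missing elsewhere.

```python
import json
from pathlib import Path


def flatten(tree: dict, prefix: str = "") -> set[str]:
    """Collect dotted key paths, e.g. {"a": {"b": "x"}} -> {"a.b"}."""
    keys: set[str] = set()
    for key, value in tree.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            keys |= flatten(value, path + ".")
        else:
            keys.add(path)
    return keys


def load_keys(path: Path) -> set[str]:
    """Read one locale file and return its flattened key set."""
    return flatten(json.loads(path.read_text(encoding="utf-8")))


def missing_keys(locale_dir: Path, base: str = "en") -> dict[str, list[str]]:
    """Map each non-base locale to the sorted keys it lacks relative to the base."""
    base_keys = load_keys(locale_dir / f"{base}.json")
    report: dict[str, list[str]] = {}
    for path in sorted(locale_dir.glob("*.json")):
        if path.stem == base:
            continue
        missing = sorted(base_keys - load_keys(path))
        if missing:
            report[path.stem] = missing
    return report
```

Once a tool like this exists, the skill only needs to say "run the locale check and fix what it reports" instead of asking the model to re-derive the comparison each time.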

Hope this helps.

by kmetan 1 hour ago

Instead of reading articles like this one end to end, I ask AI to read them in detail and prepare a new harness for me. The important part is not to do this in a single prompt, but to first create a detailed plan and let the model think deeply about each aspect. This approach lets me build the new harness without the missing didactic information you mentioned.

Basically, I am moving from “I build products without writing or reading the code” to “I build products without writing or reading the harness.”

Once the new implementation harness is prepared, I start it, but I keep the original session open. In that original session, “we” monitor the implementation harness from the outside: how effective it is, where the bottlenecks are, what breaks down, and what could be improved. From time to time, the monitoring session suggests changes to the implementation harness. We apply those changes, restart the harness, and monitor it again.

The overall approach is not to spend X hours understanding an article like this in detail, because another similar article will appear in 3 weeks. Instead, I take immediate action, learn on the fly, and replace the harness when a better pattern emerges. And yes, I still have to spend X hours on setting up, monitoring and fine tuning the new harness, but at the end I have the latest fancy "thing" working for me.

by drivebyhooting 18 minutes ago

I love this idea! Thanks

by gildas 7 hours ago

I have an example of a side-project [1] where I think I naturally applied the best practices described in this article. My goal was to see if it's possible to code an entire project using a single agent (Claude).

To do this, every time the agent encountered an issue, I "simply" asked it how to resolve it using a validation tool or script. I also asked it to code these tools during audits. As a result, I now have more than 30 rules [2] for validating its commits. It's working pretty well now.

[1] https://github.com/gildas-lormeau/rebuild-and-ruin (let the timer expire to see the "demo" mode)

[2] https://github.com/gildas-lormeau/rebuild-and-ruin/blob/a4c3...
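
A rule set like this can be organized as a small registry of checks. The sketch below is only an illustration under assumed names and is not taken from the linked repository: both example rules, the `300`-line limit, and the `check_files` entry point are hypothetical.

```python
from typing import Callable

# A rule takes (path, text) and returns human-readable violation messages.
Rule = Callable[[str, str], list[str]]
RULES: list[Rule] = []


def rule(fn: Rule) -> Rule:
    """Register a check so that check_files() runs it."""
    RULES.append(fn)
    return fn


@rule
def no_debug_logging(path: str, text: str) -> list[str]:
    """Flag every line that still contains a console.log call."""
    return [
        f"{path}:{n}: leftover console.log"
        for n, line in enumerate(text.splitlines(), start=1)
        if "console.log(" in line
    ]


@rule
def max_file_length(path: str, text: str, limit: int = 300) -> list[str]:
    """Flag files longer than `limit` lines."""
    count = len(text.splitlines())
    return [f"{path}: {count} lines exceeds {limit}"] if count > limit else []


def check_files(files: dict[str, str]) -> list[str]:
    """Run every registered rule over every file; an empty list means the commit passes."""
    return [msg for path, text in files.items() for r in RULES for msg in r(path, text)]
```

Each time the agent hits a recurring problem, the fix becomes one more decorated function, and a hook can fail the commit whenever `check_files` returns anything.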

by tchalla 8 hours ago

A lot of these blog posts are trying to latch onto the next buzzword, "harness". It's close to the productivity-porn mindset we witnessed 10-15 years ago, where creating a complicated system is more exciting than using that system for daily tasks.

by bze1211 hours ago

I agree. I followed this article for a repo I'm working on, and I had a very hard time inferring how, specifically, they implemented "providers" and enforced import layers. A sample repo would've been nice.
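
For what it's worth, one common way to enforce import layers is a small static check. The sketch below is not the article's implementation, which the article does not show: the layer names and their order are hypothetical, it targets Python sources via the standard `ast` module, and it skips relative imports.

```python
import ast
from typing import Optional

# Hypothetical layer order, lowest first: a layer may import itself and anything below it.
LAYERS = ["domain", "services", "api"]


def layer_of(module: str) -> Optional[str]:
    """Return the layer named by the first package segment, or None for other code."""
    head = module.split(".")[0]
    return head if head in LAYERS else None


def violations(module: str, source: str) -> list[str]:
    """List imports in `source` that reach from `module`'s layer into a higher layer."""
    own = layer_of(module)
    if own is None:
        return []
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            targets = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            targets = [node.module]
        else:
            continue  # not an import, or a relative import (not checked here)
        for target in targets:
            other = layer_of(target)
            if other is not None and LAYERS.index(other) > LAYERS.index(own):
                found.append(f"{module} (layer {own}) imports {target} (layer {other})")
    return found
```

Run over every file in a pre-push hook, a check like this fails the push as soon as, say, `domain.user` imports something from `api`.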