by zbrock 5 hours ago

by e12e 1 hour ago

Interesting write up!

Have you been able to extract libraries or tools from this project yet? If so, how was that experience?

That is, do you see yourself releasing a metric harness, or sub-projects that are the equivalent of ActiveRecord, zod, or similar open source tooling that frequently originates in a large in-house project and is then exported as a stand-alone tool, utility, library or framework?

Because while AI can reimplement minor tools, its utility depends entirely on the existence of solid tools, libraries and frameworks.

by DenisM 2 hours ago

Fantastic job!

Can you share what type of project that was? Where does it fall on the spectrum from a database engine to a cat picture sharing website (very high demand for correctness vs. very lax)?

by aabdi 5 hours ago

Very cool article!

- are other teams adopting this approach? What are the blockers if not?

- have there been problems where the models alone were not enough to debug and the devs had to fix it themselves?

- as the rate of changes has increased with more devs, how have you dealt with concurrent writers and merge conflicts?

- if there was anything you could change in the approach you started with, what would it be?

by zbrock 4 hours ago

1. Yes! Many teams internally have adopted a lot of the same practices we outlined in the blog post. Ryan has also been spending time both internally and externally helping companies figure out how to do this in their code bases.

2. Hmm, kind of. There have definitely been issues the models can’t one-shot. But we still use Codex to write all the actual code with human guidance.

3. More agents :) Some teams are experimenting with centralized agent-mediated integration queues, others use normal merge queues, and many have local Codex threads that monitor CI to resolve conflicts or failures and land the changes.

4. Today’s models and the Codex app. We started doing all this with gpt-5 and codex-cli. The tools today, 9 months later, are so much better than what we had then.

by HorizonXP 4 hours ago

Have you built any tooling or products around all of this and deployed it somehow? I’d love to learn more and share notes, because we’ve been doing this too. More than 3100 PRs merged across our 4-person team in 4 months. That would have been impossible without harness engineering, and I agree, the tools are getting even better.

by pramodbiligiri 2 hours ago

Have you been satisfied with the quality of code generated by the model? Or did you have to tweak some rule file or skill to improve it? Or is human-readable code not even a goal at this point?

by s3p 4 hours ago

Were those em dashes you, or GPT?