What... Are... You... Actually doing???

I am running opus to make changes to my code then running the code. I am genuinely curious how we are having such disparate experiences here. And at this point, IMO you're in too deep not to share...

Genuinely wondering if you're running gastown or some other crazy mixture of agents pretending they're an AI startup. I get by with a developer agent and a reviewer agent ping ponging off each other encouraged to be rude, crude, and socially unacceptable about it.

by MaxikCZ 13 hours ago

Replied to sibling

by fabbbbb 14 hours ago

How?! That must be something like 10 parallel Opus 4.6 instances, all churning through lots of input and output tokens, or else very long sessions?

by MaxikCZ 13 hours ago

Actually it's just one Opus aimed at a codebase with one goal and an instruction to spawn 2 subagents: a worker that comes up with an implementation plan, and a validator that checks the proposed plan against my guardrails. Control then returns to the worker to implement the plan, and the validator again tests the implementation.
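
For readers trying to picture the loop described above, here is a minimal sketch in Python. It is not the commenter's actual setup: `run_loop`, `Verdict`, the four callables, and the `max_rounds` cutoff are all hypothetical names standing in for real subagent calls.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Verdict:
    """Hypothetical result type returned by the validator subagent."""
    ok: bool
    feedback: str = ""


def run_loop(
    goal: str,
    plan: Callable[[str, str], str],          # worker: (goal, feedback) -> plan
    validate_plan: Callable[[str], Verdict],  # validator: plan vs guardrails
    implement: Callable[[str], str],          # worker: plan -> implementation
    test_impl: Callable[[str], Verdict],      # validator: tests the result
    max_rounds: int = 3,                      # assumed cutoff, not from the thread
) -> Optional[str]:
    """Plan, validate, implement, test; feed failures back into the next plan."""
    feedback = ""
    for _ in range(max_rounds):
        proposal = plan(goal, feedback)
        verdict = validate_plan(proposal)
        if not verdict.ok:
            # Plan rejected before any code is written: the cheap correction.
            feedback = verdict.feedback
            continue
        result = implement(proposal)
        verdict = test_impl(result)
        if verdict.ok:
            return result
        feedback = verdict.feedback
    return None  # no accepted implementation within max_rounds
```

The point of the shape is that a rejected plan costs one planning call, while a rejected implementation costs a full round.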

One loop of this can take 20-60 min and eat 2-5% of my weekly limit. I have to actively slow myself down so I don't burn through more than 15-20% of my weekly limit in a day (as I also like to work on it on weekends).
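
As a rough check on those figures, the arithmetic can be sketched in Python. The only inputs are the percentages quoted above; the function names are made up for illustration.

```python
# Figures quoted in the comment, in percent of the weekly limit.
COST_PER_LOOP = (2, 5)  # one worker/validator loop
DAILY_CAP = (15, 20)    # self-imposed daily ceiling


def loops_per_day(daily_cap_pct: int, cost_per_loop_pct: int) -> int:
    """Whole loops that fit under a daily cap."""
    return daily_cap_pct // cost_per_loop_pct


def days_until_limit(daily_use_pct: float) -> float:
    """Days of steady use before 100% of the weekly limit is spent."""
    return 100 / daily_use_pct


fewest = loops_per_day(DAILY_CAP[0], COST_PER_LOOP[1])  # expensive loops, low cap
most = loops_per_day(DAILY_CAP[1], COST_PER_LOOP[0])    # cheap loops, high cap
print(f"{fewest} to {most} loops per day")
print(f"limit reached after {days_until_limit(20):.1f} to {days_until_limit(15):.1f} days")
```

So the stated cap allows somewhere between 3 and 10 loops a day, and at 20% a day the weekly limit would be gone after 5 days, which fits the remark about keeping something in reserve for weekends.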

Sadly I can't share the actual problem I am working on as it's not my secret to disclose, but it's nothing "crazy", and I am so surprised others don't have a similar experience.

by LogicFailsMe 13 hours ago

Very similar to what I am doing. How big is the codebase? My biggest was about 250K LOC and the usual is about 10K LOC. I am really curious about figuring this out because I'm genuinely puzzled.

by manquer 10 hours ago

My codebase is two monorepos, 10M+ lines. I have the same experience as you: I run 3-6 agents with remote devcontainers and tmux, rarely break 75% usage, and have never had the weekly limit stop me.

In my observation, these things impact both quality and token consumption a lot:

  - Architecture really matters: how messy the code is and how poorly things are organized makes a big difference.

  - How the context window is managed, especially now with the default 1M window.

  - How many MCP servers are used. MCP servers burn a lot of tokens; CLI tools are easier and quicker, and good ones don't even need an additional harness such as skills, just a prompt suggesting their use.

  - Using the right tool matters a lot.

  - What can be done with traditional deterministic tools has to be done that way, with the agent controlling (or even building) the tool rather than doing the tool's work with tokens.

  - For large refactors, codemod tools, AST parsers, etc. are better than having the agent parse and modify every module/file, or inefficiently navigate the codebase with grep and sed.

  - How much prep work/planning is put in before the agent starts writing code. Earlier corrections are cheaper than later ones made after code is generated.
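
To make the codemod point concrete, here is a small sketch using Python's standard `ast` module: the agent writes a script like this once, and the script does the rename deterministically across files. The `RenameCall` and `rename_calls` names and the `fetch`/`load` example are hypothetical. Note that `ast.unparse` (Python 3.9+) drops comments and original formatting, so a real refactor would more likely use a formatting-preserving tool such as LibCST.

```python
import ast


class RenameCall(ast.NodeTransformer):
    """Rename calls to a bare function name, e.g. fetch(...) -> load(...)."""

    def __init__(self, old: str, new: str) -> None:
        self.old, self.new = old, new
        self.changed = 0

    def visit_Call(self, node: ast.Call) -> ast.Call:
        self.generic_visit(node)  # handle nested calls first
        if isinstance(node.func, ast.Name) and node.func.id == self.old:
            node.func.id = self.new
            self.changed += 1
        return node


def rename_calls(source: str, old: str, new: str) -> tuple[str, int]:
    """Return the rewritten source and the number of call sites changed."""
    transformer = RenameCall(old, new)
    tree = transformer.visit(ast.parse(source))
    return ast.unparse(tree), transformer.changed


if __name__ == "__main__":
    code = 'msg = "call fetch(1)"\nx = other(fetch(2))\nfetch_all = 3\n'
    new_code, count = rename_calls(code, "fetch", "load")
    print(count)
    print(new_code)
```

Unlike a `sed` substitution, this leaves the string literal and the `fetch_all` variable alone, because only `Call` nodes whose target is the bare name `fetch` are touched.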

Typically my starting prompts for the plan phase are 1-2 pages long and take 30-60 minutes just to type into the CLI text box. From there, I first use the agent to generate detailed ADRs and documentation and to break the work down into issues in the ticketing system, then review and correct the plan several times before attempting to have a single line written.

---

It is no different from what we would do with a human; typing the lines of code was always the easy part once you know exactly what you want.

It feels faster to bypass the thinking phase and offload everything to the agent, which then either stumbles around getting low-res feedback or, worse, just wings it. Either way you are adding a lot of debt, and things slow down quickly.