It's easier when I have 10 simple problems as a part of one larger initiative/project. Think like "we had these 10 minor bugs/tweaks we wanted to make after a demo review". I can keep that straight. A bunch of agents working in parallel makes me notably faster there though actually reviewing all the output is still the bottleneck.
It's basically impossible when I'm working on multiple separate tasks that each require a lot of mental context. Two separate projects/products my team owns, two really hard technical problems, etc. This has been true before and after AI - big mental context switches are really expensive and people can't multitask despite how good we are at convincing ourselves we can.
I expect a lot of folks experience here depends heavily on how much of their work is the former vs the later. I also expect that there's a lot of feeling busy while not actually moving much faster.
Hey don’t say that too loudly, you’ll spook people.
With less snark, this is absolutely true for a lot of the use I’m seeing. It’s notably faster if you’re doing greenfield from scratch work though.
Also, I have noticed, strangely, that Claude is noticeably less compliant than GPT. If you ask a question, it will answer and then try to immediately make changes (which may not be related). If you say something isn't working, it will challenge you and it was tested (it wasn't). For a company that is seems to focus so much on ethics, they have produced an LLM that displays a clear disregard for users (perhaps that isn't a surprise). Either way, it is a very bad model for "agent swarm" style coding. I have been through this extensively but it will write bad code that doesn't work in a subtle way, it will tell that it works and that the issues relate to the way you are using the program, and then it will do the same thing five minutes later.
The tooling in this area is very good. The problem is that the AI cannot be trusted to write complex code. Imo, the future is something like Cerbaras Code that offers a speed up for single-threaded work. In most cases, I am just being lazy...I know what I want to write, I don't need the AI to do it, and I am seeing that I am faster if I just single-thread it.
Only counterpoint to this is that swarms are good for long-running admin, housekeeping, etc. Nowhere near what has been promised but not terrible.