So if anything, we should find a way to aim for as little lines of code as possible. If you have two agents, and one can build exactly the same program as another, but with half the LoC, then most likely the first agent is better at software engineering and particularly software design.
Of course, as the author of an experiment that investigated exactly this, I'm slightly biased. Cursor's browser had millions lines of code which sounded weird to me based on the features and functionality it had. Meanwhile, I built the same thing but actually thinking about the design with the agent and ended up with ~20K lines of code instead.
(To state it in AI lingo:)
It's not about the best measure for "amount of code".
It's about wether "amount of code" is a good metric to begin with.