Should we not be counting function points rather than LOC?

Lines of Code is a meaningless measure. It should also be easy to count function points using AI.

reply
I'd argue LoC isn't actually a meaningless measure, but people use it the wrong way. The same program with the same features but fewer LoC is more likely to have a proper design and architecture, and is most likely easier to change and maintain in the future. Of course, that only holds if it has fewer LoC because of proper design, not because you've folded everything onto one line.
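To make the "folded onto one line" caveat concrete, here's a minimal sketch, assuming Python source and using the number of statements in the syntax tree as a rough stand-in for size. The helper names (`count_physical_lines`, `count_statements`) are made up for illustration and don't come from any tool mentioned in this thread.

```python
import ast


def count_physical_lines(source: str) -> int:
    """Count lines that are neither blank nor pure comments."""
    return sum(
        1
        for line in source.splitlines()
        if line.strip() and not line.strip().startswith("#")
    )


def count_statements(source: str) -> int:
    """Count statement nodes in the AST, independent of line layout."""
    tree = ast.parse(source)
    return sum(isinstance(node, ast.stmt) for node in ast.walk(tree))


# The same toy program, written normally and folded onto a single line.
spread_out = "a = 1\nb = 2\nc = a + b\nprint(c)\n"
folded = "a = 1; b = 2; c = a + b; print(c)\n"

for name, src in [("spread_out", spread_out), ("folded", folded)]:
    print(name, count_physical_lines(src), count_statements(src))
```

Folding changes the physical line count but not the statement count, so a statement-based count is harder to game with formatting alone. It still says nothing about whether the design is any good.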

So if anything, we should find a way to aim for as few lines of code as possible. If you have two agents, and one can build exactly the same program as the other but with half the LoC, then most likely that agent is better at software engineering, and particularly at software design.

Of course, as the author of an experiment that investigated exactly this, I'm slightly biased. Cursor's browser had millions of lines of code, which sounded weird to me given the features and functionality it had. Meanwhile, I built the same thing while actually thinking through the design with the agent, and ended up with ~20K lines of code instead.

reply
Sure, but that's not the point being argued here.

(To state it in AI lingo:)

It's not about the best measure for "amount of code".

It's about whether "amount of code" is a good metric to begin with.

reply
I don’t think it’s solvable. And I think Anthropic etc. know it. LLMs can only reconstitute things in their training data, and they are so data-hungry that they can’t do a good job in a long-lived codebase full of complexity and novelty. There’s never going to be enough similar code on the open internet.
reply
> LLMs can only reconstitute things in their training data

Such as a 4D raytracing engine in Metal? Or integrating APIs for features first released months after their knowledge cut-off date?

LLMs have shown an ability to transfer "knowledge" and capabilities across domains, languages, and use-cases outside their training data.

Case in point: GPT-2 "learning" to translate English to French and vice versa despite non-English examples having been deliberately (and almost entirely) removed from the dataset.

reply
Was this in the GPT-2 paper?
reply
In "Language Models are Unsupervised Multitask Learners"[0]. Not sure whether it’s "the" GPT-2 paper.

3.7 Translation

> Performance on this task was surprising to us, since we deliberately removed non-English webpages from WebText as a filtering step. In order to confirm this, we ran a byte-level language detector on WebText which detected only 10MB of data in the French language […]

[0]: https://cdn.openai.com/better-language-models/language_model...

reply