That's great and all, but that's not the point I was making, and you're engaging with it rather uncharitably. Viewed from the perspective of capability increase, it's rather impressive. Note the slope of progress, which is what this experiment was meant to show.

Edit: Maybe uncharitably is too strong, but we're talking past each other.

reply
pron made this statement:

> It's 2026 and the idea that even with detailed-enough requirements you can one-shot even a workable (let alone perfect) solution also needs to die.

and brought up the failed Anthropic experiment as proof of that. Yes, you are talking past each other, but that is not pron's fault. It is your fault.

reply
Eh fair enough!
reply
Saying the model failed to write a competitive C compiler makes more sense.

I don't think they tried to do that though.

> today's models are not yet able to produce production software without close supervision, even when uncharacteristically good specs and hand-written tests exist.

That's a good point anyway

reply
> Saying the model failed to write a competitive C compiler makes more sense.

Their compiler fails to compile (well, at least link) some C programs altogether, and in other cases it produces code that is 150,000x slower than a real C compiler with optimisations turned off (interestingly, the model was trained on the real compiler's source code). That's not "not competitive" but "cannot be used in the real world". But even more importantly, the compiler cannot be fixed or evolved. It's bricked (at least as far as today's models' capabilities go). For any kind of software, not being able to improve or fix anything or add any new feature means it's effectively dead.

You could not use it in production even if no other C compiler existed.

reply
While I understand both points of view, I'm leaning towards yours, because:

- John Carmack embedded a C compiler and interpreter/runtime into Quake back in the mid-1990s as a scripting language! It was so efficient that it could be used in a real-time 3D shooter. That was a solo effort, and only a minor component of a much larger piece of software.

- I've seen university CS courses hand out "implement a C compiler" as a homework / project exercise for students. It's not particularly difficult.

Sure, a modern C compiler like GCC has to handle inline assembly, various extensions, pragmas, intrinsics, etc... but like you said, all of those are thoroughly documented and have open source implementations to reference.

Similarly, the Rust compiler is implemented in Rust and could be used as an idiomatic reference for a generic compiler framework with input handling, parsing, intermediate representations, and so forth.

reply
> Their compiler fails to compile (well, at least link) some C programs altogether, and in other cases it produces code that is 150,000x slower than a real C compiler with optimisations turned off

I would bet that those things are also true of at least one expensive commercial C compiler.

reply
I'd love to hear of any currently available commercial C compiler that has that level of issues. I would bet you'll be hard-pressed to find one. C compilation is a quite thoroughly solved problem. In any case, please provide an example.
reply