It's a good demonstration of capabilities, sure, but the result itself makes no sense. We'll have to figure out where these capabilities can bring a real advantage.
I don't think that is the case here. Bun is pretty much using AI to write all of its code, with a human reviewing it. Zig exists as a language to provide a nice DX over C and Rust, not to be memory safe. If you are using an LLM to generate code, the DX benefits are removed, so why would you ever choose Zig over Rust?
The industry as a whole is still realizing that any LLM usage that actually writes all the code for you causes cognitive debt, and we’re even slowly losing our skill in the art.
I’m trying my best to navigate this myself, but no matter what we do, using LLMs is both a blessing and a curse.
* I can give you one quarter of amazing profits, if you let me dismantle and sell all the assets of a company.
* I can give you a few years of incredible food production, if you let me strip a rainforest and plant commercial crops.
* I can give you incredibly cheap energy, if you let me mine non-renewable fossil fuels from the earth.
The context of why something is possible matters. In this case, it is possible because a very large and comprehensive test suite was seen as a necessity to specify a successful project (managed by humans). I do not believe an LLM-coded project could ever have made such a test suite. Here, the LLM is consuming the result of expensive human labor (the test suite) to make what ultimately is a minor variation to it (the implementation language).
A pocket calculator can also multiply numbers much faster than an engineer; that doesn't make it an engineer.
People want to use stuff like this as evidence, somehow, for AI being able to write entire software systems in a few days. We saw the same shit with the "compiler" they made with a bunch of agents. Literally the only reason it's possible is the hundreds of thousands of man-hours and God knows how much money that were poured into the reference projects before the AI got anywhere near them.
To replicate this kind of thing with a greenfield project would take an absolute ton of spec work and requirements derivation, which would substantially eat into any savings from having AI generate it.
The accomplishment itself is interesting, and unlocks opportunities to do work no one would have bothered with before, but it doesn't represent what a lot of people desperately want it to.
I am not sure why people sound so astounded, to be honest. This has been my frank experience of the agentic tools, both Codex and Claude, since about December.
When given the right constraints, this kind of thing is entirely conceivable.
However, the important question not being answered here is: does anybody working on it have a full understanding of what has been built?
My experience, having constructed similar types of projects using these tools, is that yes, you could do this in a week or two, but now you'll have a month or two of digging through what it made, understanding what was built, and undoing critical YOLO leaps of faith it made that you didn't want.