upvote
What we really we need is some kind of more detailed spec language that doesn't have edge cases, where we describe exactly what we expect the generated code to do, and then formally verify that the now generated code matches the input spec requirement. It'd be super helpful to have something more formal with no ambiguity, especially because the english language tends to be pretty ambiguous in general which can result in spec problems

I also tend to find especially that there's a lot of cruft in human written spec languages - which makes them overly verbose once you really get into the details of how all of this works, so you could chop a lot of that out with a good spec language

I nominate that we call this completely novel, evolving discipline: 'programming'

reply
> What we really we need is some kind of more detailed spec language that doesn't have edge cases, where we describe exactly what we expect the generated code to do, and then formally verify that the now generated code matches the input spec requirement.

That's theorem provers and they're awful for anything of any reasonable complexity.

reply
Hate it all you want, but XML is genuinely a good fit there, and Claude is apparently insanely good at working with XML prompts.
reply
There are languages like Dafny that permit you to declare pre- and post-conditions for functions. Dafny in particular tries to automatically verify or disprove these claims with an SMT solver. It would be neat if LLMs could read a human-written contract and iterate on the implementation until it's provably correct. I imagine you'd have much higher confidence in the results using this technique, but I doubt that available models are trained appropriately for this use case.
reply
yes, am familiar with the "code is spec" trope.

Shame us all for moving away from something so perfect, precise, and that "doesn't have edge cases."

Hey - if you invent a programming language that can be used in such a way and create guaranteed deterministic behavior based on expressed desires as simple as natural language - ill pay a $200/m subscription for it.

reply
As people are discovering, natural language is insufficiently precise to be able to specify edge cases. Any language precise enough to be formally verified against is a programming language
reply
we're going to end up speaking past each other - but generally I do agree with you and am not denouncing the importance of formal verification methods. I do think abstractions are going to dominate the human ux above them
reply
XML will do it very well.
reply
> where we describe exactly what we expect the generated code to do, and then formally verify that the now generated code matches the input spec requirement.

In ancient times we had tech to do exactly that: Programming languages and tests.

reply
I called it gates on mine. I loved Beads but it closed tasks without any validation steps. Beads also had other weird issues, so I made my own alternative. I think "Gates" is also used by others projects that took on the same challenge I did in mine weirdly enough.

https://github.com/Giancarlos/guardrails

reply
We've been through that so many times. When UML arrived (and ALM tools suites, IBM was trying to sell it, Borland was trying to sell it, all those fancy and expensive StarTeam, Caliber and Together soft), then BPML and its friends arrived, Business Rule Management System (BRMS), Drools in Java world, etc.

It all failed. For a simple reason, popularized by Joel Spolsky: if you want to create specification that describes precisely what software is doing and how it is doing its job, then, well, you need to write that damn program using MS Word or Markdown, which is neither practical nor easy.

The new buzzword is "spec driven development", maybe it will work this time, but I would not bet on that right now.

BTW: when we will be at this point, it does not make sense anymore to generate code in programming languages we have today, LLM can simply generate binaries or at least some AST that will be directly translated to binary. In this way LISP would, eventually, take over the world!.

reply
I’ve been considering this as well, and trying to get my colleagues to understand and start doing it. I use it to pretty decent effect in my vibe coded slop side projects.

In the new world of mostly-AI code that is mostly not going to be properly reviewed or understood by humans, having a more and more robust manifestation and enforcement, and regeneration of the specs via the coding harness configuration combined with good old fashioned deterministic checks is one potential answer.

Taken to an extreme, the code doesn’t matter, it’s just another artifact generated by the specs, made manifest through the coding harness configuration and CI. If cost didn’t matter, you could re-generate code from scratch every time the specs/config change, and treat the specs/config as the new thing that you need to understand and maintain.

“Clean room code generation-compiler-thing.”

reply
[flagged]
reply
[dead]
reply