undefined

points

[-]

The point would presumably be to formalise it, then verify that the formal version matches what you actually meant. At which point you can't/shouldn't regenerate it, but you can request changes (which you'd need to verify and approve).

by pron5 hours ago|

parent|

[-]

But the code produced from the formal spec would still be nondeterministic. And I believe CodeSpeak doesn't wish to regenerate the entire program with each spec change, but apply code changes based on the changes to the spec. Maybe there could be other benefits to formalisation in this case, but determinism isn't one of them.

by nyrikki2 hours ago|

parent|

[-]

Even with classic compilation, it is only the semantic behavior that is preserved.

What the Church–Rosser property/confluence is in term rewriting in lambda calculus is a possible lens.

To have a formally verified spec, one has to use some decidable fragment of FO.

If you try to replace code generation with rewriting things can get complicated fast.[2]

Rust uses affine types as an example and people try to add petri-nets[0] but in general petri-net reachability is Ackerman-complete [1]

It is just the trade off of using a context free like system like an LLM with natural language.

HoTT and how dependent types tend to break isomorphic ≃ equal Is another possible lens.

[0] https://arxiv.org/abs/2212.02754v3

[1] https://arxiv.org/abs/2212.02754v3

[2] https://arxiv.org/abs/2407.20822

by vidarh4 hours ago|

parent|

prev|

[-]

It doesn't matter if the code is different if the spec is formal enough to validate the software against it.

I have no idea about codespeak - I was responding to the comments above, not about codespeak.

by pron3 hours ago|

parent|

[-]

Validating programs against a formal spec is very, very hard for foundational computational complexity reasons. There's a reason why the largest programs whose code was fully verified against a formal spec, and at an enormous cost, were ~10KLOC. If you want to do it using proofs, then lines of proof outnumber lines of code 10-1000 to 1, and the work is far harder than for proofs in mathematics (that are typically much shorter). There are less absolute ways of checking spec conformance at some useful level of confidence, and they can be worthwhile, but they require expertise and care (I'm very much in favour of using them, but the thought that AI can "just" prove conformance to a formal spec ignores the computational complexity results in that field).

by vidarh3 hours ago|

parent|

[-]

For most cases we don't need nearly that comprehensive verification. This is expecting more off AI written code than we ever bother to subject most human written code to. There's a vast chasm there we only need to even slightly start to bridge to get to far higher confidence levels than the typical human dev team achieves.