The big difference here is that the C-to-Go tool was presumably deterministic: running it over and over again should produce the exact same result. You can trust that result because the human wrote the conversion tool, understood it, tested it, and worked the bugs out.

The LLM is non-deterministic. You could have it independently do the conversion 10 times and get 10 different results, some of them wildly different. There's no way to validate that without reviewing the full output each time.

That's not to say the human-written deterministic conversion tool is going to be perfect or infallible. But you can certainly build much more confidence with it than you can with the LLM.
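To make the repeatability point concrete, here is a minimal sketch of how you might check whether a converter gives byte-identical output across runs. The `rule_based` and `sampled` functions are hypothetical stand-ins invented for illustration, not the actual C-to-Go tool or any real LLM API.

```python
import hashlib
import random


def is_repeatable(convert, source, runs=10):
    """Return True if `convert` yields byte-identical output on every run."""
    digests = {
        hashlib.sha256(convert(source).encode()).hexdigest() for _ in range(runs)
    }
    return len(digests) == 1


def rule_based(source):
    # Hypothetical stand-in for a hand-written converter: a pure function of its input.
    return source.replace("printf(", "fmt.Printf(")


_rng = random.Random()  # unseeded on purpose: hidden state that changes between calls


def sampled(source):
    # Hypothetical stand-in for a sampling-based converter: the output depends on RNG state.
    return source.replace("printf(", _rng.choice(["fmt.Printf(", "log.Printf("]))
```

`is_repeatable(rule_based, src)` holds for any `src`, while `sampled` will usually fail the same check. Note this only tells you the output is stable, not that it is correct.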

by staticassertion 45 minutes ago

Why does the deterministic nature matter? The interesting part is having oracle tests, not determinism. If something is deterministic and wrong, you use oracle tests to catch that.
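As a rough sketch of what an oracle test could look like here: run the original implementation and the translated one on the same random inputs and report the first disagreement. Both checksum functions are hypothetical stand-ins made up for this example; in practice the oracle would be the original C code and the candidate the translated Go code.

```python
import random


def reference_checksum(data):
    # Hypothetical stand-in for the original, trusted implementation (the oracle).
    total = 0
    for byte in data:
        total = (total + byte) % 65521
    return total


def translated_checksum(data):
    # Hypothetical stand-in for the translated implementation under test.
    return sum(data) % 65521


def find_mismatch(oracle, candidate, trials=1000, seed=0):
    """Return the first random input where candidate disagrees with oracle, or None."""
    rng = random.Random(seed)  # fixed seed so a failure can be reproduced
    for _ in range(trials):
        data = bytes(rng.randrange(256) for _ in range(rng.randrange(64)))
        if oracle(data) != candidate(data):
            return data
    return None
```

`find_mismatch(reference_checksum, translated_checksum)` returns `None` when the two agree on every sampled input. The check is the same whether the candidate came from a hand-written tool or an LLM, which is the point being made above.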

by kibwen 22 minutes ago

People keep saying "non-deterministic" when they mean "probabilistic". For illustration, a Bloom filter is deterministic, but it's also probabilistic. LLMs are the same.
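For anyone unfamiliar with the Bloom filter comparison, a minimal sketch follows. The class and its parameters are illustrative assumptions, not any particular library's API. Nothing in it is random: the same inserts always produce the same bits and the same answers. The answers are still only probably right, because a lookup can report an item that was never added.

```python
import hashlib


class BloomFilter:
    def __init__(self, num_bits=64, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bit array stored as a single int

    def _positions(self, item):
        # Positions come from salted SHA-256 digests, so they are a pure function of the item.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        # True means "possibly present"; False means "definitely absent".
        return all((self.bits >> pos) & 1 for pos in self._positions(item))
```

Two filters built from the same inserts end up with identical `bits`, and an added item is never reported absent. Shrink `num_bits` far enough and `might_contain` starts returning `True` for items that were never added, the same wrong answer on every run.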

by 0xpgm 2 hours ago

A viable approach might be to vibe code the translation tool itself and check that every input produces the expected output. Then once the translation is done, the translation tool can be discarded.

This would require a robust test suite though.

This is one of the cases where vibe coding might actually be useful: writing a throwaway tool.

by _vertigo 46 minutes ago

I see this dilemma with LLMs all of the time.

Should you use the LLM to do the thing directly, or use the LLM to implement a tool that does the thing?

I tend to reach for the latter because it’s easier to reason about.