deleted
reply
The proposed solution seems to rely on a trusted compiler that generates the exact same output, bit for bit, as the compiler-under-test would generate if it were not compromised. That seems useful only in very narrow cases.
reply
You have a trusted compiler that you write in assembly or even machine code. You then use it to compile compiler source code you trust. The resulting binary is then compared bit for bit against a different binary of the same compiler, which catches the hidden vulnerability.
reply
It's assumed that in this scenario you don't have access to a trusted compiler; if you do, then there's no problem.

And the thesis linked above seems to go beyond simply "use a trusted compiler to compile the next compiler". It involves deterministic compilation and comparing outputs, for example.
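The "deterministic compilation and comparing outputs" step can be sketched as a toy model. This is only an illustration of the idea, not the thesis's actual procedure: compiler "binaries" are stand-in dicts, `compile_with` is a made-up deterministic stand-in, and the trojan is modeled as a flag that self-propagates when the compiler compiles its own source.

```python
# Toy model of the double-compile-and-compare idea. All names here are
# illustrative placeholders; a real check compares actual compiler
# binaries bit for bit under deterministic builds.

COMPILER_SRC = "clean compiler source"

def compile_with(binary: dict, source: str) -> dict:
    """'Run' a compiler binary on source, producing a new binary (toy).
    A trojaned compiler re-inserts its backdoor whenever it compiles
    the compiler's own source, Thompson-style."""
    return {"source": source,
            "trojan": binary["trojan"] and source == COMPILER_SRC}

trusted        = {"source": "hand-written trusted compiler", "trojan": False}
suspect_clean  = {"source": COMPILER_SRC, "trojan": False}
suspect_trojan = {"source": COMPILER_SRC, "trojan": True}

def ddc_check(suspect: dict) -> bool:
    # Stage 1: build the compiler source with the trusted compiler.
    stage1 = compile_with(trusted, COMPILER_SRC)
    # Stage 2: build the same source with stage 1 and with the suspect;
    # with deterministic builds the two outputs must match bit for bit.
    return compile_with(stage1, COMPILER_SRC) == compile_with(suspect, COMPILER_SRC)

print(ddc_check(suspect_clean))   # True: suspect matches the trusted rebuild
print(ddc_check(suspect_trojan))  # False: self-propagating backdoor detected
```

Note the trusted compiler never needs to produce the same binary as the suspect; it only needs to build a second, independently derived copy of the compiler, after which the two copies' outputs are compared.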

reply
I believe the issue arises if an exploit is somehow injected into AI training data such that the AI unwittingly reproduces it and the human who requested the code doesn't even know.
reply
That’s a separate issue and specifically not what OP was describing. It's also highly unlikely in practice unless you use a random LLM - the major LLM providers already face such attacks and have decent mitigations afaik.
reply
If you think the major LLM providers have a way to filter malicious (or even bad or wrong) code from training data, I have a bridge to sell you.
reply
No, not to filter it out of training data, but to make it difficult for poisoned data to have an outsized impact relative to how often it appears. So yes, you can poison a niche topic only you have ever mentioned. It’s much harder to poison the LLM into injecting subtle vulnerabilities you control into code users ask it to generate. Unless you somehow flood the training data with such vulnerabilities, but that’s also more detectable.

TLDR: what I said is only foolish if you take the absolute dumbest possible interpretation of it.

reply