deleted
reply
The proposed solution seems to rely on a trusted compiler that generates the exact same output, bit for bit, as the compiler-under-test would generate if it were not compromised. That seems useful only in very narrow cases.
reply
You have a trusted compiler that you write in assembly or even machine code. You then use it to compile compiler source code you trust. The resulting binary is then compared bit for bit against a different binary of the same compiler, which catches the hidden vulnerability.
reply
It's assumed that in this scenario you don't have access to a trusted compiler; if you do, then there's no problem.

And the thesis linked above seems to go beyond simply "use a trusted compiler to compile the next compiler". It involves deterministic compilation and comparing outputs, for example.
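The "deterministic compilation and comparing outputs" step can be sketched as a toy model. This is only an illustration of the idea, not the thesis's actual procedure: compiler "binaries" are stand-in dicts, `compile_with` is a made-up deterministic stand-in, and the trojan is modeled as a flag that self-propagates when the compiler compiles its own source.

```python
# Toy model of the double-compile-and-compare idea. All names here are
# illustrative placeholders; a real check compares actual compiler
# binaries bit for bit under deterministic builds.

COMPILER_SRC = "clean compiler source"

def compile_with(binary: dict, source: str) -> dict:
    """'Run' a compiler binary on source, producing a new binary (toy).
    A trojaned compiler re-inserts its backdoor whenever it compiles
    the compiler's own source, Thompson-style."""
    return {"source": source,
            "trojan": binary["trojan"] and source == COMPILER_SRC}

trusted        = {"source": "hand-written trusted compiler", "trojan": False}
suspect_clean  = {"source": COMPILER_SRC, "trojan": False}
suspect_trojan = {"source": COMPILER_SRC, "trojan": True}

def ddc_check(suspect: dict) -> bool:
    # Stage 1: build the compiler source with the trusted compiler.
    stage1 = compile_with(trusted, COMPILER_SRC)
    # Stage 2: build the same source with stage 1 and with the suspect;
    # with deterministic builds the two outputs must match bit for bit.
    return compile_with(stage1, COMPILER_SRC) == compile_with(suspect, COMPILER_SRC)

print(ddc_check(suspect_clean))   # True: suspect matches the trusted rebuild
print(ddc_check(suspect_trojan))  # False: self-propagating backdoor detected
```

Note the trusted compiler never needs to produce the same binary as the suspect; it only needs to build a second, independently derived copy of the compiler, after which the two copies' outputs are compared.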

reply
I believe the issue arises if an exploit is somehow injected into AI training data such that the AI unwittingly reproduces it and the human who requested the code doesn't even know.
reply
That’s a separate issue and specifically not what OP was describing. It's also highly unlikely in practice unless you use a random LLM - the major LLM providers already face such attacks and have decent mitigations afaik.
reply
If you think the major LLM providers have a way to filter malicious (or even bad or wrong) code from training data, I have a bridge to sell you.
reply
No, not to filter it out of training data, but to make it difficult for poisoned data to have an outsized impact relative to how often it appears. So yes, you can poison a niche topic only you have ever mentioned. It’s much harder to poison the LLM into injecting subtle vulnerabilities you control into code users ask it to generate. Unless you somehow flood the training data with such vulnerabilities, but that’s also more detectable.

TLDR: what I said is only foolish if you take the absolute dumbest possible interpretation of it.

reply