Certainly you could write a specification for a piece of software, and the software could meet the specification while also leaking credentials. Obviously, that would be a problem. But at some point, this starts to feel artificial and silly. The same software could reformat your hard disk, right?
At some point, we aren’t discussing whether or not AI is doing a bad job writing software. We’re discussing whether or not it’s actively malicious.
Memory leaks, deleting the hard drive, spending money would all be observable behavior.
By your reasoning that the "observable behavior needs to be specified rigorously", it seems like you'd have to list these all out. We do, after all, already have cases of AI deleting data.
That sounds harder and more error-prone than what we're doing now by rigorously defining these defects out of existence in code.