LLMs can write a lot of code. They can even write a comprehensive test suite for that code. However, they can't tell you when it doesn't work because of an interaction with something else you didn't think about. They can't tell you that all race conditions are really fixed (despite being fairly good at tracking down the ones you already know about). They can't tell you that the program doesn't work because it fails to do something critical that nobody thought to write into the requirements until you noticed it was missing.