So not only do you have a closed loop system that has objective/automatic pass-fail criteria you also don't even have to supply the instructions about what the function is supposed to do or the test cases!
Obviously this isn't going to be 100% reliable (especially for edge cases) but you should be able to get an enormous speed up. And in many cases you should be able to supply the edge case tests and have the LLM fix it.
(Codex is still free for the next few days if you want to try their "High"/"Extra high" thinking models)