What it will do is notice inconsistencies like a savant who can actually keep 12 layers of abstraction in mind at once: tiny logic gaps with outsized impact, a typing mistake that will lead to data corruption downstream, a one-variable change that completely changes your error handling semantics in a particular case, etc. It has been incredibly useful in my experience; it just serves a different purpose than peer review.
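To make the "one-variable change" point concrete, here's a hedged sketch (Python, names invented, not from any real review): a single identifier in an except clause decides whether real bugs surface or get silently defaulted away, which is exactly the kind of diff that's easy for a human to wave through.

```python
import logging

def read_record(store: dict, key: str):
    """Fetch a record, returning None on transient backend timeouts."""
    try:
        return store[key]
    # One-identifier diff with big semantic impact: writing
    # `except Exception` here instead of `except TimeoutError`
    # would also swallow KeyError, TypeError, and any corruption
    # signal from the store -- callers would silently get None
    # instead of an error they could act on.
    except TimeoutError:
        logging.warning("transient failure reading %r, using default", key)
        return None

# With the narrow catch, a genuinely missing key still surfaces:
# read_record({}, "user:42") raises KeyError, as it should.
```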
i find it a good backstop to catch dumb mistakes or suggest alternatives, but it's not a replacement for human review (we require human review, while llm suggestions are always optional and you're free to ignore them)