> It’s still your responsibility to understand your system and define what “correctness” means, and you need a high-level understanding of temporal logic.
Using a TLC model checker to verify invariants or properties, finding a valid counterexample can scale non-elementarily.
In fact, in some situations it can end up being Ackermann-complete[0], but even recursively enumerable problems can be Tower-complete.
I would say that unless you focus on more detailed temporal logic pitfalls you may have issues.
So IMHO this will stay a use case specific solution, that you choose based on context.
Even a common solution like adding Circumscription causes counterintuitive changes [1][2].
IMHO, if you want to use TLA+ as a primary method, you will need some depth or be ready to abandon it by time boxing etc…
Remember that we know the open domain frame problem in [2] is equal to HALT, it will not universally apply.
It is just another tool that works well when it works well.
[0] https://arxiv.org/abs/2104.13866
Ran it on my tricky caching changes and it found bugs the standard agentic review flows missed on all frontier models. Good stuff.