I agree with all the points they list, but I fear that if I looked closely at the code and how they did it, I wouldn't stop cringing until I looked away. Frameworks like this tend to point out ten concerns you should be worried about but aren't, and they make users learn a lot of new material to bend their work around the framework; users rarely come away with a clear understanding of what those concerns actually are, or where, exactly, the framework's value comes from.
That is, if you are trying to sell something, you can do a lot better with something crazy and one-third-baked like OpenClaw, which will make your local Apple Store sell out of minis, than with anything that rationally explains "you are going to have to invent all the stuff in this framework that looks like incomprehensible bloat to you right now." I mean, it is rational, and it is true, but I can say empirically, as a person-who-sells-things, that it doesn't sell. In fact, if you asked me to make a magic charm that looks like it would sell things but guarantees you sell nothing, it would be exactly that.
Implementations are almost always going to be messy; still, I feel like not all the messiness is incidental. A lot of it is accidental :)
They are themselves turning into wrapper code for other libraries (e.g. the LLM abstraction, which litellm already handles for them).
Can also add:
Option 3: Use instructor + litellm (probably pydantic AI too, but I have not tried that yet)
Edit: As others pointed out, their optimization algorithms are very good (GEPA is great and lets you easily visualize / track the changes it makes to the prompt)