The framework's whole deal is that it lets you use the same API to target the on-device built-in models, the Apple-hosted online models (Private Cloud Compute), or your own shims that call out to arbitrarily hosted online models.
You can then dynamically route your calls to a different kind of model/provider using system APIs, without having to write your own abstraction layer over "I want to use the local model for this, but Claude for that", or build your own integration with the Anthropic/OpenAI APIs.
It abstracts things like tool calling in one place, and has a bunch of other niceties/oddities (e.g. it keeps the same "transcript" going even if you dynamically switch providers/models during a session).
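
To make the routing + shared transcript idea concrete, here's a rough sketch in Python. To be clear, this is not the framework's real API: `Session`, `respond`, `on_device`, and `hosted` are all made-up names for illustration, and the "providers" are stubs that just report how much transcript they were handed.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical: a "provider" is any function from the transcript so far to a reply.
Entry = Tuple[str, str]  # (role, text)
Provider = Callable[[List[Entry]], str]


@dataclass
class Session:
    """Hypothetical session that owns one transcript shared by all providers."""
    providers: Dict[str, Provider]
    transcript: List[Entry] = field(default_factory=list)

    def respond(self, prompt: str, provider: str) -> str:
        # Record the prompt, route to the chosen provider, record the reply.
        self.transcript.append(("user", prompt))
        reply = self.providers[provider](self.transcript)
        self.transcript.append((provider, reply))
        return reply


# Stub providers standing in for a local model and a remote one.
def on_device(transcript: List[Entry]) -> str:
    return f"[on-device] saw {len(transcript)} entries"


def hosted(transcript: List[Entry]) -> str:
    return f"[hosted] saw {len(transcript)} entries"


session = Session(providers={"on_device": on_device, "hosted": hosted})
first = session.respond("Summarize this note", provider="on_device")
second = session.respond("Now draft a long reply", provider="hosted")
```

The point of the sketch is the second call: the hosted stub gets the full history, including the turn the on-device stub answered, so switching providers mid-session doesn't reset the conversation.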
Lol bro, this is literally it. This is the model they've been training (was "Apple Foundation model" not a big enough hint?)