> Do you think the models you’re using could be quantized more so that they could be downloaded on first run using Background Assets?

I first tried Qwen 3.5 0.8B at Q4_K_S, and the model couldn't hold a basic conversation. I haven't tried lower quants on the 2B, though.

I'm also interested in the Apple Foundation models, and they're something I plan to try next. AFAIK they're on par with Qwen-3-4B [0]. The biggest upside, as you alluded to, is that users don't need to download anything, which is huge for onboarding.

[0] https://machinelearning.apple.com/research/apple-foundation-...
