Worth adding that I had reasoning on for the Tiananmen question, so I could see the prep for the answer, and it had a pretty strong current of "This is a sensitive question to PRC authorities and I must not answer, or even hint at an answer". I'm not sure if a research tool would be sufficient to overcome that censorship, though I guess I'll find out!
Getting the local weather using a free API like met.no is a good first tool to use.
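A minimal sketch of what that first tool might look like, using met.no's Locationforecast "compact" endpoint (which requires an identifying User-Agent header per their terms of use); the function names and the summary format are just illustrative:

```python
import json
import urllib.request

MET_NO_URL = "https://api.met.no/weatherapi/locationforecast/2.0/compact"

def fetch_forecast(lat: float, lon: float) -> dict:
    """Fetch the compact forecast JSON; met.no rejects requests without a real User-Agent."""
    req = urllib.request.Request(
        f"{MET_NO_URL}?lat={lat:.4f}&lon={lon:.4f}",
        headers={"User-Agent": "my-local-agent/0.1 contact@example.com"},  # placeholder
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def current_conditions(payload: dict) -> str:
    """Distill the first timeseries entry into one line the local model can use."""
    entry = payload["properties"]["timeseries"][0]
    details = entry["data"]["instant"]["details"]
    return (f"{entry['time']}: {details['air_temperature']}°C, "
            f"wind {details['wind_speed']} m/s")
```

The distillation step is the point: the model never sees the raw JSON, just one short line.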
It needs to be just smart enough to use the tools and distill the responses into something usable. And one of the tools can be "ask claude/codex/gemini" so the local model itself doesn't actually need to do much.
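The delegation idea above can be sketched as a plain tool registry where one entry forwards hard questions to a bigger hosted model; everything here (the registry, the `ask_big_model` name, the stubbed backend) is hypothetical, not any particular framework's API:

```python
from typing import Callable

# Registry the local model's tool calls are dispatched against.
TOOLS: dict[str, Callable[[str], str]] = {}

def tool(name: str):
    """Decorator that registers a function as a callable tool."""
    def register(fn: Callable[[str], str]) -> Callable[[str], str]:
        TOOLS[name] = fn
        return fn
    return register

@tool("ask_big_model")
def ask_big_model(question: str) -> str:
    # In a real setup this would hit a hosted API (Claude/Codex/Gemini);
    # stubbed here so the sketch stays self-contained.
    return f"[big-model answer to: {question}]"

def dispatch(tool_name: str, argument: str) -> str:
    """Route a tool call emitted by the small local model."""
    if tool_name not in TOOLS:
        return f"unknown tool: {tool_name}"
    return TOOLS[tool_name](argument)
```

The local model only has to decide *when* to call `ask_big_model`; the heavy lifting happens elsewhere.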
That doesn't fix the "you don't know what you don't know" problem which is huge with smaller models. A bigger model with more world knowledge really is a lot smarter in practice, though at a huge cost in efficiency.
Is there already some research or experimentation done into this area?
Picking a model that's juuust smart enough to know it doesn't know is the key.
No. It runs on macOS but uses Metal instead of MLX.
MLX is faster because it's better integrated with Apple hardware. On the other hand, GGUF is a far more popular format, so there are more programs that support it and a wider variety of models available.
So it's kinda like having a very specific diet that you swear is better for you, but you can only order food from a few restaurants.