Qwen specifically calls out FIM (“fill in the middle”) support on the model card, and you can see it getting confused and emitting the control tokens in the example here.
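If anyone wants to check their own output for this, here's a rough sketch of what a FIM prompt looks like and how you might spot leaked control tokens. The token names are an assumption on my part, based on the Qwen2.5-Coder model card; other Qwen releases may use different ones, and the helper names are just made up for illustration.

```python
# Assumed FIM control tokens (from the Qwen2.5-Coder model card; verify for your model).
FIM_TOKENS = ("<|fim_prefix|>", "<|fim_suffix|>", "<|fim_middle|>", "<|fim_pad|>")


def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Assemble a prefix-suffix-middle style FIM prompt (hypothetical helper).

    The model is expected to generate the missing middle after the last token.
    """
    return f"<|fim_prefix|>{prefix}<|fim_suffix|>{suffix}<|fim_middle|>"


def leaked_fim_tokens(text: str) -> list[str]:
    """Return the FIM control tokens that appear verbatim in model output."""
    return [tok for tok in FIM_TOKENS if tok in text]


if __name__ == "__main__":
    prompt = build_fim_prompt("def add(a, b):\n    ", "\n\nprint(add(1, 2))")
    print(prompt)

    # A chat-style reply should never contain these tokens; if it does,
    # the model has probably slipped into FIM mode.
    reply = "Sure, here is the fix:<|fim_middle|> return a + b"
    print(leaked_fim_tokens(reply))
```

Running something like `leaked_fim_tokens` over each assistant turn is a cheap way to flag the point where a session goes off the rails.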
Sometimes they don't manage any tool calls and fall over right off the bat; other times they manage a few tool calls and then start spewing nonsense. Some can manage sub-agents for a while and then fall apart. I just can't seem to get consistently decent output on more 'consumer/home PC' type hardware. I've mostly been using either pi or OpenCode for this testing.