That said, I think the gemma4:12b-nvfp4 model is pretty solid. It's been tuned with Nvidia's model optimizer. I've been waiting on the results for MMLU-Pro, but I'll have to retrigger that after reconverting.
Hah, missed that! Guess that's slightly neat though, you get a second chance ;) NVFP4 has been a blast to use across a wide range of models. It seems to work really well, at least with vLLM and an NVIDIA card.