MLX is quite literally macOS-specific technology; for other platforms you want a non-MLX variant.

I was sure "MLX" stood for "Metal-something-something" but somehow can't find any reference to that. Anyway, "Metal" is Apple's API for hardware-accelerated graphics and compute, FWIW.

Edit: about the actual release on Ollama: if you're on non-Apple hardware you probably want the NVFP4 variant ("gemma4:12b-nvfp4"), which was uploaded 45 minutes ago, especially if you have a recent Nvidia GPU.

by Patrick_Devine 22 hours ago

I realize this is a little confusing; we're working w/ the MLX team to bring MLX to other platforms, but we're not quite there yet. The `gemma4:12b-nvfp4` model is specifically for the MLX engine.

For the GGUF 4-bit variant (i.e. non-Macs) you'll need `gemma4:12b-it-q4_K_M`, which I just pushed. You'll also need to upgrade to version 0.30.4 which we're just about to release (it's in prerelease and we're running through our last regression tests).
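
For anyone scripting this, a minimal sketch of the tag choice described in this comment, assuming the split stays as stated (the NVFP4 tag for the MLX engine, the GGUF `q4_K_M` tag for non-Macs). `pick_gemma_tag` and `pull_command` are hypothetical helper names, and treating "MLX engine" as "macOS on arm64" is an assumption, not something confirmed in this thread.

```python
import platform

# Tags copied from the comment above; which engine each targets is as stated there.
MLX_TAG = "gemma4:12b-nvfp4"
GGUF_TAG = "gemma4:12b-it-q4_K_M"


def pick_gemma_tag(system=None, machine=None):
    """Return the model tag for a machine (hypothetical helper)."""
    system = system or platform.system()
    machine = machine or platform.machine()
    # Assumption: the MLX engine is only available on macOS with Apple Silicon.
    if system == "Darwin" and machine == "arm64":
        return MLX_TAG
    return GGUF_TAG


def pull_command(tag):
    """Build the argv for `ollama pull <tag>` without running it."""
    return ["ollama", "pull", tag]


if __name__ == "__main__":
    print(" ".join(pull_command(pick_gemma_tag())))
```

The script only prints the command; pass the list to `subprocess.run` to actually pull. Per the comment, the GGUF tag also needs Ollama 0.30.4.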

by embedding-shape 20 hours ago

I gotta say, having both "gemma4:12b-mlx-bf16" and "gemma4:12b-nvfp4" be MLX-specific, and not labeling all of the MLX-specific ones as such, is less "a little confusing" and more "set up to be confusing" :)

> You'll also need to upgrade to version 0.30.4 which we're just about to release

Interesting, wasn't Google coordinating today's release with you? The blog post seems to have gone out well before the release has even been cut.

by Patrick_Devine 20 hours ago

Given the model was just republished by Google 15 minutes ago and we're going to have to redo everything (and everyone will have to redownload for all platforms -- not just Ollama), I'll just say that sometimes things don't work out exactly the way you want them to. :-D

That said, I think the gemma4:12b-nvfp4 model is pretty solid. It's been tuned with Nvidia's model optimizer. I've been waiting on the results for MMLU-Pro, but I'll have to retrigger that after reconverting.

by embedding-shape 19 hours ago

> Given the model was just republished by Google 15 minutes ago

Hah, missed that! Guess that's slightly neat though, you get a second chance ;) NVFP4 has been a blast to use across a wide range of models and seems to work really well, at least with vLLM and an Nvidia card.

by spicySpy 7 hours ago

Would you mind sharing the link to `gemma4:12b-it-q4_K_M`?

by sambaumann 1 day ago

I still get "this model requires macOS" when trying to pull that one.

by embedding-shape 1 day ago

I don't use Ollama myself anymore, but it seems others have been having similar issues for quite some time. Maybe one of these fits your environment exactly? https://github.com/ollama/ollama/issues?q=is%3Aissue%20state...

by jw1224 1 day ago

MLX is Apple’s own machine learning framework, designed for Apple Silicon: https://opensource.apple.com/projects/mlx/

by Zambyte 22 hours ago

The non-MLX versions just dropped on Ollama. gemma4:12b-it-q8_0, gemma4:12b-it-bf16, etc.

by accountrequired 1 day ago

https://huggingface.co/ggml-org/gemma-4-12B-it-GGUF/tree/mai...

by jasonjmcghee 1 day ago

There's a CUDA backend for MLX now. Not sure about the maturity.