Keep your eye on Baidu's Ernie https://ernie.baidu.com/
Artificial Analysis is generally on top of everything:
https://artificialanalysis.ai/leaderboards/models
Those two are really the new players
Nanbeige, which they haven't benchmarked yet, just put out a shockingly good 3B model: https://huggingface.co/Nanbeige - specifically https://huggingface.co/Nanbeige/Nanbeige4.1-3B
You have to tweak the hyperparameters like they recommend, but I'm getting quality output, commensurate with maybe a 32B model, in exchange for a huge thinking lag.
It's the new LFM 2.5
ollama create nanbeige-custom -f <(curl https://day50.dev/Nanbeige4.1-params.Modelfile)
That has the hyperparameters already baked in. Then you can try it out. It's taking up about 2.5GB of RAM.
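For reference, a Modelfile of that shape is just a base model plus PARAMETER lines. This is a hypothetical sketch, not the contents of the day50.dev file - the FROM target and every value below are placeholders (grab the hosted Modelfile above for the real recommended settings):

```
# Hypothetical ollama Modelfile sketch - values are placeholders,
# NOT Nanbeige's official recommendations.
# FROM can point at a Hugging Face GGUF repo if one exists for this model.
FROM hf.co/Nanbeige/Nanbeige4.1-3B-GGUF   # placeholder; assumes a GGUF build
PARAMETER temperature 0.6                  # example value only
PARAMETER top_p 0.95                       # example value only
PARAMETER num_ctx 8192                     # example value only
```

Once created, it runs like any other local model, e.g. `ollama run nanbeige-custom "compare rust and go with code samples"`.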
My test query is always "compare rust and go with code samples". I'm telling you, the thinking token count is ... high...
Here's what I got: https://day50.dev/rust_v_go.md
I just tried it on a 4GB Raspberry Pi and a 2012-era X230 with an i5-3210. Worked.
It'll take about 45 minutes on the Pi, which, you know, isn't OOM... so there's that....