Mistral has small variants (3B, 8B, 14B, etc.), as do others like IBM Granite and Qwen. Then there are finetunes based on these models, depending on your workflow/requirements.
True, but anything remotely useful is 300B and above
I'm actually doing a good part of my dev work now with Qwen3-Coder-Next on an M1 64GB with Qwen Code CLI (a fork of Gemini CLI). I very much like

  a) having an idea of how many tokens I use,
  b) being independent of VC-financed token machines, and
  c) being able to use it on a plane/train.
Also, I never have to wait in a queue, nor am I told to come back in a few hours. And many answers come back within a second.
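If you want a rough feel for token use without depending on any provider's counter, the common ~4-characters-per-token heuristic for English text gets you in the right ballpark. A minimal sketch (the helper name is mine, and the real count varies by model tokenizer):

```python
# Rough token-count estimate using the common ~4 chars/token
# heuristic for English-like text. Actual counts depend on the
# model's tokenizer, so treat this as a ballpark only.
def estimate_tokens(text: str) -> int:
    return max(1, round(len(text) / 4))

prompt = "Refactor this function to avoid the nested loop."
print(estimate_tokens(prompt))  # prints 12
```

For exact numbers you'd load the model's actual tokenizer, but for budgeting a local context window the heuristic is usually close enough.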

I don't do full vibe coding with a dozen agents, though. I read all the code it produces and guide it where necessary.

Last but not least, at some point the VC-funded party will be over, and when that happens it's better to already know how to be highly efficient with AI token use.
