Seconded. Gemini used to be trash, and I used Claude and Codex a lot, but gemini-3-flash-preview punches above its weight. It's decent, and I rarely, if ever, run into token limits either.
Thirded. I've been using gemini-3-flash to great effect. Anytime I have something more complicated, I give it to both pro and flash to see what happens. It's a coin flip whether flash comes out nearly equivalent (too many moving variables to be analytical at this point).
What models are you running locally? Just curious.

I am mostly restricted to 7-9B models. I still like the ancient early Llama releases because they're pretty unrestricted without having to use an abliterated variant.

I experimented with many models on my 16G and 32G Macs. For the smaller machine, qwen3:4b is good; for the 32G Mac, gpt-oss:20b is good. I like the smaller Mistral models like mistral:v0.3, and rnj-1:latest is a pretty good small reasoning model.
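
Rough back-of-envelope on why those pairings work (my own sketch, assuming roughly 4-bit quantization; actual quant levels vary by model tag):

    # Weights at ~4-bit quantization take about 0.5 bytes per parameter.
    # KV cache, context, and OS overhead come on top of this.
    for name, params in [("qwen3:4b", 4e9), ("gpt-oss:20b", 20e9)]:
        weight_gb = params * 0.5 / 2**30
        print(f"{name}: ~{weight_gb:.1f} GB of weights")
    # qwen3:4b:    ~1.9 GB of weights -> comfortable on the 16G machine
    # gpt-oss:20b: ~9.3 GB of weights -> wants the 32G machine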
I like to ask Claude how to prompt smaller models for a given task. With one prompt, it was able to make a heavily quantized model call multiple functions via JSON.
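
For the curious, here's the general shape of that pattern, a minimal sketch assuming a local Ollama server; the tool names (get_weather, get_time) and the prompt wording are made up for illustration, not the exact prompt Claude produced:

    import json
    import requests

    # Small quantized models drift into prose unless the exact schema is
    # spelled out and everything else is forbidden.
    SYSTEM = (
        'You are a function-calling assistant. Respond ONLY with JSON of '
        'the form {"calls": [{"name": "<tool>", "arguments": {...}}]}. '
        "Available tools: get_weather(city), get_time(timezone)."
    )

    def call_tools(user_msg, model="qwen3:4b"):
        # Ollama's /api/chat endpoint; format="json" constrains the output
        # to valid JSON, which helps a lot at low quantization.
        resp = requests.post(
            "http://localhost:11434/api/chat",
            json={
                "model": model,
                "format": "json",
                "stream": False,
                "messages": [
                    {"role": "system", "content": SYSTEM},
                    {"role": "user", "content": user_msg},
                ],
            },
        )
        return json.loads(resp.json()["message"]["content"])["calls"]

    print(call_tools("Weather in Oslo and the current time in UTC?"))

Even then, a heavily quantized model will occasionally botch the schema, so wrapping the json.loads in a retry is worth it in practice.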