Yes! I'd be totally happy with today's sonnet 4.6 if I could run it locally.

If you can forgive the obviously-AI-generated writing, [CPUs Aren't Dead](https://seqpu.com/CPUsArentDead) makes an interesting point about AI progress: Google's latest, smallest Gemma model (Gemma 4 E2B), which can run on a cell phone, outperforms GPT-3.5-turbo. Granted, that claim rests on `MT-Bench`, a benchmark from 2023 which I assume is both fully saturated and leaked into the training data of modern LLMs. However, cross-referencing [Artificial Analysis' Intelligence Index](https://artificialanalysis.ai/models?models=gemma-4-e2b-non-...) suggests that the latest 2B open-weights models really can match or beat 175B models from 3-4 years ago. Perhaps more impressively, [Gemma 4 E4B matches or beats GPT-4o](https://artificialanalysis.ai/models?models=gemma-4-e4b%2Cge...) on many benchmarks.

If this trend continues, perhaps we'll be able to run the capabilities of today's best models locally on our laptops!

reply
The cost of intelligence is non-linear: slightly dumber models cost much less. For a growing range of problems you do not need frontier intelligence. Save frontier intelligence for situations that would otherwise require human intervention throughout the workflow, which is far more expensive than any model.
reply
Many people were hoping that Sonnet 4.6 was "Opus 4.5 quality but with Sonnet speed/cost" but unfortunately that didn't pan out.
reply
You can already see people here saying the same stuff about Opus 4.7; I saw a comment claiming that Opus 4.7 on low thinking was better than 4.6 on high.

I'm not seeing that in my testing, but these opinions are all vibe-based anyway.

reply
Efficiency doesn't make as much money. It's in the big LLM providers' best interest to keep inference computationally expensive.

I personally think the whole "the newest model is crazy! You've gotta use X (insert most expensive model)" thing is just FOMO, and marketing-prone people parroting whatever they've seen in the news or online.

reply
Does everyone need a graphing calculator? Does everyone need a scientific calculator? Does everyone need a normal calculator? Does everyone need GeoGebra or Desmos?
reply
So you're happy with an untrustworthy lazy moron prone to stupid mistakes and guesswork?

Surely you can see the first lab that solves this gains a massive advantage?

reply
I agree, and yet here I am using it... However, I think the industry IS going in multiple directions all at once, with smaller models, bigger models, etc. I need to try out Google's latest models, but alas, what can one person do in the face of so many new models...
reply