Are you talking about hosted models vs. the ones you can easily run locally? Because there are open models that require hundreds of GB of VRAM and are apparently pretty close.
on the Will It Mythos benchmark, small models are punching way above their weight(s)

gemma4-26B (#7)

qwen-3.6-27B (#9)

https://news.ycombinator.com/item?id=48640196

I've tried running qwen 3.6 locally, and it felt like LLMs from a year ago: you can get them to do some stuff, but the tasks have to be very small and you have to course-correct them so often that it's hard to say it's any faster than doing it all yourself.

Certainly the gap is closing, but I feel it still makes more sense to pay pennies to run the full-sized open models hosted on much better hardware.

I had qwen36moe revamp my PhD thesis with a rewrite using JAX. I gave it access to my old code and helped it the few times it got stuck or didn't quite understand.

Overall I was very impressed with its open box reimplementation. I remain of the mind they are widely underrated.
