undefined

points

[-]

I don’t think it’s about literally shrinking the models via quantization, but rather training smaller/more efficient models from scratch

Smaller models have gotten much more powerful the last 2 years. Qwen 3.5 is one example of this. The cost/compute requirements of running the same level intelligence is going down

by root_axis8 hours ago|

parent|

[-]

There are no practically useful small models, including Qwen 3.5. Yes, the small models of today are a lot more interesting than the small models of 2 years ago, but they remain broadly incoherent beyond demos and tinkering.

by HerbManic17 hours ago|

parent|

prev|

[-]

I have said for a while that we need a sort of big-little-big model situation.

The inputs are parsed with a large LLM. This gets passed on to a smaller hyper specific model. That outputs to a large LLM to make it readable.

Essentially you can blend two model type. Probabilistic Input > Deterministic function > Probabilistic Output. Have multiple little determainistic models that are choose for specific tasks. Now all of this is VERY easy to say, and VERY difficult to do.

But if it could be done, it would basically shrink all the models needed. Don't need a huge input/output model if it is more of an interpreter.

by kyboren17 hours ago|

parent|

prev|

[-]

Yes, but bigger models are still more capable. Models shrinking (iso-performance) just means that people will train and use more capable models with a longer context.

by sipjca15 hours ago|

parent|

[-]

Of course they are! Both are important and will be around and used for different reasons