Especially when you consider that those smaller models are really cheap and fast on platforms like OpenRouter: often 100-500x cheaper than SOTA models, and 2-5x faster in tokens per second (TPS).
Yeah, it took way too long to find that result. Being able to run on slow RAM isn't surprising, considering you can run a model off an SSD.
Right. You can also perform RSA encryption with pencil and paper and a scientific calculator. It works, but the throughput isn't useful for serious work.
I was about to ask that