upvote
Hopefully that is the case and hardware does not get impossible to get.
reply
yes, heavily constrained by sophisticated guardrails.

this is definitely where things are going. the enormous "eat the world" models have extreme diminishing returns by comparison.

reply
You clearly didn't read the recent speculative decoding papers because it's been possible to use any model to speculate for any other model for awhile. They solved the tokenization problems that prevented this in the past.
reply