I think that's because there's less local AI usage now that the big labs offer all kinds of image models, so there's no longer a rush of people self-hosting Stable Diffusion etc.

the space moved from consumer to enterprise pretty fast as models got bigger

reply
Today's free models are not really bigger once you account for MoE (with ever-increasing sparsity, meaning a smaller fraction of active parameters) and better ways of managing the KV cache. You can do useful things with very little RAM/VRAM; it just gets slower and slower the more you try to squeeze it where it doesn't quite belong. That's not a problem if you're willing to wait for every answer.
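
To make that concrete, here's a back-of-the-envelope sketch (Python, made-up numbers, not any specific model) of how sparsity shrinks the active footprint per token:

  # rough estimate of active parameters (in billions) for a hypothetical sparse MoE model
  def moe_active_params_b(total_b, num_experts, experts_per_token, always_active_frac=0.2):
      # always_active_frac: rough share of weights (attention, embeddings, shared
      # experts) that run for every token -- an assumption, it varies by architecture
      shared = total_b * always_active_frac
      expert_pool = total_b - shared
      return shared + expert_pool * experts_per_token / num_experts

  # hypothetical 100B-total model, 64 experts, 4 routed per token:
  print(moe_active_params_b(100, 64, 4))  # -> 25.0, i.e. ~25B active out of 100B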
reply
yeah, but I mean more the old setups where you'd just load a model onto a 4090 or something. Even with MoE it's a lot more complex and takes more VRAM, right? It just seems hard to justify for most hobbyists

but maybe I'm just slightly out of the loop

reply
With sparse MoE it's worth running the experts in system RAM since that allows you to transparently use mmap and inactive experts can stay on disk. Of course that's also a slowdown unless you have enough RAM for the full set, but it lets you run much larger models on smaller systems.
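
Toy illustration of the mmap idea (Python/numpy, file name and sizes are made up): the OS only pages in the expert rows you actually touch, so experts the router never picks stay on disk.

  import numpy as np

  NUM_EXPERTS, DIM = 64, 1024  # made-up sizes

  # stand-in for the weight file on disk; real runtimes (e.g. llama.cpp) mmap
  # the converted checkpoint directly instead of creating a scratch file
  weights = np.memmap("experts.bin", dtype=np.float16, mode="w+",
                      shape=(NUM_EXPERTS, DIM, DIM))

  def expert_forward(x, expert_id):
      w = weights[expert_id]   # a view into the mapping, nothing read yet
      return w @ x             # the matmul is what faults pages into RAM

  x = np.random.rand(DIM).astype(np.float16)
  top_k = [3, 17, 42, 60]      # whatever the router picked for this token
  y = sum(expert_forward(x, i) for i in top_k)  # the other 60 experts never leave disk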
reply
part of what discussion? Anyone in the AI space knows and uses HF, but the general public doesn't care, and why should they? It's just a niche site where nerds download AI stuff. HF adds a lot of value with their transformers library, their code, tutorials, smol models, etc., but how does that translate to investor dollars?
reply
It isn't necessary to be part of the discussion if you are truly adding value (which HF continues to do). It's nice to see a company doing what it does best without constantly driving the hype train.
reply