Apple could figure out a way to neatly package it into their ecosystem.
reply
Not really. Most people won't self-host.
reply
The general public will self-host if it's built into your next phone or laptop straight out of the box, or maybe available from the App Store.
reply
I agree that's what it would take, but compute would need to get very cheap for it to be feasible to keep models running locally. That's an awful lot of memory to have sitting occupied just to keep the model resident.
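As a rough back-of-the-envelope sketch (the parameter counts and quantization widths here are illustrative assumptions, not any particular model's specs):

    # Approximate RAM needed just to hold the weights resident
    # (ignoring KV cache and activations).
    def resident_memory_gb(params_billions: float, bits_per_weight: int) -> float:
        bytes_per_weight = bits_per_weight / 8
        return params_billions * 1e9 * bytes_per_weight / 1e9

    for params, bits in [(70, 4), (400, 4), (1000, 8)]:
        print(f"{params}B params @ {bits}-bit: ~{resident_memory_gb(params, bits):.0f} GB")
    # 70B @ 4-bit:   ~35 GB   -- plausible on a high-end laptop
    # 400B @ 4-bit:  ~200 GB  -- beyond any current laptop's RAM
    # 1000B @ 8-bit: ~1000 GB

Even with aggressive quantization, frontier-scale weights alone dwarf the RAM in consumer machines.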
reply
True. I was thinking more of power users. Do you think Opus-level capabilities will run on your average laptop within a year? I think that's pretty far away, if ever.
reply
You can demonstrate "running" the latest open Kimi or GLM model on a top-of-the-line laptop today at very low throughput (Kimi at ~2 tok/s, which is slow once you account for thinking time), courtesy of Flash-MoE with SSD weight offload. That's not Opus-like, it's not an "average" laptop, and it's not really usable outside niche purposes because of the low throughput. But it's impressive in its own way, and it gives a nice idea of what might be feasible down the line.
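For a rough sense of why throughput ends up that low, here's a minimal sketch assuming ~32B active parameters per token at 4-bit and a ~7 GB/s NVMe SSD (all illustrative numbers, not actual specs of Kimi or Flash-MoE):

    # Rough upper bound on tokens/s when a fraction of each token's
    # active MoE expert weights must be streamed from SSD.
    def tok_per_s_bound(active_params_b: float, bits: int, ssd_gb_s: float,
                        ram_hit_rate: float) -> float:
        active_gb = active_params_b * 1e9 * (bits / 8) / 1e9
        ssd_gb_per_token = active_gb * (1 - ram_hit_rate)
        return ssd_gb_s / ssd_gb_per_token

    # e.g. ~32B active params at 4-bit over a ~7 GB/s SSD:
    for hit in (0.0, 0.8, 0.95):
        print(f"RAM hit rate {hit:.0%}: <= {tok_per_s_bound(32, 4, 7, hit):.1f} tok/s")
    # 0%:  ~0.4 tok/s
    # 80%: ~2.2 tok/s
    # 95%: ~8.8 tok/s

Under these assumptions the SSD is the bottleneck unless most expert weights happen to be cached in RAM, which is consistent with throughput sitting around a couple of tokens per second.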
reply