Google has a family of local models too! https://ai.google.dev/gemma/docs
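If you just want to poke at one locally, here's a minimal sketch using Hugging Face transformers — assuming you've accepted the Gemma terms on Hugging Face (the repo is gated) and have transformers + accelerate installed; the model ID below is just one of the published small checkpoints:

    from transformers import pipeline

    pipe = pipeline(
        "text-generation",
        model="google/gemma-2-2b-it",  # small instruction-tuned Gemma checkpoint
        device_map="auto",             # needs `accelerate`; places on CPU if no GPU
    )

    out = pipe("Why would I run an LLM on-device?", max_new_tokens=64)
    print(out[0]["generated_text"])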
reply
Gemma and Llama can’t be bundled commercially, which sucks because they are two of the leading small LLMs. Qwen3 might be the last one with an Apache license.
reply
You can bundle and use Gemma commercially[1].

[1] https://ai.google.dev/gemma/terms

reply
I’ll have to read that, thanks.
reply
It's very convenient for Apple to do this: less spending on costly AI chips, and more excuses to ask customers to buy their latest hardware.
reply
Users have to pay for the compute somehow. Maybe by paying for models run in datacenters. Maybe by paying for hardware that's capable enough to run models locally.
reply
I can upgrade to a bigger LLM I use through an API with one click. If it runs on my device, I need to buy a new phone.
reply
I* can run the model on my device, whether or not I have an internet connection, and without needing permission from whoever controls the datacenter. I can run the model against highly private data while being certain that the private data never leaves my device.

It's a different set of trade-offs.

* Theoretically; I don't own an iPhone.

reply
Well, unless it's open source, you can't be so certain. But more certain than when processing in the cloud, that's true.
reply
If iPhones were the efficient/smart way to pay for compute, then Apple's datacenters would be built out of those instead of servers.
reply
But also: even if Apple's way works, it’s incredibly wasteful.

Server side means shared resources, shared upgrades and shared costs. The privacy aspect matters, but at what cost?

reply
Server side means an excuse to not improve model handling everywhere you can, and increasing global power usage by a noticeable percentage, at a time when we're approaching the "point of no return" on burning out the only planet we can live on.

The cost, so far, is greater.

reply
> Server side means an excuse to not improve model handling everywhere you can...

How so, if efficiency is key for datacenters to be competitive? If anything, it's the other way around.

reply
Or, instead of improving efficiency, they go ahead and just deploy more generators [0]. Stopgap measures are cheaper.

[0] https://interestingengineering.com/innovation/elon-musk-xai-...

reply
Well, if it were easier to build power stations, they'd do so.
reply
The previous commenter is right in that server-side companies have little incentive to use less energy, especially when they're backed by investors' money. Client-side AI will be bound by device capabilities and customer investment in new devices.
reply
How does running AI workloads on end user devices magically make them use less energy?
reply
More like squinting to see if it's still visible in the rear view mirror.
reply
With the wave of enshittification that's surrounding everything tech or tech-adjacent, the privacy cost is pretty high.
reply
It takes about a $400 graphics card to comfortably run something like a 3B-8B model. Comfortable as in fast inference and a good-sized context. 3B-5B models are roughly what devices can fit today. That means for us to get good local models running, we’d have to shrink one of those $400 graphics cards down into a phone.

I don’t see this happening in the next 5 years.

The Mac mini being shrunk down to phone size is probably the better bet. We’d also have to bring the power consumption requirements down by a lot. Edge hardware is a ways off.
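Rough back-of-envelope for why that's the range: the quantized weights plus the KV cache have to fit in (V)RAM. A sketch with illustrative numbers — the 32-layer / 8-KV-head / 128-dim shape below is a made-up 8B-class config, not any specific model:

    # Rough memory estimate: weights + KV cache must fit in (V)RAM.
    def weight_gb(params_billion, bits=4):
        # quantized weight footprint
        return params_billion * 1e9 * bits / 8 / 1e9

    def kv_cache_gb(layers, kv_heads, head_dim, ctx_len, bytes_per_elem=2):
        # 2x for keys and values, fp16 cache by default
        return 2 * layers * kv_heads * head_dim * ctx_len * bytes_per_elem / 1e9

    print(weight_gb(8, bits=4))           # 8B model at 4-bit -> ~4.0 GB of weights
    print(kv_cache_gb(32, 8, 128, 8192))  # hypothetical 8B-class shape, 8k ctx -> ~1.07 GB

Add runtime overhead on top and you're already past what most phones can comfortably spare, which lines up with on-device models topping out around the 3-5B range.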

reply
Gemma 3n E4B runs at 35 tk/s prompt processing and 7-8 tk/s decode on my last-last-last-gen flagship Android.
reply
I doubt this. What kind of t/s are you getting once your context window is reasonably saturated? It probably slows to a crawl, making it not good enough yet (the hardware, that is).
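For context on why it slows down: single-stream decode is mostly memory-bandwidth bound, and every generated token has to stream the weights plus the entire KV cache, so tok/s falls as the context fills. A sketch with purely illustrative numbers — the bandwidth and sizes are assumptions, not measurements of that phone:

    # Rule of thumb: decode tok/s ~= memory bandwidth / bytes read per token.
    def decode_tps(bandwidth_gb_s, weight_gb, kv_cache_gb):
        return bandwidth_gb_s / (weight_gb + kv_cache_gb)

    bw = 50.0      # GB/s, plausible phone LPDDR bandwidth (assumed)
    weights = 2.0  # GB, ~4B active params at 4-bit (assumed)
    print(decode_tps(bw, weights, 0.1))  # near-empty context -> ~24 tok/s
    print(decode_tps(bw, weights, 1.0))  # fuller context     -> ~17 tok/s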
reply
With no company having a clear lead in everyday AI for the non-technical mainstream user, there is only going to be a race to the bottom on subscription and API pricing.

Local doesn't cost the company anything, and increases the minimum hardware customers need to buy.

reply
> Local doesn't cost the company anything, [...]

Not completely true: those models are harder to develop. The logistics are a hassle.

reply