But practically, AI inference requires substantial local computing resources. It's not some web app; it needs an order of magnitude more compute.

by Zambyte 22 hours ago

Hopefully now you understand why people want smaller models.

by satvikpendem 21 hours ago

Not really. I run a production service on a basic server using these Gemma models, and the server is weaker than my MacBook. Most people's laptops and even phones can actually run local models; most people simply don't know how. Run Unsloth Studio and you'll see how easy it is.

As the sibling says, this is why people want smaller but still performant models.

by 22 hours ago

deleted