by steno13222 hours ago |

by adam_arthur22 hours ago|

The entire universe of automation projects that can be run effectively for free relative to SoTA models?

I don't think many realize that most LLM-embedded automation, pipelines, and products will soon be able to run extremely cheaply on models under 100B parameters.

Frontier models will be used for coding/creation use cases, yes. But for all the pseudo-deterministic, pipeline- and analysis-style tasks, there will be no practical benefit to running frontier models, only additional cost.

Gemma 4 26B outperforms most 100-200B models that I've tested for reasoning and structured output.

Gemma 4 12B can consistently select where to click on browser images given a minimal prompt, and do so very quickly.

by dofm22 hours ago|

The 26B model is really surprising, and it is impressively concise — it spends a lot less time dithering than Qwen3.6.

by steno13222 hours ago|

Practically, if you're running a small personal automation project, you're not going to want to waste a lot of time configuring and tuning a local model. You want to build the automation and move on.

If you're building an automation as a company, you definitely won't want to take on the long-term maintenance overhead of running your own models for some automation project.

by adam_arthur22 hours ago|

These small models exist in the cloud and are/will be priced commensurately to their size.

Your claim is effectively that companies don't care about operational/cloud costs. Even pre-LLM, companies regularly assessed and tried to pare down cloud spend.

by sowbug2 hours ago|

Whatever you're doing, try doing 500 or 1,000 of it in a batch. You'll exhaust any subscription quota you have, or if you're paying per token, you will probably find it too expensive. That's when you'll start to ask "how smart a model do I really need for this job?", and you'll investigate running a small but sufficiently capable model on your own PC, churning overnight through your 1,000 tasks.
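To put rough numbers on that, here is a minimal back-of-the-envelope sketch. Every figure in it is a made-up assumption for illustration (tokens per task, per-token prices, local generation speed), and `estimate_batch` is a hypothetical helper, not anything from a real SDK. Plug in your own numbers.

```python
from dataclasses import dataclass


@dataclass
class BatchEstimate:
    api_cost_usd: float  # estimated pay-per-token bill for the whole batch
    local_hours: float   # estimated wall-clock time on a local machine


def estimate_batch(
    num_tasks: int,
    tokens_in_per_task: int,
    tokens_out_per_task: int,
    usd_per_million_in: float,
    usd_per_million_out: float,
    local_tokens_per_second: float,
) -> BatchEstimate:
    """Compare a hosted per-token bill with a local overnight run.

    All inputs are assumptions supplied by the caller.
    """
    total_in = num_tasks * tokens_in_per_task
    total_out = num_tasks * tokens_out_per_task
    api_cost = (
        total_in * usd_per_million_in + total_out * usd_per_million_out
    ) / 1_000_000
    # Crude local estimate: counts generation time only and ignores
    # prompt processing, so real runs will take somewhat longer.
    local_hours = total_out / local_tokens_per_second / 3600
    return BatchEstimate(api_cost_usd=api_cost, local_hours=local_hours)


if __name__ == "__main__":
    # Hypothetical batch: 1,000 tasks, 8k tokens in and 1k tokens out each,
    # $3 / $15 per million input / output tokens, 30 tokens/s locally.
    est = estimate_batch(1000, 8000, 1000, 3.0, 15.0, 30.0)
    print(f"API cost: ${est.api_cost_usd:.2f}, local run: {est.local_hours:.1f} h")
```

With those assumed inputs the batch comes to about $39 on a per-token plan, or roughly nine hours of local generation, which is the "churning overnight" scenario. The interesting part is how the two numbers scale as the batch size or prompt length grows.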

by mikeocool22 hours ago|

> I've been using Claude and GPT models for years

All 3 years?

by steno13222 hours ago|

GPT-1 was released in 2018, so yes, since then.

by victorbjorklund3 hours ago|

GPT-1 was way worse than the small Gemmas are now.

by Zambyte22 hours ago|

I like using my computer.

by steno13222 hours ago|

Exactly, thank you, we are on the same page! It's great to be able to use our own devices and not have their compute coopted by a third party.

I'd rather not have intensive compute shifted onto my personal machine, which I want to use for something else.

by satvikpendem22 hours ago|

By that logic, any software you run that isn't fully built by yourself is "third party", so you shouldn't run anything at all on your machine, thus obviating the need for it entirely.

by steno13222 hours ago|

But practically, AI inference requires substantial local computing resources. It's not some web app; it needs an order of magnitude more compute.

by Zambyte22 hours ago|

Hopefully now you understand why people want smaller models.

by satvikpendem21 hours ago|

Not really. I run a production service on a basic server using these Gemma models, and the server is weaker than my MacBook. Most people's laptops and even phones can actually run local models; most people simply don't know how. Run Unsloth Studio and you'll see how easy it is.

As the sibling says, this is why people want smaller but still performant models.

by 22 hours ago|

deleted

by Zambyte22 hours ago|

I am not a "third party" on my own computer.

by user272222 hours ago|

There is tinfoil.sh as well, but honestly, running this stuff on an airgapped server gives better peace of mind that the data isn't being used for something else.

by steno13222 hours ago|

What's wrong with the data being used for something else? Someone is providing digital intelligence to us, saving us many hours a week, so the least we can do is provide them a little data so they are able to improve their service.

It would be selfish and unethical not to in my view. And ultimately the data is just being used in order to improve the models and benefit us, not for anything nefarious.

by NicuCalcea20 hours ago|

If sharing our data is the least we can do, they shouldn't also ask us for our money. Otherwise, it's more than the least.

by mannanj22 hours ago|

I don't like the gaslighting of paying Anthropic or Open(Closed)AI and being told it's unsustainable for them to take my payment, while they simultaneously take my data (edit: which is incredibly valuable) and I cannot opt out of that.

The obsession is with leaving hostile and abusive entities: the corporations, or the people who fund them, that have a horrible track record with regard to ethics, rights, respect, and human dignity.

by steno13222 hours ago|

My view is, if you're going to use the service - you should give the data.

It's like using Gmail and expecting them not to train their AI models on your data - how can you expect that when they're giving you a secure, reliable, highly functional email client completely for free?

The digital economy only works if everyone pays their fair share. If you don't want to give your data then you are really harming everyone by slowing down AI development for everyone else.

by klardotsh22 hours ago|

Because we pay for the models.

If I pay you for a service, what implicit right should you have to then continue to profit in perpetuity by storing the data I paid you to process?

If LLMs were free your Gmail analogy might hold up. They aren’t, and so it doesn’t.

AI development can continue with the data folks opt into, or with the data AI companies incessantly scrape with reckless disregard for polite system loads. AI development does not require retaining all user inputs forever.

by mannanj21 hours ago|

Apple is a good example of ethical services. They still give you privacy and ownership of your data, you keep your dignity and data. Google is a horrible model for this - it matches the whole thing about unethical, abusive, gaslighting relationships I described.

by satvikpendem13 hours ago|

The same Apple that takes 30% of all transactions? In reality no corporation is in the public interest.

by mannanj4 hours ago|

Straw man. I'm talking about data.

by mannanj21 hours ago|

However, you didn't actually capture what I meant, so you ended up inadvertently straw-manning me.

My disinterest is in sharing my intellectual property. Until now, most people have never shared this much of their intellectual property with a company. Name one earlier product in human history that got this much data and insight into human thinking, and that can now use your most intimate conversations, ideas and needs for non-training purposes.

You can't even opt out of that! At least for the training data you can opt out.

by satvikpendem13 hours ago|

Intellectual "property" is not real property. While I disagree with the parent on many things as my comments show, IP is not one of them. Information should be free, for anyone and everyone.

by mannanj4 hours ago|

Another straw man.

"Real" property or not, you agree that we have some right to our own outputs, right? Is that not dignity, to say "I want my outputs protected"?

It seems you think your ideas should be free, since you called them information. How about you back that up with action... please send me all your most intimate, valuable ideas. Oh no, you don't feel comfortable? Then why are you sharing them with companies?
