I remember the time when compiling Linux kernels was measured in hours. Then multi-core computing arrived, and after a few years it was down to 10 minutes.

With LLMs it feels more like the old punch-card days, though.

reply
At least the compiler was free
reply
The point of doing local inference with huge models stored on an SSD is to do it for free, even if slowly.
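
For illustration, a minimal sketch of what that can look like, assuming llama-cpp-python and a quantized GGUF file already on disk (the model path and parameters here are placeholders): the weights are memory-mapped straight from the SSD, so nothing beyond the hardware you already own is needed.

    # Hypothetical local-inference sketch: the GGUF file stays on the SSD
    # and is memory-mapped, so only the pages actually needed get loaded.
    from llama_cpp import Llama

    llm = Llama(
        model_path="/models/open-weight-model.Q4_K_M.gguf",  # placeholder path
        n_ctx=4096,        # context window
        n_gpu_layers=0,    # CPU-only; raise this if a local GPU is available
        use_mmap=True,     # map weights from disk instead of copying into RAM
    )

    out = llm("Write a function that reverses a string.", max_tokens=256)
    print(out["choices"][0]["text"])

Token throughput on a setup like this is low, but the cost per token is effectively zero once the hardware exists.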
reply
You are just trading opex for capex. Local GPUs aren't free.
reply
True, but this is not only a trade-off between opex and capex.

Local inference using open-weight models provides guaranteed performance that will remain stable over time and be available at any moment.

As many current HN threads show, depending on external AI inference providers is extremely risky: their performance can be degraded or their prices raised at any time, equally unpredictably.

Being dependent on a subscription for your programming workflow is a huge bet that you will gain more from the slightly higher quality of the proprietary models than you will lose if the service is degraded in the future.

As recent history has shown, many have already lost this bet.

reply