I've been dreaming of an LLM in a box, served over USB and bought off AliExpress, for a year and change now.

The LLM in a box is something you can buy today, but (1) it doesn't serve over USB by default, (2) the hardware costs $100k (not counting electricity) for 100 tps, and (3) you can't buy it from AliExpress.

Better to put that $100k in T-bills and just buy tokens, even at API prices.
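
A rough back-of-envelope sketch of that comparison, for anyone who wants to plug in their own numbers. Only the $100k and 100 tps figures come from the comment above; the T-bill yield and the utilization are placeholder assumptions, and the function names are made up for illustration. It ignores electricity, depreciation, and the fact that you still hold the principal in the T-bill case.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000


def tokens_per_year(tps: float, utilization: float = 1.0) -> float:
    """Tokens the box produces in a year at a given throughput and duty cycle."""
    return tps * SECONDS_PER_YEAR * utilization


def tbill_interest_per_year(principal: float, annual_yield: float) -> float:
    """Simple (non-compounded) yearly interest on the hardware budget."""
    return principal * annual_yield


def breakeven_usd_per_million(principal: float, annual_yield: float,
                              tps: float, utilization: float = 1.0) -> float:
    """API price (USD per million tokens) at which the yearly interest alone
    buys as many tokens as the box would generate in that year."""
    millions_of_tokens = tokens_per_year(tps, utilization) / 1_000_000
    return tbill_interest_per_year(principal, annual_yield) / millions_of_tokens


if __name__ == "__main__":
    PRINCIPAL = 100_000      # from the comment
    TPS = 100                # from the comment
    ASSUMED_YIELD = 0.04     # assumption: placeholder T-bill yield
    for utilization in (1.0, 0.1, 0.01):  # assumption: duty cycles to try
        price = breakeven_usd_per_million(PRINCIPAL, ASSUMED_YIELD, TPS, utilization)
        print(f"utilization {utilization:>5.0%}: break-even at ${price:.2f} per 1M tokens")
```

If the API price you actually pay is below the break-even for your realistic utilization, the interest alone covers your tokens; the lower your utilization, the more the T-bill route wins.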

by rescbr 4 hours ago

I understand your point (and definitely want the same), but I do have an almost-AliExpress-LLM-in-a-box: it's a Thunderbolt eGPU dock (that I got from AliE, and it is USB-C...) with an RTX 4060 Ti with 16 GB of VRAM (bought locally for gaming before the price boom).

It's been awesome for embeddings and document OCR!

3D printing a case for it is on my todo list.

by IncreasePosts 19 hours ago

How will anyone running home instances be able to compete against people paying some money to run much more powerful models on much more powerful hardware?

by Fr0styMatt88 18 hours ago

It’ll be interesting.

I’m using Qwen3.6:27B at home and mostly Sonnet/Opus (depending on the complexity of the task) at work.

You have to break things down into smaller chunks for the local models. The bigger cloud ones can do a lot of the broader thinking themselves.

by fragmede 11 hours ago

Time is money, but apparently now thinking is money as well. How much is it going to cost to think harder? If it's, say, $10 to use a bigger cloud model, it becomes easier to quantify the cost of thinking.

by Bombthecat 9 hours ago

Yeah. There will always be a gap. And it will keep growing for the next few years...

by jimbokun 16 hours ago

At some point it will be hard for us to tell the difference.