DGX Spark has roughly the compute of an RTX 5070. Its main issue is low memory bandwidth: token prefill is quite fast, but token generation is awful. Strix Halo is just slow on every metric; it used to be a cheap way to get 128GB of unified RAM (now its prices are comparable to DGX Spark).
reply
I have one, and this isn't true. The wattage of a 5070 is about 300. The entire Spark unit runs at 200 watts max. In reality it runs like an RTX 5060 with lots of VRAM. Very good for training, okay for inference if you are running batch jobs and don't mind waiting.
reply
LLMs are memory bandwidth bound not compute bound.
reply
This is incorrect, prompt processing is compute bound.
reply
LLMs are bound by both; which factor dominates depends on the hardware.
reply
This is only true for some parts of the time cost function.
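
To make the prefill-versus-generation point concrete, here is a rough roofline-style sketch. It assumes a dense model where every weight is read once per forward pass and each parameter costs about 2 FLOPs per token, and it ignores the KV cache, activations, and kernel overheads. The `Device` numbers and the model size are made-up placeholders for illustration, not measured specs of any product mentioned in this thread.

```python
from dataclasses import dataclass


@dataclass
class Device:
    peak_flops: float      # sustained FLOP/s (hypothetical value)
    mem_bandwidth: float   # bytes/s (hypothetical value)


def phase_time(params: float, bytes_per_param: float, tokens: int, dev: Device):
    """Estimate the time for one forward pass over `tokens` tokens.

    Returns (seconds, "compute" or "memory") depending on which limit dominates.
    """
    flops = 2.0 * params * tokens            # ~2 FLOPs per parameter per token
    bytes_moved = params * bytes_per_param   # weights are streamed once per pass
    compute_s = flops / dev.peak_flops
    memory_s = bytes_moved / dev.mem_bandwidth
    bound = "compute" if compute_s > memory_s else "memory"
    return max(compute_s, memory_s), bound


def crossover_tokens(bytes_per_param: float, dev: Device) -> float:
    """Tokens per pass at which compute time equals weight-streaming time."""
    return bytes_per_param * dev.peak_flops / (2.0 * dev.mem_bandwidth)


# Placeholder hardware and model: 100 TFLOP/s, 250 GB/s, 8B params at 1 byte each.
dev = Device(peak_flops=100e12, mem_bandwidth=250e9)
params, bytes_per_param = 8e9, 1.0

decode = phase_time(params, bytes_per_param, tokens=1, dev=dev)      # one new token
prefill = phase_time(params, bytes_per_param, tokens=2048, dev=dev)  # whole prompt

print("decode :", decode)
print("prefill:", prefill)
print("crossover tokens:", crossover_tokens(bytes_per_param, dev))
```

With these placeholder numbers the single-token decode step comes out memory bound and the 2048-token prefill comes out compute bound, which is the sense in which both claims above can hold at once. Batching several requests together pushes decode toward the compute-bound side in the same way.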
reply
I am working mostly with image models, so we do a lot of fine-tunes, and the card fits perfectly here. Performance isn't great, but it can just chug along in the background.
reply
I still don't see the point of running these models. I'd say they produce plausible garbage, nowhere near the quality of frontier models (when they work).

Why can't Intel look beyond this nonsense state of affairs and build something with 1TB of RAM or more?

What I am trying to say is that I have yet to see anything competitive on the market. Cards have very much stalled in the sub-100GB region, and the best corporations can do is throw out something to run toy models and forget about it after a week.

reply
What's wrong with Grace Hopper if you want to throw buckets of local memory at a problem?
reply
Most consumer platforms only allow up to 128/256GB of RAM. If you want more, you likely need a data centre platform. This is again a mismatch between where companies think consumers are and the reality.

I think AMD, for example, missed the boat with the 9950X3D2 by limiting the memory controller. If it were possible to hook it up to 1TB of consumer DDR5 RAM, that would be something to write home about.

reply
Some people, including myself, loathe Nvidia with the fiery burning passion of a thousand suns, and will put up with whatever nonsense is necessary to run without them.
reply