by skiing_crawling 20 hours ago

by gerdesj 19 hours ago

An Nvidia Spark thingie has 128GB unified RAM. They also have a dual port version of one of these things: https://www.nvidia.com/content/dam/en-zz/Solutions/networkin.... i.e. 2 x 100GB/s ports; they may even be 2 x 200GB/s. Once I've got my paws on one, I'll know more.

You can cluster these beasts too. Two and three (with two IP subnets) is fairly obvious. Four or more might need a switch depending on how much network latency affects things.

Apple seem to have forgotten about M series with gobs of RAM. I can't get the Apple shop to show more than 96GB of unified RAM and that costs a kidney.

by mapontosevenths 19 hours ago

I have one, and I love it. That said, my buddy's Mac smokes it for inference workloads in terms of tokens per second AND it's more usable for other things.

If you are training and doing research it's great, and if you want to cluster them it can't be beat, but if you just want local inference on a single box, buy a Mac or even a Strix Halo device.

by colinsane 17 hours ago

can those macs boot linux? i've heard about Asahi but have no idea how far along they are. i've got my fleet configured with nix and sure, nix can target darwin, but there's a _lot_ of sharp edges there: i don't really want to pull that thread unless i have to...

by mapontosevenths 17 hours ago

I don't know. I think he just uses LMStudio most of the time on his, but that's one place I can say the spark really shines for me.

I'm a Linux guy, but I also don't always have a lot of time. The Spark comes out of the box with a nice Linux distro that's pre-configured to be easy to set up, and the guides and online resources make getting up and running trivial, even for some complex tasks. You would have to do a LOT of tinkering just to figure out some of the things the Nvidia resources walk you through natively. They have guides for a ton of stuff that include the optimal settings, so you don't have to figure it all out through trial and error.

Check out these "playbooks" for some examples. [0] There's a lot to be said for not having to piece all that together yourself.

[0] https://build.nvidia.com/spark

I think going from unboxing mine, to setting it up to run headless, to generating tokens took about 20 minutes total for me.

by theYipster 2 hours ago

Not the new ones. Only the M1 and M2 have good support for Asahi. But you really don't need it. If you need Linux, use a VM (UTM is free and is equivalent to KVM/QEMU in speed, despite being a Type-2 Hypervisor.)

by Fizz43 18 hours ago

which mac is smoking the spark?

by theYipster 1 hour ago

Mine, for one. M5 Max MacBook Pro 128GB with a 4TB SSD. $5100 after a $1000 discount at Microcenter. Great deal if you can find it in stock.

by pmarreck 18 hours ago

pretty much any of them, dude, as long as you have enough RAM, since it uses unified RAM and a powerful SoC CPU/GPU. Literally any M-class model, but the M5 is currently top tier.

by dannyw 16 hours ago

The DGX Spark has basically the same memory bandwidth as an M5 Pro, and far more than an M5.

Only the M3 Ultra really beats it, and once you start scoping out the cost of an M3 Ultra with 128GB or 256GB, the DGX Spark doesn’t look bad after all.

by entrope 6 hours ago

> The DGX Spark has basically the same memory bandwidth as an M5 Pro, and far more than an M5.

I see ~274 GB/sec for the DGX Spark[1], versus 307 GB/sec for M5 Pro and 460 or 614 GB/sec for M5 Max[2]. One might call 90% "basically the same", but there are nominally two tiers above "Pro".

Yes, a MacBook Pro with 128 GB and M5 Max costs $5100 (14") or $5400 (16") versus currently $4700 for the DGX Spark, but the MBP includes keyboard, mouse, battery and portability. I believe its prefill is slower and you get 2 TB vs 4 TB SSD, but overall one gives up a lot to save 10% of the cost.

[1] https://docs.nvidia.com/dgx/dgx-spark/hardware.html
[2] https://support.apple.com/en-us/126319
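
For anyone who wants to check the ratios in the comment above, here is a quick sketch in Python. It uses only the figures quoted in that comment (274, 307, and 614 GB/sec; $4700, $5100, and $5400). The function and variable names are made up for illustration.

```python
def ratio_percent(a: float, b: float) -> float:
    """Return a expressed as a percentage of b."""
    return 100.0 * a / b

# Memory bandwidth figures quoted in the comment, in GB/sec.
SPARK_BW, M5_PRO_BW, M5_MAX_BW = 274.0, 307.0, 614.0
# Prices quoted in the comment, in US dollars.
SPARK_PRICE, MBP14_PRICE, MBP16_PRICE = 4700.0, 5100.0, 5400.0

bw_vs_pro = ratio_percent(SPARK_BW, M5_PRO_BW)
bw_vs_max = ratio_percent(SPARK_BW, M5_MAX_BW)
saving_vs_14 = 100.0 - ratio_percent(SPARK_PRICE, MBP14_PRICE)
saving_vs_16 = 100.0 - ratio_percent(SPARK_PRICE, MBP16_PRICE)

print(f"Spark bandwidth vs M5 Pro: {bw_vs_pro:.1f}%")
print(f"Spark bandwidth vs top M5 Max: {bw_vs_max:.1f}%")
print(f"Price saving vs 14-inch MBP: {saving_vs_14:.1f}%")
print(f"Price saving vs 16-inch MBP: {saving_vs_16:.1f}%")
```

On these numbers the Spark has roughly 89% of the M5 Pro's bandwidth and under half of the top M5 Max figure, while costing roughly 8% to 13% less than the two MacBook Pro configurations.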

by pmarreck 6 hours ago

I looked, but a sibling comment just provided the links. ~274 GB/sec for the DGX Spark, vs. 307 GB/sec for M5 Pro, and max 614 GB/sec (!!!) for M5 Max? Why would you completely friggin’ lie about this, or at minimum, not double-check your facts before bullshitting? Plus, you get a full-fledged computer along with it!

Apple could actually be a good deal and you folks would still make up something to not justify it. In a way, it’s amazing what Apple has accomplished: a baseless, negatively tainted perception in certain influential tech circles.

(To be fair, they’re kind of earning it. I’m glad Tim “Sweet T” Cook is departing.)

Plus, my original comment got downvoted despite being factually-correct. Thanks, Reddit. Oh, wait…

by mapontosevenths 17 hours ago

Yep. Memory bandwidth is what decides how fast LLMs generate tokens (mostly). The DGX Spark has something like 270 GB/s of memory bandwidth, and the M5 Max is ~615 GB/s. Theoretically DOUBLE the speed. In practice he only generates like 25% more tok/s, but that's still very impressive.

The spark can fine tune models in 1/4 the time and excels at other compute tasks in ways that Mac never can. Plus the high bandwidth ConnectX-7 ports would be like $1700 to buy on a card just for the network adapters... But for generating tokens, it just plain loses.
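
The bandwidth argument above can be turned into a back-of-the-envelope estimate. This is a rough sketch, not a benchmark. It assumes a dense model whose weights are all read from memory once per generated token, and the 35 GB model size (roughly a 70B-parameter model at 4 bits per weight) is a hypothetical chosen for illustration. The bandwidth figures are the ones quoted in the comment.

```python
def decode_ceiling_tok_s(bandwidth_gb_s: float, model_gb: float) -> float:
    """Upper bound on tokens/s if each token requires reading all weights once."""
    return bandwidth_gb_s / model_gb

MODEL_GB = 35.0  # hypothetical: ~70B parameters at 4 bits per weight

spark = decode_ceiling_tok_s(270.0, MODEL_GB)  # DGX Spark figure from the comment
mac = decode_ceiling_tok_s(615.0, MODEL_GB)    # Mac figure from the comment

print(f"DGX Spark ceiling: ~{spark:.1f} tok/s")
print(f"Mac ceiling:       ~{mac:.1f} tok/s")
print(f"Theoretical ratio: {mac / spark:.2f}x")
```

The ratio comes out a bit over 2x, which matches the "theoretically DOUBLE" remark. The much smaller measured gap (about 25%) suggests that compute, software stack, and other overheads matter as well, so treat this as a ceiling only.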

by fsuts 15 hours ago

How noisy does his fan get…

by pmarreck 6 hours ago

it doesn’t get noisy at all

by justincormack 9 hours ago

It is 2x200Gb/s physically, but the PCIe bandwidth is basically only 200Gb/s, so it may as well be one. It's actually a weird 2xPCIe4, not 1xPCIe8, so it appears in software as dual 100Gb/s. It's a bit odd.

by jauntywundrkind 18 hours ago

200 Gb / s (not GB/s)!

(Still potentially very useful! But not magically ultra fast.)
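
To make the bits-versus-bytes point concrete, here is a minimal conversion sketch. It assumes decimal units (1 byte = 8 bits) and ignores protocol overhead. The ~273 GB/s memory bandwidth figure is the one quoted elsewhere in this thread, and the function name is made up for illustration.

```python
def gbit_to_gbyte(gbit_per_s: float) -> float:
    """Convert gigabits per second to gigabytes per second (8 bits per byte)."""
    return gbit_per_s / 8.0

port_gb_s = gbit_to_gbyte(200.0)  # one 200 Gb/s port
memory_gb_s = 273.0               # DGX Spark memory bandwidth quoted in the thread

print(f"200 Gb/s = {port_gb_s:.1f} GB/s")
print(f"Local memory is ~{memory_gb_s / port_gb_s:.1f}x faster than one port")
```

So a 200 Gb/s link moves about 25 GB/s at best, roughly an order of magnitude below local memory bandwidth.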

by Computer0 19 hours ago

128 GB of much slower RAM than Apple's.

by dannyw 16 hours ago

DGX Spark is ~273GB/s. That’s about M5 Pro territory, and twice as fast as the M5. You’d have to go to the M5 Max, or M3 Ultra, to get higher memory bandwidth than the Spark.

by hajile 4 hours ago

If you are trying to get more than 64GB of RAM or doing tons of inference, you're getting a Max or Ultra anyway.
