This is an RTX 4080.
“In the more common situations of reducing PCI-e bandwidth to PCI-e 4.0 x8 from 4.0 x16, there was little change in content creation performance: There was only an average decrease in scores of 3% for Video Editing and motion graphics. In more extreme situations (such as running at 4.0 x4 / 3.0 x8), this changed to an average performance reduction of 10%.”
Still, a 10% difference is considerable, almost a gen-to-gen difference.
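For a sense of scale, here is a rough sketch of the theoretical bandwidth behind the link configurations in that quote. It assumes the standard per-lane rates (8 GT/s for PCIe 3.0, 16 GT/s for 4.0) and 128b/130b encoding, and it ignores protocol overhead, so real throughput will be lower. The function name is just for illustration.

```python
# Per-lane raw transfer rate in GT/s for each PCIe generation.
GT_PER_S = {3: 8.0, 4: 16.0, 5: 32.0}
ENCODING = 128 / 130  # 128b/130b line encoding used by PCIe 3.0 and later


def pcie_gb_per_s(gen: int, lanes: int) -> float:
    """Theoretical one-direction bandwidth in GB/s, before protocol overhead."""
    return GT_PER_S[gen] * ENCODING / 8 * lanes


if __name__ == "__main__":
    # The configurations mentioned in the quoted benchmark.
    for gen, lanes in [(4, 16), (4, 8), (4, 4), (3, 8)]:
        print(f"PCIe {gen}.0 x{lanes}: {pcie_gb_per_s(gen, lanes):.1f} GB/s")
```

By this estimate 4.0 x4 and 3.0 x8 come out identical, which is presumably why the quote groups them together: both are a quarter of the 4.0 x16 figure.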
…so what do you actually need PCIe for?
Thunderbolt is also too slow for higher-end networks. A single port is already insufficient for 100-gigabit speeds.
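To put numbers on that, a minimal sketch, assuming the nominal link rates of 40 Gb/s for Thunderbolt 4 and 80 Gb/s for Thunderbolt 5 in symmetric mode. Those figures are not from the comment above, and usable network throughput over either link will be lower than the nominal rate.

```python
import math

# Nominal link rates in Gb/s (assumed here; usable throughput is lower).
LINK_GBPS = {"Thunderbolt 4": 40, "Thunderbolt 5": 80}
TARGET_GBPS = 100  # a 100-gigabit network link


def ports_needed(link_gbps: float, target_gbps: float = TARGET_GBPS) -> int:
    """Minimum number of ports whose combined nominal rate reaches the target."""
    return math.ceil(target_gbps / link_gbps)


if __name__ == "__main__":
    for name, rate in LINK_GBPS.items():
        print(f"{name}: {rate} Gb/s, needs {ports_needed(rate)} ports for {TARGET_GBPS} Gb/s")
```

Even on paper, no single port reaches 100 Gb/s, and aggregating several ports adds its own overhead.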
Apple recently added support for InfiniBand over Thunderbolt. And now almost all decent Mac Studio configurations have sold out. Those two may be connected.
TIL:
* https://developer.apple.com/documentation/technotes/tn3205-l...
Or maybe I forgot:
If you use dual-port NICs, you do not need a high-speed switch, which can be expensive. Instead, you can connect the computers directly to one another to form a network, and configure them as either Ethernet bridges or IP routers.
I suppose that splitting an LLM workload is pretty sensitive to that.
Multi-GPU has recently experienced a resurgence due to the discovery of new workloads with broader appeal (LLMs), but that's too new to have significantly influenced hardware architectures, and LLM inference isn't the most natural thing to scale across many GPUs. Everybody's still competing with more or less the architectures they had on hand when LLMs arrived, with new low-precision matrix math units squeezed in wherever room can be made. It's not at all clear yet what the long-term outcome will be in terms of the balance between local and cloud compute for inference, whether there will be any local training/fine-tuning at all, and which use cases will ultimately be profitable. All of that influences whether it would be worthwhile for Apple to abandon its current client-first architecture, which standardizes on a single integrated GPU and omits/rejects the complexity of multi-GPU setups.
I/O expansion
Networking