Unfortunately, they're heavily bottlenecked by memory bandwidth. The Macs are on another level. If Apple decides to release dedicated AI hardware, it would dominate the consumer AI space.
Ah yeah, after watching one of the creator's YouTube videos, I realized these benchmarks combine prefill and decode, which isn't very helpful. It seems this struggles with the exact same bottleneck as every Strix Halo setup: memory bandwidth. It still looks significantly slower than Mac hardware with an equivalent amount of memory.
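To put rough numbers on why a combined prefill+decode figure hides the bottleneck, here is a minimal sketch. It assumes a dense model where every generated token streams all the weights from memory once (MoE models and caching change this), and every number in it is hypothetical, not taken from the benchmarks being discussed. The function names are made up for illustration.

```python
def decode_ceiling_tps(bandwidth_gb_s: float, weights_gb: float) -> float:
    """Rough upper bound on decode tokens/s if each token streams all weights once."""
    return bandwidth_gb_s / weights_gb


def blended_tps(prompt_tokens: int, gen_tokens: int,
                prefill_tps: float, decode_tps: float) -> float:
    """Single tokens/s figure that lumps prefill and decode together."""
    total_time = prompt_tokens / prefill_tps + gen_tokens / decode_tps
    return (prompt_tokens + gen_tokens) / total_time


# Hypothetical: 256 GB/s of bandwidth, 32 GB of weights.
print(decode_ceiling_tps(256.0, 32.0))  # 8.0

# Hypothetical run: 2000 prompt tokens at 500 tok/s, 200 generated at 10 tok/s.
print(round(blended_tps(2000, 200, 500.0, 10.0), 1))  # 91.7
```

In that made-up run the blended figure is about 9x the actual decode speed, which is why separate prefill and decode numbers are more useful for comparing machines.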
Apple silicon chips beat basically everything on memory bandwidth. They have the highest number of memory controllers (i.e. channels) for a given capacity. That's the main party trick.
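For anyone wondering how channel count turns into bandwidth, a quick back-of-the-envelope sketch: theoretical peak is just bus width times transfer rate. The bus widths and memory speeds below are my assumptions based on commonly cited specs (a 256-bit LPDDR5X-8000 config for Strix Halo, 512-bit and 1024-bit configs for the bigger Apple chips), so check the vendor spec sheets before relying on them. Real-world throughput will come in below these peaks.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, transfer_rate_mt_s: float) -> float:
    """Theoretical peak bandwidth in GB/s: bytes per transfer times transfers per second."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfer_rate_mt_s / 1000


# Assumed configurations, for illustration only.
configs = {
    "256-bit @ 8000 MT/s": (256, 8000),
    "512-bit @ 8533 MT/s": (512, 8533),
    "1024-bit @ 6400 MT/s": (1024, 6400),
}

for name, (bits, rate) in configs.items():
    print(f"{name}: {peak_bandwidth_gb_s(bits, rate):.1f} GB/s")
```

At the same memory speed, doubling the bus width doubles the peak, which is the whole trick with a wider memory interface.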