i am curious: is the performance gap between x86 cpu inference and apple silicon, or is it for what is imho a more apples-to-apples comparison, e.g., amd strix halo vs apple silicon?
i would expect "pure" cpu inference to be behind, but an approach like strix halo or dgx spark to be much closer?