You don't. What they're saying is that today's small models (the ones that fit on consumer hw) are better than yesteryear's top models. GPT4 was reportedly an 8x 220B (~1.76T) MoE, and today you can run a 30-120B model that beats it handily in real-world tasks.
Similarly for 4-20B models beating GPT3 (175B) and so on.
There is a sweet spot of "good enough" that the small models can reach, where you get equivalent tasks solved fully locally. They'll never touch current SotA, but they'll reach the SotA of 2-4 years ago, which, depending on the task you need done, can be "good enough".
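
To put rough numbers on "fits on consumer hw", here's a back-of-the-envelope sketch. It assumes weights dominate memory and ignores KV cache, activations and runtime overhead, so real usage will be higher. The 4-bit quantization column and the `weight_memory_gb` helper are my own assumptions for illustration, not anything from the models' vendors.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GB (10^9 bytes)."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# Parameter counts taken from the sizes mentioned above (in billions).
# The 8x220B figure is the reported/rumored one, not a confirmed spec.
models = [
    ("8x220B MoE (reported)", 8 * 220),
    ("175B", 175),
    ("120B", 120),
    ("30B", 30),
    ("4B", 4),
]

for name, params_b in models:
    fp16 = weight_memory_gb(params_b, 16)  # unquantized 16-bit weights
    q4 = weight_memory_gb(params_b, 4)     # assumed 4-bit quantization
    print(f"{name}: ~{fp16:.0f} GB at 16-bit, ~{q4:.0f} GB at 4-bit")
```

By this estimate a 30B model at 4-bit needs about 15 GB for weights, which is within reach of a single high-end consumer GPU, while the reported 8x220B total would need several TB at 16-bit.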