"The future" being "whenever training and inference at increased scale become economical", which is probably bounded by new generations of hardware, but might also be brought forward by algorithmic advances.
The likes of Mythos show that the scaling laws are real: you can 5x the total params and 2x the active params and still get meaningful gains. If "inference per param" gets cheaper, you up the params and get more intelligence for the same price.
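
To make the "same price" arithmetic concrete, here is a rough sketch. It assumes per-token inference cost scales linearly with active params, which is a simplification of mine and not something measured. The function name and the unit values are hypothetical; only the ratio matters.

```python
def affordable_active_params(budget_per_token: float, cost_per_param: float) -> float:
    """Active params you can serve at a fixed per-token budget.

    Assumption: cost per token = cost_per_param * active_params (linear model).
    """
    if cost_per_param <= 0:
        raise ValueError("cost_per_param must be positive")
    return budget_per_token / cost_per_param


# Hypothetical units: the budget stays fixed, cost per param halves.
before = affordable_active_params(budget_per_token=1.0, cost_per_param=1.0)
after = affordable_active_params(budget_per_token=1.0, cost_per_param=0.5)
print(after / before)  # 2.0
```

Under that assumption, halving the cost per param lets you serve twice the active params for the same price. Whether the extra params turn into proportionally more capability is a separate question that this sketch does not answer.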