Yes, the variants are typically 2-3x worse...
Same with speculative decoding... They all do something, but there are known techniques that are substantially better - they just weren't known when development of the previous models started.
MTP will still be highly valuable for interactive use of course.