upvote
I think the story is a bit more complicated. Core succeeded precisely because Intel had both the low-power experience with Pentium-M and the high-power experience with Netburst. The P4 architecture told them a lot about what was and wasn't viable and at what complexity. When you look at the successor generations from Core, what you see are a lot of more complex P4-like features being re-added, but with the benefits of improved microarch and fab processes. Obviously we will never know, but I don't think you would get to Haswell or Skylake in the form they were without the learning experience of the P4.

In comparison, I think Arm is actually a very strong cautionary tale that focusing on power will not get you to performance. Arm processors remained pretty poor performance until designers from other CPU families entirely (PowerPC and Intel) took it on at Apple and basically dragged Arm to the performance level they are today.

reply
And not just any PowerPC architects either, but the people from PA Semi. Motorola couldn't get the speed up and IBM couldn't get the power down.
reply
I don’t have a micro architecture background so I apologize if this is obvious — What do power and speed mean in this context?
reply
Power - how many Watts does it need? Speed - how quickly can it perform operations?
reply
You can get low power with a simple design at a low clock. This definitely will not help achieve high performance later.
reply
Clock rate isn't the only factor. A design can be power hungry at a low clock rate if designed badly, and if it it is... you're never getting that think running fast.
reply
One could say "Optimize for efficiency first, then performance".
reply
Core evolved from the Banis (Centrino) CPU core which was based on P3, not P4. Banias used the front-side bus from P4 but not the cores.

Banias was hyper optimized for power, the mantra was to get done quickly and go to sleep to save power. Somewhere along the line someone said "hey what happens if we don't go to sleep?" and Core was born.

reply
Parallels to code design, where optimizing data or code size can end up having fantastic performance benefits (sometimes).
reply