You are missing a major detail: integrated GPUs are crap. They win on efficiency but not on raw compute. Before AI (and crypto too, I guess), people bought GPUs to render graphics, and that was their main use. Games kept getting more demanding and required increasingly powerful GPUs to render well. Gaming systems always had a discrete GPU, so there was no reason to scale up integrated GPUs: they wouldn't sell, or they would be a waste of die space.
I don't think the M1 was specifically focused on inference. Apple's goal was to replace Intel/AMD/Nvidia with its own chips, and since some previous Macs shipped with discrete GPUs, the new chips had to match or beat those so Apple wouldn't ship something slower.