I am willing to hear arguments for other approaches.
-mtune says "generate code that is optimised for this architecture" but it doesn't trigger arch specific features.
Whether these are right or not depends on what you are doing. If you are building gentoo on your laptop you should absolutely -mtune=native and -march=native. That's the whole point: you get the most optimised code you can for your hardware.
If you are shipping code for a wide variety of architectures and crucially the method of shipping is binary form then you want to think more about what you might want to support. You could do either: if you're shipping standard software pick a reasonable baseline (check what your distribution uses in its cflags). If however you're shipping compute-intensive software perhaps you load a shared object per CPU family or build your engine in place for best performance. The Intel compiler quite famously optimised per family, included all the copies in the output and selected the worst one on AMD ;) (https://medium.com/codex/fixing-intel-compilers-unfair-cpu-d...)
I’ve never heard of anyone doing that.
If you use a cloud provider and use a remote development environment (VSCode remote/Jetbrains Gateway) then you’re wrong: cloud providers swap out the CPUs without telling you and can sell newer CPUs at older prices if theres less demand for the newer CPUs; you can’t rely on that.
To take an old naming convention, even an E3-Xeon CPU is not equivalent to an E5 of the same generation. I’m willing to bet it mostly works but your claim “I build on the exact hardware I ship on” is much more strict.
The majority of people I know use either laptops or workstations with Xeon workstation or Threadripper CPUs— but when deployed it will be a Xeon scalable datacenter CPU or an Epyc.
Hell, I work in gamedev and we cross compile basically everything for consoles.
it certainly has scale issues when you need to support larger deployments.
[P.S.: the way I understand the words, "shipping" means "passing it off to someone else, likely across org boundaries" whereas what you're doing I'd call "deploying"]