There's a reason, for example, why the linux distros all target a generic x86 architecture rather than a specific architecture.
https://discourse.ubuntu.com/t/introducing-architecture-vari...
However, other applications which must do cryptographic operations, audio/video processing, scientific/technical/engineering computing, etc. may have wildly different performances when compiled for different x86-64 ISA versions, for which dedicated assembly-language functions exist.
Java, interestingly enough, is somewhat leading the way here with their Vector API. I think they actually have one of the better setups for allowing someone to write fast code that is platform independent.
C++ is also diving into this realm. 26 just merged in now SIMD instructions.
That is the bulk of the benefit of diving down into assembly.
Most of the applications whose performance matters for me, because I must wait a non-negligible time for them to do their job, are dependent on assembly implementation for certain functions invoked inside critical loops. I do not see any sign of replacements for them. On the contrary, Intel, AMD and Arm continue to introduce special instructions that are useful in certain niche applications and taking advantage of them will require additional assembly language functions, not less.
For me, there is only one application that I use and which consumes non-negligible computer time and which does not depend on SIMD optimizations, which is the compilation of software projects.