A few months ago I used Whisper from OpenAI, an automatic speech recognition system released in 2022, on my modern 20-core Intel CPU to convert audio from a video file to text. It worked fine. Took a while, though, and the machine got hot and the fans kicked in. I then found Intel's optimized version of Whisper that used the NPU. It required a lot more steps to get working, but in the end it did work and was about 6x faster. And the machine remained cool and silent in the process. Since then I have become a fan of NPUs. They are no NVIDIA GeForce RTX 5090, but they are significantly better than a modern CPU.
A Claude 4.6 they are most certainly not, but if you get through the janky AF software ecosystem they can run small LLMs reasonably well with basically zero CPU/GPU usage.
I was under the impression that they were primarily designed for low power use.