Maybe you have seen NPU support via FLM already: https://lemonade-server.ai/flm_npu_linux.html

"FastFlowLM (FLM) support in Lemonade is in Early Access. FLM is free for non-commercial use, however note that commercial licensing terms apply."

The NPU works on Linux (Arch at least) on Strix Halo using FastFlowLM [1]. Their NPU kernels are proprietary though (free up to a reasonable amount of commercial revenue). It's neat that you can run some models basically for free (on the NPU instead of the CPU/GPU), but the performance is underwhelming. NPUs are really aimed at low-power devices, and they aren't much use if you already have an APU/GPU like Strix Halo.

[1]: https://github.com/FastFlowLM/FastFlowLM

I thought NPU support had been available since something like kernel 6.12?