There are some perplexity comparison numbers for the previous gen, the Orange Pi 5, in the link below.
Bit of a mixed bag, but it doesn't seem catastrophic across the board. Some models show minimal perplexity loss at Q8...
https://github.com/invisiofficial/rk-llama.cpp/blob/rknpu2/g...
I couldn't imagine recommending any of these boards to people who aren't already SBC tinkerers.
Is this a thing? I read an article about how, due to some implementation detail of GPUs, you don't actually get deterministic outputs even at temp 0.
But I don't understand that, and haven't experimented with it myself.
The main difference comes from the rounding order in parallel reductions: floating-point addition isn't associative, so summing the same partial results in a different order can round to a slightly different value.
It does only make a small difference. Unless you have an unstable floating-point algorithm, but if you have an unstable floating-point algorithm on a GPU at low precision, you were doomed from the start.
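A minimal Python sketch of the point above: floating-point addition isn't associative, so the order a reduction runs in changes the result in the last few bits. The `tree_sum` helper here is a hypothetical stand-in for how a parallel sum might combine per-block partial results; it is not actual GPU code.

```python
import random

# Classic two-number demo: same values, different grouping, different result.
a = (0.1 + 0.2) + 0.3   # 0.6000000000000001
b = 0.1 + (0.2 + 0.3)   # 0.6
print(a == b)           # False

def tree_sum(xs):
    """Pairwise (tree-shaped) reduction, loosely mimicking a parallel sum."""
    xs = list(xs)
    while len(xs) > 1:
        paired = [xs[i] + xs[i + 1] for i in range(0, len(xs) - 1, 2)]
        if len(xs) % 2:          # odd element carries over to the next round
            paired.append(xs[-1])
        xs = paired
    return xs[0]

random.seed(0)
vals = [random.uniform(-1.0, 1.0) for _ in range(100_000)]

sequential = sum(vals)      # left-to-right accumulation
pairwise = tree_sum(vals)   # tree-shaped accumulation

# Both are valid sums of the same numbers, but they typically disagree
# in the low-order bits; the relative error stays tiny for stable sums.
print(sequential, pairwise, abs(sequential - pairwise))
```

That last-few-bits difference is harmless for a single sum, but in an LLM it feeds into a softmax and an argmax, so when two logits are nearly tied the sampled token can flip even at temp 0.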