There was this ROCm bug I was watching for a while: https://github.com/ROCm/ROCm/issues/5706 - It's about the GPU clock staying at max frequency, which can in turn drive the fan speed up.
It doesn't happen with Vulkan backends, so that is what I have been using for my two dual R9700 hosts.
EDIT: The bug is closed but there were mentions of the issue still occurring after closure, so who knows if it is really fixed yet.
Yup, I suppose these smaller dense models are in the lead for fast inference on consumer dGPUs (or iGPUs, depending on total RAM) with just enough VRAM to hold the full model and context. That won't give you anywhere near SOTA results compared to larger MoE models with a similar number of active parameters, but it will be quite fast.
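For a rough sense of what "just enough VRAM" means, here's a back-of-envelope sketch. Everything in it is an assumption for illustration: the layer count, KV head count, head size, bits per weight, the 1 GiB overhead and the 32 GiB budget are placeholder numbers, not the specs of any particular model or card. It also assumes plain full attention with an fp16 KV cache, so sliding-window or quantized caches would come out smaller.

```python
GIB = 2**30


def weight_bytes(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights at a given quantization level."""
    return n_params * bits_per_weight / 8


def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   context_len: int, bytes_per_elem: int = 2) -> int:
    """Approximate KV cache size: one key and one value vector per layer,
    per KV head, per token (the leading 2 covers keys plus values)."""
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem


def fits_in_vram(vram_gib: float, n_params: float, bits_per_weight: float,
                 n_layers: int, n_kv_heads: int, head_dim: int,
                 context_len: int, overhead_gib: float = 1.0) -> bool:
    """True if weights + KV cache + a fixed overhead guess fit in the budget."""
    total = (weight_bytes(n_params, bits_per_weight)
             + kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len)
             + overhead_gib * GIB)
    return total <= vram_gib * GIB


if __name__ == "__main__":
    # Hypothetical 27B dense model, ~4.5 bits/weight, 32k context.
    shape = dict(n_layers=64, n_kv_heads=8, head_dim=128, context_len=32768)
    print("weights GiB:", weight_bytes(27e9, 4.5) / GIB)
    print("KV cache GiB:", kv_cache_bytes(**shape) / GIB)
    print("fits in 32 GiB:", fits_in_vram(32, 27e9, 4.5, **shape))
```

Swap in the real numbers from your model's config and your card's VRAM. The point is just that context length eats into the budget about as much as the quantization choice does.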
I have 2x ASRock R9700. One of them was noticeably noisier than the other and eventually developed an annoying vibration in the middle of its fan curve. ASRock replaced it under RMA.
How is your experience with dual cards? Is a dense 27B model the best you can run on this setup? What about other applications, e.g. diffusion or fine-tuning?