I don't write a lot of super low-level stuff, so maybe things are different there, but at least at the normal user-space level I've found it pretty rare that explicit mutexes ever beat the performance of an (in my opinion) easier design using queues and/or something like ZeroMQ.
Generally I've found that the penalty of the queue-based approach, even without contention, is pretty minimal, and it almost always wins under contention.
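To make the queue-based shape concrete, here's a rough Python sketch of what I mean. The names (`run_counter`, `inbox`, the `None` sentinel) are just made up for illustration, and it uses the standard library `queue.Queue` rather than ZeroMQ. One thread owns the state and everyone else sends it messages, so there's no explicit mutex in my own code (the queue does its own locking internally).

```python
import queue
import threading


def run_counter(num_producers=4, increments_each=1000):
    """Sum increments from several producers without an explicit mutex."""
    # The queue is the only object shared between threads.
    inbox = queue.Queue()
    result = {}

    def owner():
        # Only this thread ever touches `total`, so no lock is needed around it.
        total = 0
        while True:
            msg = inbox.get()
            if msg is None:  # hypothetical sentinel meaning "no more work"
                break
            total += msg
        result["total"] = total

    def producer():
        # Producers never see the state; they only send messages.
        for _ in range(increments_each):
            inbox.put(1)

    owner_thread = threading.Thread(target=owner)
    owner_thread.start()

    producers = [threading.Thread(target=producer) for _ in range(num_producers)]
    for t in producers:
        t.start()
    for t in producers:
        t.join()

    # All producers are done, so the sentinel is the last message in the queue.
    inbox.put(None)
    owner_thread.join()
    return result["total"]
```

The mutex version would have every producer take a lock around `total += 1`. Here the ordering and ownership questions mostly go away because only one thread can ever modify the state.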