Dedicated inference ASICs are a dead end. You can't reprogram them, you can't fine-tune them, and they won't keep any of their resale value. Outside of cruise missiles, it's hard to imagine where such a disposable technology would be desirable.
For a 2.5 kW server? I don't see it happening. Your money and electricity are better spent on CUDA compute.