There's also lowering the number of experts activated per token in MoE models, which cuts per-token compute, usually at some cost in output quality.
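To make that concrete, here is a rough sketch of top-k expert routing where `k` is the knob being turned down. Everything in it is an assumption for illustration: `route_top_k`, the logits, and the choice to renormalize the softmax over only the selected experts are hypothetical and not taken from any particular model or runtime (some models apply softmax over all experts before selecting).

```python
import math

def route_top_k(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Hypothetical helper: real MoE implementations differ in how they normalize.
    """
    # Indices of the k largest router logits, highest first.
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    # Softmax over only the selected logits so the weights sum to 1.
    m = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - m) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

# Made-up router scores for one token across 8 experts.
logits = [0.1, 2.0, -1.0, 1.5, 0.3, 0.0, -0.5, 0.9]

default_route = route_top_k(logits, k=2)  # two expert FFNs run for this token
reduced_route = route_top_k(logits, k=1)  # only one expert FFN runs
```

With `k=1` only one expert's feed-forward block has to be evaluated for the token instead of two, which is where the speedup would come from. The model was presumably trained with its default `k`, so quality can drop when you lower it.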