undefined

points

[-]

Also it's not just about running an obviously worse quant.

Running different GPU kernels / inference engines also matters. It's easy to write an implementation that is faster and thus cheaper but numerically much noisier / less accurate.

by jychang3 hours ago|

prev|

[-]

Yeah, the threat model is nonexistent. Most people use a dozen or so well known providers, who have no incentives to so obviously cheat.