by lumost 7 hours ago | comments
by irthomasthomas 5 hours ago
I don't think that's plausible, because they also just launched a high-speed variant which presumably has the inference optimizations and smaller batching, and costs about 10x as much.

Also, if you have inference optimizations, why not apply them to all models?