You’re quoting the batch pricing. On-demand is $1.50 per M input tokens and $9 per M output tokens. That puts a Flash-tier model at effectively comparable cost to Gemini 2.5 Pro.
reply
I think you have your pricing wrong there: Gemini 3.5 Flash is $1.50 input and $9 output.
reply
Okay, so it's somewhere between Haiku and Sonnet level pricing, at somewhere between Sonnet and Opus level performance. It's a great option. I was hoping to see Opus-class intelligence at Haiku-level pricing out of Google, and this is close to that!
reply
Never mind, after looking at more benchmarks, it seems closer to Sonnet-level intelligence at slightly lower cost. Speed is great for latency-sensitive applications, but if this were half the cost it would have been priced to win.

If this is the big model release out of Google, it's a disappointment.

reply
You are seeing batch inference pricing; standard inference is $1.50/$9. I was excited until I saw that price.
reply
Standard pricing is showing for me as $1.50 / $9.

(I suspect you're viewing the "flex" pricing).

reply
Please delete/edit your AI-written and factually wrong post.
reply
In addition to people pointing out your LLM got the pricing wrong,

> The pricing and "including thinking tokens" framing position it as a reasoning-capable flash model rather than just a pure speed optimization

Every Gemini model starting with 2.5 has been a reasoning model.

reply