I have the opposite experience: Gemini (even the Flash models) is the only useful model for my reverse-engineering use case. My hunch is that Google uses its free access to the entire Google search index to train on niche non-English-speaking community websites, more frequently and in a more "relevant" way, which in the end gives its models the most up-to-date info for this particular kind of work. Every other model is either 10 years out of date with its answers or simply hallucinates like crazy.
3.1-pro is still very capable, and the API is competitively priced vs. e.g. Anthropic; they just can't seem to figure out RLHF and the harness. By default it needs a lot of guiding, tends to be lazy, and sticks poorly to instructions.

It just feels like many Google products, really: they're capable of really amazing things, it's just that nobody there seems to care. I would guess they're optimizing more for internal use than for their vast user base.

They optimize for making their SREs' lives easier by over-quantizing models, regardless of how negative an effect that has on the user.
I just cancelled my Gemini subscription yesterday. I have a big private fork of OpenCode, and I set it up the wrong way to start with, so I couldn't pull from upstream.

So I put together a plan for refactoring it, step by step, with tests, etc. After literally 8 solid days of fighting with Gemini 3 Pro, I still couldn't pull it off.

I gave GPT 5.5 a chance with the same prompt, plans, and repo. I'm not sure how long it took, but when I checked in on it a few hours later it was done. All tests passed, everything was exactly how I'd asked, and better (it made some improvements).

I never felt Gemini was better than OpenAI or Anthropic. I think it's more on par with open-source models than with the top two.