You can really notice the tool use problems. They gotta get on that. The agent trend seems real, and powerful. They can't afford to fall behind on it.
I don't really have any tool usage issues that I wouldn't file under "doesn't follow system prompt instructions consistently".

There are times when it puts a prefix on all function calls, which is weird and I think is a hallucination, so maybe that one counts.

3.1 hopefully fixes that

"They can't afford to fall behind on it."

They are very, very seriously far behind as of 3.0.

We'll see if 3.1 addresses the issue at all.

These improvements are among the things specifically called out on the submitted page.
Yeah, it seems to me like Gemini is a little behind on the current RL patterns, and they also don't seem interested in creating a dedicated coding model. I think they have so much product surface (Search, AI Mode, Gmail, YouTube, Chrome, etc.) that they are prioritizing making the model very general. But who knows, I'm just talking out of my ass.
In other words: they just need to motivate their employees while giving in to finance's demands to fire a few thousand every month or so ...

And don't forget, it's not just direct motivation. You can make yourself indispensable by sabotaging or at least not contributing to your colleagues' efforts. Not helping anyone, by the way, is exactly what your managers want you to do. They will decide what happens, thank you very much, and doing anything outside of your org ... well there's a name for that, isn't there? Betrayal, or perhaps death penalty.
