undefined

upvote

points

by paxys5 hours ago |

upvote

by TheAceOfHearts3 hours ago|

[-]

This comment is a bit confusing and surprising to me because I tried Antigravity three weeks ago and it was very undercooked. Claude was actually able to identify bugs and get the bigger picture of the project, while Gemini 3 with Antigravity often kept focusing on unimportant details.

My default everyday model is still Gemimi 3 in AI Studio, even for programming related problems. But for agentic work Antigravity felt very early-stages beta-ware when I tried it.

I will say that at least Gemimi 3 is usually able to converge on a correct solution after a few iterations. I tried Grok for a medium complexity task and it quickly got stuck trying to change minor details without being able to get itself out.

Do you have any advice on how to use Antigravity more effectively? I'm open to trying it again.

reply

upvote

by paxys2 hours ago|

[-]

Ask it to verify stuff in the browser. It can open a special Chrome instance, browse URLs, click and scroll around, inspect the DOM, and generally do whatever it takes to verify that the problem is actually solved, or it will go back and iterate more. That feedback loop IMO makes it very powerful for client-side or client-server development.

reply

upvote

by Analemma_2 hours ago|

[-]

I've mentioned this before, but I think Gemini is the smartest raw model for answering programming questions in chatbot mode, but these CC/Codex/gemini-cli tools need more than just the model, the harness has to be architected intelligently and I think that's where Google is behind for the moment.

reply

upvote

by jug30 minutes ago|

[-]

I've heard Opus 4.5 might have an edge especially in long running agentic coding scenarios (?) but personally yes Gemini 3 series is what I was expecting GPT-5 to be.

I'm also mostly on Gemini 3 Flash. Not because I've compared them all and I found it the best bar none, but because it fulfills my needs and then some, and Google has a surprisingly little noted family plan for it. Unlike OpenAI, unlike Anthropic. IIRC it's something like 5 shared Gemini Pro subs for the price of 1. Even being just a couple sharing it, it's a fantastic deal. My wife uses it during studies, I professionally with coding and I've never run into limits.

reply

upvote

by jckahn5 hours ago|

[-]

Yeah I don't understand why everyone seems to have forgotten about the Gemini options. Antigravity, Jules, and Gemini CLI are as good as the alternatives but are way more cost effective. I want for nothing with my $20/mo Google AI plan.

reply

upvote

by paxys5 hours ago|

[-]

Yeah I'm on the $20/mo Google plan and have been rate limited maybe twice in 2 months. Tried the equivalent Claude plan for a similar workload and lasted maybe 40 minutes before it asked me to upgrade to Max to continue.

reply

upvote

by codazoda2 hours ago|

[-]

It's crazy that we're having such different experiences. I purchased the Google AI plan as an alternative to my ChatGPT (Codex) daily driver. I use Gemini a fair amount at work, so I thought it would be a good choice to use personally. I used it a few times but ran into limits the first few projects I worked on. As a result I switched to Claude and so, far, I haven't hit any limits.

reply

upvote

by 4 hours ago|

[-]

deleted

reply

upvote

by riku_iki1 hours ago|

[-]

Google has uncertain privacy settings, there is no declaration they won't train their LLM on your personal/commercial code.

reply

upvote

by codazoda3 hours ago|

[-]

I've used Gemini CLI a fair amount as well—it's included with our subscription at work. I like it okay, but it tends to produce "lies" a bit too often. It tends to produce language that reads as over confident that it's found a problem or solution. This causes me extra work to verify or causes me extra time because I believed it. In my experience Claude Code does this quite a bit less.

reply

upvote

by pRusya5 hours ago|

[-]

It's the opposite experience for me. Gemini mostly produces made up and outdated stuff.

reply

upvote

by whalee5 hours ago|

[-]

I think counter to the assumption of myself (and many), for long form agent coding tasks, models are not as easily hot swappable as I thought.

I have developed decent intuition on what kinds of problems Codex, Claude, Cursor(& sub-variants), Composer etc. will or will not be able to do well across different axes of speed, correctness, architectural taste, ...

If I had to reflect on why I still don't use Gemini, it's because they were late to the party and I would now have to be intentional about spending time learning yet another set of intuitions about those models.

reply

upvote

by codazoda2 hours ago|

[-]

I feel like "prompting language" doesn't translate over perfectly either. It's like we become experts at operating a particular AI agent.

I've been experimenting with small local models and the types of prompts you use with these are very different than the ones you use with Claude Code. It seems less different between Claude, Codex, and Gemini but there are differences.

It's hard to articulate those differences but I think that I kind of get in a groove after using models for a while.

reply

upvote

by OsrsNeedsf2P4 hours ago|

[-]

For all the hype I see about Gemini, we integrated it with our product (an AI agent) and it consistently performs worse[0] than Claude Sonnet, Opus, and ChatGPT 5.2

[0] based on user Thumbs up/Thumbs down voting

reply

upvote

by qaq5 hours ago|

[-]

Maybe it's the types of projects I work on but Gemini is basically unusable to me. Settled on Claude Code for actual work and Codex for checking Claude's work. If I try to mix in Gemini it will hallucinate issues that do not exist in code at very high rate. Claude and Codex are way more accurate at finding issues that actually exist.

reply

upvote

by aantix1 hours ago|

[-]

Not my experience at all.

It fails to be pro-active. "Why didn't you run the tests you created?"

I want it to tell me if the implementation is working.

Feels lazy. And it hallucinates solutions frequently.

It pales in comparison to CC/Opus.

reply

upvote

by zhengyi131 hours ago|

[-]

I feel like this is exactly the use case for things like Hooks and Skills. Which, if you don't want to write them yourself, I get it. But I do think we can get the tool to do it; sounds like you want it doing that a little more actively/out-of-the-box?

reply

upvote

by notatoad3 hours ago|

[-]

I can think of one major reason why Microsoft and Apple would prefer to feed their codebases into Claude than to Gemini.

reply

upvote

by psyclobe4 hours ago|

[-]

I tried to use it, kept saying it was at max capacity and nothing would happen. I gave it a good day before giving up.

reply

upvote

by CuriouslyC4 hours ago|

[-]

Oddly enough, as impressive as Gemini 3 is, I find myself using it infrequently. The thing Gemini 2.5 had over the other models was dominance in long context, but GPT5.2-codex-max and Opus 4.5 Thinking are decent at long context now, and collectively they're better at all the use cases I care about.

reply

upvote

by 5 hours ago|

[-]

deleted

reply

upvote

by ralusek5 hours ago|

[-]

I think Gemini is an excellent model, it's just not a particularly great agent. One of the reasons is that its code output is often structured in a way that looks like it's answering a question, rather than generating production code. It leaves comments everywhere, which are often numbered (which not only is annoying, but also only makes sense if the numbering starts within the frame of reference of the "question" it's "answering").

It's also just not as good at being self-directed and doing all of the rest of the agent-like behaviors we expect, i.e. breaking down into todolists, determining the appropriate scope of work to accomplish, proper tool calling, etc.

reply

upvote

by freedomben5 hours ago|

[-]

Yeah, you may have nailed it. Gemini is a good model, but in the Gemini CLI with a prompt like, "I'd like to add <feature x> support. What are my options? Don't write any code yet" it will proceed to skip right past telling me my options and will go ahead an implement whatever it feels like. Afterward it will print out a list of possible approaches and then tell you why it did the one it did.

Codex is the best at following instructions IME. Claude is pretty good too but is a little more "creative" than codex at trying to re-interpret my prompt to get at what I "probably" meant rather than what I actually said.

reply

upvote

by phainopepla21 hours ago|

[-]

Try the conductor extension for gemini-cli: https://github.com/gemini-cli-extensions/conductor

It won't make any changes until a detailed plan is generated and approved.

reply

upvote

by PantaloonFlames1 hours ago|

[-]

I've had the exact opposite experience. After including in my prompt "don't write any code yet" (or similar brief phrase), Gemini responds without writing code.

Using Gemini 2.5 or 3, flash.

reply

upvote

by michaelcampbell4 hours ago|

[-]

Can you (or anyone) explain how this might be? The "agent" is just a passthrough for the model, no? How is one CLI/TUI tool better than any other, given the same model that it's passing your user input to?

I am familiar with copilot cli (using models from different providers), OpenCode doing the same, and Claude with just the \A models, but if I ask all 3 the same thing using the same \A model, I SHOULD be getting roughly the same output, modulo LLM nondeterminism, right?

reply

upvote

by taylorius59 minutes ago|

[-]

maybe different preparatory "system" prompts?

reply

upvote

by sutterd4 hours ago|

[-]

My go-to models have been Claude and Gemini for a long time. I have been using Gemini for discussions and Claude for coding and now as an agent. Claude has been the best at doing what I want to do and not doing what I don’t want to do. And then my confidence in it took a quantum leap with Opus 4.5. Gemini seems like it has gotten even worse at doing what I want with new releases.

reply

upvote

by bastawhiz5 hours ago|

[-]

I've never, ever had a good experience with Gemini (3 Pro). It's been embarrassingly bad every time I've tried it, and I've tried it lots of times. It overcomplicates almost everything, hallucinates with impressive frequency, and needs to be repeatedly nudged to get the task fully completed. I have no reason to continue attempting to use it.

reply

upvote

by JoshMandel3 hours ago|

[-]

Same. Sometimes even repeated nudges don't help. The underlying 3.0 Pro model is great to talk and ideate with, but its inability to deliver within the Gemini CLI harness is ... almost comical.

reply

upvote

by mfro5 hours ago|

[-]

For me it just depends on the project. Sometimes one or the other performs better. If I am digging into something tough and I think it's hallucinating or misunderstanding, I will typically try another model.

reply

upvote

by satvikpendem5 hours ago|

[-]

Eh, it's not near Opus at all, closer to Sonnet. It is nice though with Antigravity because it's free versus being paid in other IDEs like Cursor.

reply

upvote

by causal4 hours ago|

[-]

Yeah use Flash 3 for easy + fast stuff, but it can't hold the plot like Opus or Codex 5

reply

upvote

by TZubiri3 hours ago|

[-]

I don't think anyone is sleeping on it.

It's on the top of most leaderboards on lmarena.ai

reply

upvote

by jonathanstrange3 hours ago|

[-]

I'm also using Gemini and it's the only option that consistently works for me so far. I'm using it in chat mode with copy&paste and it's pleasant to work with.

Both Claude and ChatGPT were unbearable, not primarily because of lack of technical abilities but because of their conversational tone. Obviously, it's pointless to take things personally with LLMs but they were so passive-aggressive and sometimes maliciously compliant that they started to get to me even though I was conscious of it and know very well how LLMs work. If they had been new hires, I had fired both of them within 2 weeks. In contrast, Gemini Pro just "talks" normally, task-oriented and brief. It also doesn't reply with files that contain changes in completely unrelated places (including changing comments somewhere), which is the worst such a tool could possibly do.

Edit: Reading some other comments here I have to add that the 1., 2. ,3. numbering of comments can be annoying. It's helpful for answers but should be an option/parameterization.

reply

upvote

by bonesss1 hours ago|

[-]

I think you’re highlighting an aspect of agentic coding that’s undervalued: what to do once trust is breached… ?

With humans you can categorically say ‘this guy lies in his comments and copy pastes bullshit everywhere’ and treat them consistently from there out. An LLM is guessing at everything all the time. Sometimes it’s copying flawless next-level code from Hacker News readers, sometimes it’s sabotaging your build by making unit tests forever green. Eternal vigilance is the opposite of how I think of development.

reply

upvote

by tiangewu4 hours ago|

[-]

[dead]

reply

upvote

by catlover765 hours ago|

[-]

It's ok, but it too frequently edits WAY more than it needs to in order to accomplish the task at hand.

GPT-5.2 sometimes does this too. Opus-4.5 is the best at understanding what you actually want, though it is ofc not perfect.

reply

upvote

by dingnuts5 hours ago|

[-]

[dead]

reply