No, it's more that Gemini models are simply not very good for coding compared to the top two. Even with Antigravity I use Claude models.
reply
Depends on the language. Gemini and Claude are far superior when it comes to C# for instance, compared to anything that OAI offers.
reply
Gemma 4 31b is better for coding than Gemini in my limited testing on a small, single-source-file C project of under 1000 lines. Setting temperature to 0 gives better results for me. It seems like Gemini ignores the system prompt more, and its default reasoning output seems more incoherent.
reply
Their open-weight, on-device models are really impressive, partly because I think they are the only one of the frontier labs even working on local models.
reply
> Gemma 4 31b is better for coding than Gemini

Is there a fine-tuned Gemma coding model? I'd assume that would perform quite well.

reply
> How did Google blow their AI lead?

What lead? Maybe it's because I'm mostly using AI/LLMs for development, but neither Google, Anthropic, xAI, nor anyone else has ever been in the lead. OpenAI has always had the best models in my mind, as long as you're comparing the "top" plans between all of them.

Besides, they all seem to shoot themselves in the foot, OpenAI included; the only thing that differs is how often and how big the damage is.

reply
Wow. Didn't realize OAI was astroturfing Hacker News now...
reply
All the labs astroturf all the social media; HN is not unique and OpenAI wouldn't be the only one. I even sometimes receive offers at the email address listed in my HN profile, asking me to post about their project in exchange for money.

Be skeptical of anything you read online, not just what you think is "obvious astroturf".

reply
Wait what? Why don't I get emails like this too? /s

(On a serious note, do you feel comfortable naming and shaming such companies? This is sort of a serious accusation imo. If not, then how much money are they trying to give? It would be an interesting discussion, and feel free to mail me if it's confidential. Waiting for your response, and have a nice day :-D)

reply
Nah, maybe one day I'll do a collective public post about it. For now I just try to get their company and/or name first, then forward it to HN themselves so they can ban them and keep an eye out for them.
reply
Could you tell us how many companies are trying to do this, and whether any of them are YC companies themselves? I imagine not, but still.

And what is the metric for companies sending you messages? I have never gotten a single message (aside from one or two companies here and there, and I even made an HN post about one of them).

What do these companies actually go by when deciding whom to spam? Karma points? I mean, emsh, I remember we both had about the same karma not too long ago, and I'm surprised to see you at 13k+ karma, so good to see that. But is the metric karma, hype (you had made the rust browser ..), or what exactly? I would be curious to hear your thoughts on that!

I do understand why these companies send mail, though. I can't say I wouldn't do the same if I had a company at the moment, but I think you might get frustrated with it too. So what would your recommendation be to people sending you mail in general?

I would be curious to know that too!

reply
I probably wouldn’t say they always had the best model but for years OAI was definitely pushing the limits both on model quality and product offerings. It was not until the last year or so that Anthropic started punching above their weight.
reply
> It was not until the last year or so that Anthropic started punching above their weight.

Anthropic's stuff has been useful for the last two years I'd say, especially in the beginning of Claude Code. But as soon as the Codex TUI was available, I was daily-driving both of them, literally executing the same prompts for each of them and comparing the final results, and Codex simply writes better code in 9/10 cases (but still not always).

reply
Claude Code has only been around for a year and change. At least in our internal tests, Anthropic models started to become semi-useful 2 years ago, but they still were not great and struggled with structured output. Prior to that, their alignment strategy made the products highly unhelpful in an API context. The past 6 months to a year is where Anthropic has really shined: they have model parity, sometimes take the lead, and more importantly their product offering on the consumer side has crushed it.
reply
> Claude Code has only been around for a year and change.

We've been experimenting with "agent harnesses" since way before that, though. I'm sure the first time I tried building that sort of thing was sometime in 2023 with GPT-3, and I'm like 80% confident I tried the same sort of TUI experience as CC, built by some random user, before Claude Code even became public.

reply
I feel like aider was the first TUI for agentic stuff I came across here, and that was well before Claude Code.
reply
There are plenty of shills for all of the major labs on this website. Usually checking a history of comments of a suspicious user reveals that quite fast.
reply
OpenAI literally wouldn't even exist if it wasn't for Google's work in the space.
reply
Who wouldn't exist if someone else didn't invent something else, which wouldn't exist...

We're all standing on the shoulders of giants here, I don't think one party is more responsible than someone else, unless you're specifically involved with the specific technology, then you can attribute it to them.

So yes, Google's researchers might have invented the Transformer, but OpenAI researchers invented GPT. Does it matter we credit "LLMs" more to one than the other? I don't think so, especially in this context it's highly irrelevant. Google didn't have the "LLM lead" before LLMs even existed...

reply
Google invented transformers. They had LLMs before OpenAI existed.
reply
Great, tell me again who put the Transformer into LLMs?

Also, if we're going backwards, who invented neural networks, does that mean that person also then "had LLMs before OpenAI existed"?

reply
The tone on this could be improved. They literally answered your question "What lead?" and you seem dismissive.
reply
Yeah, you're right, maybe needlessly harsh, sorry for that. I guess I'm tired of the same argument that Google somehow had a lead in LLM development because the Transformer comes from researchers who worked at Google, while what came before or after the Transformer somehow doesn't count, whether it came from Google's researchers (BERT), from others (GPT), or from even earlier work. Hence the whole "we stand on the shoulders of giants".
reply
We can go round and round about all this, but I think it's pretty clear that Google did at one point have a large AI lead in the lead-up to covid. They had models that far surpassed the competition from 2018-2022. But they were facing an innovator's dilemma: they didn't want to cannibalize their search revenue, so they sat on LLMs, which ended up creating OpenAI and Anthropic.
reply
> Great, tell me again who put the Transformer into LLMs?

Google did, as they already said.

OpenAI was better at marketing and, as a newcomer, a lot more willing to cannibalize the search market. So Google blew their lead in research by not recognizing the product value quickly enough, or by failing to win an internal political war over it anyway.

reply
Because their strategy wasn’t to become leaders but to be as good as it takes to erode the lead of others. They have the cash cow of search, so they don’t rely on AI to succeed. All they need is to keep publishing new products/services to keep OpenAI from taking the initiative. Between that and the Chinese models, all they have to do is wait for the bubble to burst, at which point every major AI lab would go bust.
reply
They had the lead for maybe a week or two. Now, only Apple is further behind.
reply
Apple may be behind, and even getting sued for false advertising around AI features, but at least they haven’t spent hundreds of billions of dollars with no indication of how they’ll make their money back.
reply