undefined

points

by prodigycorp4 hours ago |

comments

by pixel_popping1 hours ago|

[-]

failing non-stop at tool calls on top of that.

by refulgentis4 hours ago|

prev|

[-]

I'll cosign what you said, simultaneously, yr interlocutor's point is also well-founded and it depresses me it's not better known and sounds so...off...due to conventional wisdom x God King Zuck's misunderstanding his own company and resulting overreaction.

They beat Gemini 2.5 Flash and Pro handily on my benchmark suite. (tl;dr: tool calling and agentic coding).

Llama 4 on Groq was ~GPT 4.1 on the benchmark at ~50% the cost.

They shouldn't have released it on a Saturday.

They should have spent a month with it in private prerelease, working with providers.[1]

The rushed launch and ensuing quality issues got rolled into the hypebeast narrative of "DeepSeek will take over the world"

I bet it was super fucking annoying to talk to due to LMArena maxxing.

[1] my understanding is longest heads up was single-digit days, if any. Most modellers have arrived at 2+ weeks now, there's a lot between spitting out logits and parsing and delivering a response.

by alex11382 hours ago|

parent|

[-]

Your comments seem to imply the engineers made a great product but Zuck intervened so now it's shit

by refulgentis46 minutes ago|

parent|

[-]

I don't know how Zuck intervening could change float32s in a trained model, so I don't think I think that, but maybe I'm parsing your words incorrectly.