I've been increasingly "freaking out" for about 3 to 4 years, and it seems that the pessimistic scenario is materializing. It looks like it will be over for software engineers in the not-so-distant future. In January 2025 I said that I expected software engineers to be replaced in 2 years (pessimistic) to 5 years (optimistic). Right now I'm guessing 1 to 3 years.
reply
I assure you it will soon become very clear that mass job losses are one of the least concerning side effects of developing the magic "everything that can plausibly be done within the constraints of physics is now possible" machine.

We're opening a can of worms whose horrors I don't think most people have the imagination to understand.

reply
Do you have any sources I could read to better understand your concern?
reply
yeesh yep, though it's more Pandora's Box than a can of worms, since it can't exactly be closed once it's opened
reply
Anthropic needs to show that its models continually get better. If the model showed minimal to no improvement, it would cause significant damage to their valuation. We have no way of validating any of this; there are no independent researchers that can back any of the assertions made by Anthropic.

I don’t doubt they have found interesting security holes, the question is how they actually found them.

This System Card is just a sales whitepaper and just confirms what that “leak” from a week or so ago implied.

reply
The numbers only go up to 100% though.
reply
It's going to be expensive to serve (also not generally available), considering they said it's the largest model they've ever trained.

I suspect it's going to be used to train/distill lighter models. The exciting part for me is the improvement in those lighter models.

reply
What's interesting is that scaling appears to continue to pay off. Gwern was right - as always.
reply
It seems inevitable that costs will come down over time. Expensive models today will be cheap models in a few years.
reply
I am freaking out. The world is going to get very messy extremely quickly after one or two further jumps in capability like this.
reply
Messy in a way that would affect you?
reply
"Internet no longer viable" would affect everyone, probably
reply
"some model I don't get to use is much better at benchmarks"

pick one or more: comically huge model, test time scaling at 10e12W, benchmark overfit

reply
So... you're not excited because it might take a few months before we can use it or something? I don't get your comment.
reply
Whether you're excited depends on what you do for a living and how close you are to financial independence.
reply
I agree there are other valid reasons not to be excited about this, I just can't make sense of the ones provided above.
reply
I think the general question is whether they'll release it at all; I haven't yet read anything stating that they would
reply
Well let me introduce people to a few brand new concepts:

https://en.wikipedia.org/wiki/Capitalism

https://en.wikipedia.org/wiki/Race_to_the_bottom

https://en.wikipedia.org/wiki/Arms_race

Of course they'll release it once they can de-risk it sufficiently and/or a competitor gets close enough on their tail, whichever comes first.

reply
I think there's no SOTA advance on this one worthy of "freaking out".

Looks like they just built a way larger model, with the same quirks as Claude 4. Seems like a super expensive "Claude 4.7" model.

I have no doubt that Google and OpenAI have already done that for internal (or even government) use.

reply
deleted
reply
Freak out about what? I read the announcement and thought "that's a dumb name, they sure are full of themselves" – then I went back to using Claude as a glorified commit message writer. For all its supposed leaps, AI hasn't affected my life much in the real world except to make HN stories more predictable.
reply
Well for one, it’s a PDF
reply
Wait until you see real usage. Benchmark numbers do not necessarily translate to real world performance (at least not by the same amount).
reply
the time to freak out was 2 years ago.
reply