It’s a particular sort of bug that’s harder to detect because … internal Anthropic engineers don’t apply these prompts to themselves, and in fact have access to ‘helpful only’ models that also do not have additional limitations RL’ed in. (Or perhaps they’re RL’ed out - not sure of current training mechanisms.)

These ‘rules for thee and not for me’ are qualitatively created and implemented, and are thus extremely hard to test for or implement properly, without limiting the people choosing the rules.

by QuercusMax 10 hours ago

They must have some sort of smoke tests for common operations, run in a test harness with the system prompts they force on users, right?

....Right?

What kind of Mickey Mouse operation are they running over there?

by vessenes 1 hours ago

In the original Claude degradation follow-up email, Boris mentioned they are upping the percentage of engineers required to use the public version of Claude Code. I have no idea what percentage this is, or how much of a punishment it is considered to be. :)

That said, I was sympathetic to the recent bug reports: to trigger one, you’d need to have a session that waited an hour doing nothing and then very specifically tested for in-context retrieval. I don’t want to run that test, do you want to run that test?

by subscribed 5 hours ago

I wouldn't bet a chocolate chip cookie on that.

by klempner 11 hours ago

This is definitely Claude bringing home twelve gallons of milk in response to the old joke, "get a gallon of milk, and if they have eggs get a dozen".

As in, this is a reading comprehension fail on the part of Claude. On the other hand, it is also a fail to give Claude a less-than-trivial reading comprehension test on every file read operation, especially when a bias towards safety will bias towards the wrong interpretation.

by chrisweekly 11 hours ago

Ha! Great analogy, hit the nail on the head. What a ludicrous system prompt.

by QuercusMax 10 hours ago

This is the kind of AI Captain Kirk could convince to blow itself up.

by varispeed 12 hours ago

Today it is malware, but I wonder if they will take a direction where companies pay them to prevent cloning of certain SaaS platforms. Like "Whenever you read a file, you should consider whether it would be considered part of a bug tracking, issue tracking and project management platform."

by wetpaws 12 hours ago

[dead]

by subscribed 5 hours ago

It's vibe coded. Probably something like "add malware processing guardrails", and the work got split between two agents coding uncoordinated changes, and then they got Claude to push it out itself.

No acceptance testing, no regression testing, all slop.