undefined

points

by roxolotl12 hours ago |

comments

by password432110 hours ago|

[-]

> way worse than the average Cloudflare blog

How long has it been since you took your average? Lately all Cloudflare output has been heavily AI'd.

by JeremyNT9 hours ago|

prev|

[-]

> They claim it’s a different kind of tool and then describe using it the same way you’d use any other model. This really felt way worse than the average Cloudflare blog and really just rehashed the Mythos announcement which had already called out the key parts being chaining and crafting examples.

Hah, I was trying to parse this too.

Charitably perhaps they're being vague on exactly what's different because they're still under NDA.

by __natty__12 hours ago|

prev|

[-]

Sounds different because it’s hidden advertisement not a regular blog post

by grim_io12 hours ago|

parent|

[-]

But why would cloudflare advertise Anthropic? They are competing with Anthropic by hosting open weights models.

by Someone123412 hours ago|

parent|

[-]

https://www.cloudflare.com/press/press-releases/2025/cloudfl...

by neuronexmachina2 hours ago|

parent|

[-]

Can you elaborate? That just seems to be a Cloudflare's announcement from May 2025 that they'd be supporting MCP servers.

by hdndjsbbs12 hours ago|

parent|

prev|

[-]

It's circular financing. It's a circlejerk. It's a circular financing jerk.

by drak0n1c6 hours ago|

parent|

prev|

[-]

They got privileged early access to an unreleased frontier model to harden their systems, with Anthropic engineer support, and likely were able to use it to make other product optimizations tangential to security too. A blog article afterwards is a cheap price for unlocking that access, regardless of how well it paid off.

by getnormality5 hours ago|

prev|

[-]

I think they're saying it has qualitatively different capabilities that make certain kinds of security work more worth pursuing with the model, not that the model of human-AI interaction has changed.

You're right that they're using a harness like everyone else. The general idea of giving the model a harness is not going to change. I mean even humans need harnesses to accomplish some things.

by mycall2 hours ago|

parent|

[-]

Google Maps is my favorite human harness.

by samstokes8 hours ago|

prev|

[-]

The post says they wrote a custom harness that orchestrates work between multiple separate model invocations. That is different from running Claude Code (which is a specific existing harness around the Claude models).

The post takes a while to get around to saying that, and could have included more detail besides the workflow diagram and table (which they flag as only "an example of" such a harness), but it does answer the question. It's a different kind of tool because it's a model rather than a harness+model pair.

by meander_water8 hours ago|

prev|

[-]

> the model has its own emergent guardrails that sometimes cause it to push back on legitimate security research requests. But as we found, these organic refusals aren’t consistent - the same task, framed differently or presented in a different context, could produce completely different outcomes as illustrated in the examples below.

This was new. I'm surprised that a model specifically designed for security research and gated to professionals is refusing legitimate requests

by _alternator_7 hours ago|

parent|

[-]

There's pretty strong evidence that (mis)alignment in one area creates (mis)alignment in others. The "aligned behavior" vectors are not orthogonal from cybersecurity to bioweapons to prejudice, so having alignment in some will likely bleed into others.

by smusamashah10 hours ago|

prev|

[-]

'Its not X, its Y' is also a common LLM trope.

by eikenberry11 hours ago|

prev|

[-]

My guess is because it is a model trained specifically for security/hacking. So comparing it to Opus, trained for chat/code/etc., is apples-to-oranges.

by rs_rs_rs_rs_rs10 hours ago|

parent|

[-]

It is not, that's what surprised Anthropic employees too.

by FergusArgyll12 hours ago|

prev|

[-]

I think what they might mean is:

Because of it's capabilities, a new kind of harness can be built for it, thus the entire system (model + harness) is a different kind of tool than say Claude code

by Xirdus12 hours ago|

parent|

[-]

But did they build this different harness? And are they sure other models can't cope with it?

by roxolotl12 hours ago|

parent|

[-]

Right I expected the piece to transition into “and here’s how we built a whole new thing for it” but it never did.

by jascha_eng10 hours ago|

parent|

[-]

They kind of have a little diagram explaining the steps I imagine every single step in that to basically be it's own Claude code session.