Claude Managed Agents

(claude.com)

104 points

by adocomplete3 hours ago |

by mccoyb35 minutes ago|

I'm skeptical that this is going to lead to optimal orchestration ... or rather, I suspect that open source will produce a far better alternative in time.

The best performance I've gotten is by mixing agents from different companies. Unless there is a "winner take all" agent (I seriously doubt it, based on the dynamics and cost of collecting high quality RL data), I think the best orchestration systems are going to involve mixing agents.

Here, it's not about the planner, it's about the workers. Some agents are just better at certain things than others.

For instance, Opus 4.6 on max does not hold a candle to GPT 5.4 xhigh in terms of bug finding. It's just not even a comparison, iykyk.

Almost analogous to how diversity of thought can improve the robustness of the outcomes in real world teams. The same thing seems to be true in mixture-of-agent-distributions space.

by mccoyb24 minutes ago|

Another way to think about it:

For Anthropic to have the best version of this software, they'd have to simultaneously ... well, have the best version of the software, but also beat every other AI company at all subtasks (like: technical writing, diagramming, bug finding -- they'd need to have the unequivocal "best model" in all categories).

Surely their version is not going to allow you to e.g. invoke Codex or what have you as part of their stack.

by intothemild24 minutes ago|

Yeah this has been my experience too, mixing agents/models from different companies..

Having Opus write a spec, then send to Gemini to revise, back to Opus to fix, then to me to read and approve..

Send to a local model like Qwen3.5 to build, then off to Opus to review ...

This was such an amazing flow, until Anthropic decided to change their minds.
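
A flow like the one above can be sketched as a chain of stages with a human approval gate. This is only a minimal illustration, not anyone's actual setup: `Stage`, `run_pipeline`, and `stub` are hypothetical names, and each "model" is a plain callable standing in for whatever provider client you would wrap.

```python
from dataclasses import dataclass
from typing import Callable

# Assumption: a "model" is any callable mapping a prompt to text.
# Real provider clients would be wrapped to fit this hypothetical signature.
Model = Callable[[str], str]


@dataclass
class Stage:
    name: str          # label for the step, e.g. "spec" or "review"
    model: Model       # which model handles this step
    instruction: str   # what the model is asked to do with the artifact


def run_pipeline(
    task: str,
    stages: list[Stage],
    approve: Callable[[str], bool],
    approve_after: str,
) -> str:
    """Pass an artifact through each stage in order.

    After the stage named `approve_after`, a human (the `approve` callback)
    must accept the artifact before the remaining stages run.
    """
    artifact = task
    for stage in stages:
        artifact = stage.model(f"{stage.instruction}\n\n{artifact}")
        if stage.name == approve_after and not approve(artifact):
            raise RuntimeError(f"rejected after stage {stage.name!r}")
    return artifact


def stub(label: str) -> Model:
    """Stand-in model that tags the last line of its prompt with a label."""
    return lambda prompt: f"[{label}] {prompt.splitlines()[-1]}"


if __name__ == "__main__":
    stages = [
        Stage("spec", stub("opus"), "Write a spec"),
        Stage("revise", stub("gemini"), "Revise the spec"),
        Stage("fix", stub("opus"), "Fix the spec"),
        Stage("build", stub("qwen"), "Build from the spec"),
        Stage("review", stub("opus"), "Review the build"),
    ]
    print(run_pipeline("add login", stages, approve=lambda s: True, approve_after="fix"))
```

Because each stage only depends on the `Model` callable, swapping one vendor for another is a one-line change, which is the point of mixing providers.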

by cedws1 hours ago|

I saw this coming. Anthropic wants to shift developers on to their platform where they’re in control. The fight for harness control has been terribly inconvenient for them.

To score a big IPO they need to be a platform, not just a token pipeline. Everything they’re doing signals they’re moving in this direction.

by 0o_MrPatrick_o01 hours ago|

I’ve been building my own version of this. It’s a bit shocking to see parallel ideation.

FWIW, IMO, being locked into a single model provider is a deal breaker.

This solution will distract a lot of folks and doom-lock them into Anthropic. That’ll probably be fine for small offices, but it is suicidal to get hooked into Anthropic’s way of doing things for anything complex. IME, you want to be able to compare different models, and you end up managing them to your style. It’s a bit like cooking, where you may have greater affinity for certain flavors. You make selection tradeoffs on when to use a frontier model for design & planning vs. something self-hosted for simpler operations tasks.

by TIPSIO1 hours ago|

FWIW everyone is also building a version of this themselves. Only so many directions to go

by rtuin1 hours ago|

Most definitely. Although I haven’t found an (F)OSS project that lets one easily ship [favorite harness SDK] to self-hosted platform yet.

Which projects are standing out in this space right now?

by jawiggins38 minutes ago|

Shameless self-promo, but I've been working on Optio, specifically for coding. It works by taking any harness you want and tasking it to open GitHub/GitLab PRs based on Notion/Jira/Linear tickets; see: https://news.ycombinator.com/item?id=47520220

It works on top of k8s, so you can deploy and run in your own compute cluster. Right now it's focused only on coding tasks but I'm currently working on abstractions so you can similarly orchestrate large runs of any agentic workflow.

by deet1 hours ago|

Do you think it's unwise for companies to lock in because they would be better served and get better results by picking and choosing models? Or because by running your business on a single closed provider like Anthropic, you're giving them telemetry they can use to optimize their models and systems to then compete with you later?

by 0o_MrPatrick_o053 minutes ago|

I think it’s unwise because model reliability is transient.

When the models have an off day, the workflows you’ve grown to depend upon fail. When you’re completely dependent on Anthropic for not only execution but also troubleshooting, you’re doomed. You lose a whole day troubleshooting model performance variability when you should have just logged off and waited. These are very cognitively disruptive days.

Build in multi-model support, so your agents can modify routing if an observer discovers variability.
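
One way to picture that advice is a router that consults an observer before picking a model. The sketch below is a guess at the shape of such a thing, not a description of any real system: `Observer`, `Router`, the sliding window, and the success-rate threshold are all hypothetical, and models are again plain callables rather than real provider clients.

```python
from collections import deque
from typing import Callable

# Assumption: a "model" is any callable mapping a prompt to text.
Model = Callable[[str], str]


class Observer:
    """Tracks recent pass/fail results per model over a sliding window."""

    def __init__(self, window: int = 5, min_success_rate: float = 0.6):
        self.window = window
        self.min_success_rate = min_success_rate
        self.history: dict[str, deque] = {}

    def record(self, name: str, ok: bool) -> None:
        self.history.setdefault(name, deque(maxlen=self.window)).append(ok)

    def healthy(self, name: str) -> bool:
        runs = self.history.get(name)
        if not runs:
            return True  # no data yet, so assume the model is fine
        return sum(runs) / len(runs) >= self.min_success_rate


class Router:
    """Tries models in preference order, demoting ones the observer flags."""

    def __init__(self, models: dict[str, Model], observer: Observer):
        self.models = models  # insertion order is the preference order
        self.observer = observer

    def call(self, prompt: str, check: Callable[[str], bool]) -> tuple[str, str]:
        # Stable sort: healthy models first, unhealthy ones kept as a last resort.
        names = sorted(self.models, key=lambda n: not self.observer.healthy(n))
        for name in names:
            try:
                out = self.models[name](prompt)
                ok = check(out)
            except Exception:
                out, ok = "", False
            self.observer.record(name, ok)
            if ok:
                return name, out
        raise RuntimeError("all models failed the check")
```

The `check` callback is where your own notion of "the model is having an off day" goes (a test suite, a schema validation, a judge model); the router itself stays provider-agnostic.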

by dakolli47 minutes ago|

It's unwise because they are going to have a 5-10k a month bill on enterprise pricing, whereas for $6-10k a month you can rent and run your own hardware, get a solid 3-4 concurrent sessions for your engineers with a 1T param OS model, and save thousands per developer a month.

by dakolli56 minutes ago|

I'm the same, and it's relatively trivial to build these types of systems on top of aggregators like OpenRouter.

by jameslk1 hours ago|

We're in the early days of agentic frameworks, like the pre-PHP web: CGI scripts and webmasters. Eventually the state of the art will slow down and something elegant like Rails will come out.

Until then, every agent framework is completely reinvented every week due to new patterns and new models: evals, ReAct, DSPy, RLM, memory patterns, claws, dynamic context, sandbox strategies. It seems like locking in to a framework is a losing proposition for anyone trying to stay competitive. See also: LangChain trying to be the Next.js/Vercel of agents but everyone recommending building your own.

That said, Anthropic pulls a lot of weight by owning the models themselves, and an easier-to-use solution will probably get some adoption from those who are better served by going from nothing to something agentic, despite the lock-in and the constant churn of model tech.

by dmix36 minutes ago|

Completely agree re: AI chatbot/RAG being just like the pre-PHP web world. There are a hundred half-baked solutions floating around on blogs and GitHub, but no coherent dominant framework that puts it all together properly. LangChain is close but still feels a bit abstract and DIY.

That plus everyone is using 5 different vector DBs and reranking models from different vendors than the answer models etc.

by rick129052 minutes ago|

Not quite sold on this. I'm going to stick with Pydantic AI and DBOS/Temporal/Celery. I do not want to be vendor-locked into one of these players. I want to work with absolutely any LLM I want... I think we need to keep pushing for best-in-class open source orchestration and not get sucked into these platforms.

by tailsdog1 hours ago|

Looks great, I can't wait to use it. I imagine it could become very expensive for certain workflows, it will probably be like AWS where if you're not careful with the setup and watching what you're doing it will spin up 1000s of agents and rack up huge bills! It's going to be a massive money spinner!

by yalogin35 minutes ago|

This is actually really nice from Anthropic. They are aggressively owning the entire development stack for every SWE. They become the default development platform. Automatic recurring revenue too, and I am sure they will come up with more categories of subscriptions.

by mdrachuk37 minutes ago|

It’s all good until your production agent deployment has single-nine uptime. I use Claude Code as my main coding harness daily, but making customers reliant on Anthropic software is a big no-no. Quality engineering is just not their thing.

by lambdanodecore2 hours ago|

The next $100B business model in 2026 is AaaS (Agent as a Service).

by dennisy48 minutes ago|

Let’s just shorten it to AaS?

by woah17 minutes ago|

agentic software services

by codinhood1 hours ago|

I wonder how long until Claude/OpenAI eat a lot of the current AI/Agent SaaS's lunch.

Originally I thought they would mainly stick to being a model provider, but with all the recent releases it seems they do want to provide more "services."

Wonder what part of the market 3rd party apps will build a moat around?

by spiderfarmer1 hours ago|

Today I built a clone that covers the 20% of a product my client actually needed. It took 8 hours and will save my client 2k a month in licensing fees. Plus, I can now add the features they were missing in the original product.

There's a lot of money to be made in small business automation right now.

by ergocoder1 hours ago|

Probably never. There are a couple reasons:

1. We pay for SaaS so we don't have to manage it. If you vibe-code or use these AI things, then you are managing it yourself.

2. Most SaaS is like $20-$100/month/person. For a software engineer, that's maybe <1h of pay.

3. Most SaaS requires some sort of human in the loop to check for quality (at least sampling). No users would want to do that.

Number 2 is the biggest reason. It's $20 a month.... I'm not gonna replace that with anything.

Writing this message already costs more than $20 of my time.

I predict that the market will get bigger, because people are more prone to automate the long-tail/last-mile stuff now that they are able to.

by codinhood1 hours ago|

Interesting, so you're saying Anthropic/OpenAI/etc. will get a general solution that won't be hands-off. The moat for other companies will be creating the specific, managed solution.

I can see that, assuming models don't make some giant leap forward.

by spiderfarmer43 minutes ago|

Your vision on the market for this is skewed by the fact that you're probably overpaid.

by ziml772 hours ago|

Those agents did such a wonderful job making and deploying this page that the testimonials are unreadable because each spot has two of them overlapping.

by Revisional_Sin52 minutes ago|

I just have a black page.

by esaym1 hours ago|

The website is solid black on Firefox mobile for Android. Maybe they should get an agent on that.

by jorl171 hours ago|

Anthropic's website is always completely broken for me on Zen (a Firefox derivative). I used to think it was an extension, but even without extensions it often just shows blank pages.

by siva71 hours ago|

> With Managed Agents, you define outcomes and success criteria, and Claude self-evaluates and iterates until it gets there (available in research preview, request access here). It also supports traditional prompt-and-response workflows when you want tighter control.

Call me stupid, but this doesn't sound like they want software developers to be around in a year or two.

by baal80spam2 minutes ago|

But that's exactly what Dario Amodei (Anthropic CEO) wants.

by Sol-1 hours ago|

In addition to the managed interface for agent configuration and so on, is the novelty that all the agents run on Anthropic's infra? Sort of like Claude Code on the Web? If so, interesting that they move up the stack, from just a provider of an intelligence API to more complex deployed products.

by JLO641 hours ago|

As someone who spins up Docker containers where I use the Anthropic Agentic SDK to build Jekyll websites for customers, I don’t see much of an appeal. I didn’t find it that difficult to set up the infrastructure; the hard part was getting the agents to do exactly what I wanted. Besides, eventually I might want to transition away to another provider (or even self-hosting), so I’d prefer having that freedom.

by dangoodmanUT1 hours ago|

This was inevitable; I called this a few weeks ago [1]. It’s an easy way to increase revenue without making the models smarter, and to lock you in harder.

[1] https://danthegoodman.substack.com/p/where-agents-converge

by aoliveira1 hours ago|

They keep calling this the first solution of its kind... Obviously Anthropic is a much larger company, but https://smith.langchain.com/ has this, and has had it for a while. Or am I missing something?

by patrickkidger1 hours ago|

I'm not sure if I'm about to be the old man yelling at clouds, but Anthropic seem to be 'AWS-ifying': an increasing suite of products which (at least to me) seem to be undifferentiated amongst themselves, and all drawn from the same roulette wheel of words.

We've got Claude Managed Agents, Claude Agent SDK, Claude API, Claude Code, Claude Platform, Claude Cowork, Claude Enterprise, and plain old 'Claude'. And honourable mention to Claude Haiku/Sonnet/Opus 4.{whatever} as yet another thing with the same prefix. I feel like it's about once a week I see a new announcement here on HN about some new agentic Claude whatever-it-is.

I have pretty much retreated in the face of this to 'just the API + `pi` + Claude Opus 4.{most recent minor release}', as a surface area I can understand.

by bnchrch2 hours ago|

Happy to see this launched, particularly today.

I own a stake in a small brewery in Canada, and this feature just saved me setting up some infrastructure to "productionize" an agent we created to assist with ordering, invoicing, and government document creation.

I get paid in beer and vibes for projects like these, so the more I can ship these projects in the same place I prototype them the better.

(Also don't worry all, still have SF income to buy food for my family with)

by SpaceManNabs2 hours ago|

i get paid in vibes and chilling as well for some similar agent stuff i do for content creators.

quick question, how do you manage these side projects that kinda need to be production ready but aren't your actual SF job lol?

some of these people think they are my actual customer/client but like i do it for fun and to help them out.

by emvideo1 hours ago|

As a video content creator, I'm curious if you would mind sharing the agentic stuff you're doing for others?

by woah17 minutes ago|

Are they entering their OpenAI throw shit at the wall phase?

by llmslave2 hours ago|

This is going to grow into a sophisticated platform, and is what will eventually compete head-on with SaaS. I don't think companies will build their own agents, aside from looping in tools. As the models improve, there will be less hand-holding. This could end up competing with AWS/GCP.

by lurker9191 hours ago|

Exactly my thoughts. AWS is due for a ground-up rewrite from first principles to be able to fully utilize LLMs/agentic capabilities.

by siva71 hours ago|

What exactly makes you think that AWS & co. don't already have two competing Agents-as-a-Service platforms at any time?

by llmslave1 hours ago|

Anthropic is very far ahead on agentic engineering. There is more to getting it to work than it looks, and their models might be directly trained to know how to use the Claude Code harness.

But beyond that, AWS is a very complex platform. Agents simplify SaaS: the agent itself manages the API calls, maybe the database queries, and more of the logic. As software moves into the agent, you need less cloud capability and a better agent harness/hosting. Essentially, this makes the AWS platform obsolete; most services make much less sense.

by datadrivenangel2 hours ago|

And now OpenClaw is dead because serious people have a less janky option!

by lifecodes3 hours ago|

Managed Agents sounds like progress, but also like we’re standardizing around the current limitations instead of solving them.
