Is there any task that actually doesn't require human intervention in between, even if it's just to set stuff up?

Like, I will get Opus to make me an app, but it will stop partway because I need to set up the DB and plug in the API keys, and Opus really can't do that on its own yet.

reply
> Is there any task that actually doesn't require human intervention in between, even if it's just to set stuff up?

The goal is none. The current situation: everything that matters requires human intervention.

I think the end situation will be that LLMs will be able to perform decently well in a highly controlled and predictable environment.

reply
Will the Codex app support a new context window, rather than compaction, for "unrelated" sub-tasks during long-horizon tasks?
reply
Could be a great feature, can't wait to test! Tired of other models (looking at you, Opus) constantly getting stuck mid-task lately.
reply
Interesting. I just had Opus convert a 35k LOC Java game to C++ overnight (a root agent that orchestrated and delegated to sub-agents), and I woke up to find it done and working.

What plan are you on? I'm starting to wonder if they're dynamically adjusting reasoning based on plan or something.

reply
I'm on Max 5x and noticed this too. I don't use the built-in subagents; instead, I run a full Claude session that orchestrates other full Claude sessions. Worker agents that receive tasks now stop midway and ask for permission to continue. My "heartbeat" is basically a "status. One line" message sent to the orchestrator.

Opus 4.6 worker agents never asked for permission to continue, and when the heartbeat was sent to the orchestrator, it just knew what to do (checked on the subagents, etc.). Now it just says that it's waiting for me to confirm something.
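
For anyone who wants to picture the setup, here is a rough toy sketch of that heartbeat pattern in Python. Everything in it is hypothetical: `Worker`, `Orchestrator`, and `handle` are made-up names for illustration, the workers are plain objects rather than real agent sessions, and nothing here calls Claude Code or any other real tool. It only shows the control flow: a heartbeat arrives, the orchestrator checks every worker, unblocks any that stalled waiting for permission, and answers in one line.

```python
from dataclasses import dataclass, field
from typing import List

HEARTBEAT = "status. One line"  # the heartbeat message described above


@dataclass
class Worker:
    """Hypothetical stand-in for a full agent session working on one task."""
    name: str
    steps_left: int
    awaiting_permission: bool = False

    def tick(self) -> None:
        # Do one unit of work unless the worker is stalled waiting for approval.
        if not self.awaiting_permission and self.steps_left > 0:
            self.steps_left -= 1

    def status(self) -> str:
        if self.awaiting_permission:
            return f"{self.name}: waiting for permission"
        if self.steps_left == 0:
            return f"{self.name}: done"
        return f"{self.name}: {self.steps_left} steps left"


@dataclass
class Orchestrator:
    """Hypothetical orchestrator that reacts to heartbeat messages."""
    workers: List[Worker] = field(default_factory=list)

    def handle(self, message: str) -> str:
        if message != HEARTBEAT:
            return "unknown message"
        for worker in self.workers:
            # Assumption: the orchestrator approves stalled workers itself
            # instead of passing the question up to the human.
            worker.awaiting_permission = False
            worker.tick()
        # Reply with a single line summarizing every worker.
        return "; ".join(worker.status() for worker in self.workers)


orchestrator = Orchestrator([Worker("a", 2), Worker("b", 1, awaiting_permission=True)])
print(orchestrator.handle(HEARTBEAT))
```

The behavior complained about above would correspond to the orchestrator skipping the unblock step and just reporting "waiting for permission" back up the chain.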

reply
Weird. I don't see this behavior, although I did with Codex and 5.4, haha. I bet the providers are playing with settings underneath and different users are routed to different deployments, or they're secretly routing us to different models under load.
reply
This has to be bait.
reply
what?
reply
Because there’s no way in hell it can rewrite a game with 35k loc perfectly lol, link the codebase or it didn’t happen.
reply
I've been using the /ralph-loop plugin for claude code, works well to keep the model hammering at the task.
reply
It's genuinely so great at long-horizon tasks! In our internal evals at Canva, GPT-5.5 solved many long-horizon frontier challenges, a first for any AI model we've tested :) Congrats on the launch!
reply
Can we not do growth hacking here?
reply
We totally agree.

That's what I've been heads down, HUNGRY, working on, looking for investors and founding engineers pst: https://heymanniceidea.com (disclaimer: I am not associated with heymanniceidea.com)

reply
HN is owned by a startup accelerator and venture capital firm. They do growth hacking on the front page. And you probably know that since your throwaway account is several years old.
reply
deleted
reply
Sorry, what are "heartbeats", exactly?
reply
> Today we launched heartbeats in Codex: automations that maintain context inside a single thread over time.

https://x.com/pashmerepat/status/2044836560147984461

reply
Thanks!
reply