by spaceman_20204 hours ago|

Is there any task that actually doesn't require human intervention in between, even if it's just to set stuff up?

Like, I will get Opus to make me an app, but it will stop partway because I need to set up the db and plug in the API keys, and Opus really can't do that on its own yet.

by stingraycharles1 hours ago|

> Is there any task that actually doesn't require human intervention in between, even if it's just to set stuff up?

The goal is none. The current situation: everything that matters requires human intervention.

I think the end situation will be that LLMs will be able to perform decently well in a highly controlled and predictable environment.

by thereeldeel5 hours ago|

Will the Codex App support a new context window, rather than compaction, for "unrelated" sub-tasks during long-horizon tasks?

by dandaka18 hours ago|

Could be a great feature, can't wait to test! Tired of other models (looking at you, Opus) constantly getting stuck mid-task lately.

by winrid18 hours ago|

Interesting, I just had Opus convert a 35k LOC Java game to C++ overnight (a root agent that orchestrated and delegated to sub-agents), and I woke up to find it done and working.

What plan are you on? I'm starting to wonder if they're dynamically adjusting reasoning based on plan or something.

by gck116 hours ago|

I'm on Max 5x and noticed this too. I don't use the built-in subagents but rather a full Claude session that orchestrates other full Claude sessions. Worker agents that receive tasks now stop midway and ask for permission to continue. My "heartbeat" is basically a "status. One line" message sent to the orchestrator.

Opus 4.6 worker agents never asked for permission to continue, and when the heartbeat was sent to the orchestrator, it just knew what to do (checked on subagents, etc.). Now it just says that it is waiting for me to confirm something.
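
For anyone wondering what a heartbeat like that might look like in practice, here is a minimal sketch, not the actual setup described above. It assumes the orchestrator can be reached through some callable; `send`, `run_heartbeat`, `fake_orchestrator`, and the 600-second default interval are all hypothetical names and values made up for illustration, and no real Claude or Codex API is used.

```python
import time
from typing import Callable, List

# The heartbeat message quoted in the comment above.
HEARTBEAT_PROMPT = "status. One line"


def run_heartbeat(
    send: Callable[[str], str],
    beats: int,
    interval_s: float = 600.0,  # hypothetical default, not from the thread
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Send the heartbeat prompt `beats` times and collect the replies.

    `send` is a hypothetical stand-in for however messages reach the
    orchestrator session (CLI wrapper, socket, file drop, and so on).
    """
    replies: List[str] = []
    for i in range(beats):
        replies.append(send(HEARTBEAT_PROMPT))
        if i < beats - 1:
            sleep(interval_s)  # wait between beats, but not after the last one
    return replies


def fake_orchestrator(message: str) -> str:
    # Stand-in orchestrator that just acknowledges the message.
    return f"ack: {message}"


# Demo run with sleeping disabled so it finishes immediately.
demo_replies = run_heartbeat(fake_orchestrator, beats=3, sleep=lambda _s: None)
```

Passing `sleep` in as a parameter keeps the loop testable without real waiting; a real setup would swap `fake_orchestrator` for whatever actually delivers the message to the orchestrating session.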

by winrid7 hours ago|

Weird. I don't see this behavior, although I did with Codex and 5.4, haha. I bet the providers are playing with settings underneath and different users are routed to different deployments, or they're secretly routing us to different models under load.

by adamandsteve14 hours ago|

This has to be bait.

by azan_13 hours ago|

Why?

by winrid7 hours ago|

what?

by adamandsteve21 hours ago|

Because there’s no way in hell it can rewrite a game with 35k LOC perfectly lol. Link the codebase or it didn’t happen.

by frotaur17 hours ago|

I've been using the /ralph-loop plugin for Claude Code; it works well to keep the model hammering at the task.

by dannyw18 hours ago|

It's genuinely so great at long horizon tasks! GPT-5.5 solved many long-horizon frontier challenges, for the first time for an AI model we've tested, in our internal evals at Canva :) Congrats on the launch!

by brcmthrowaway16 hours ago|

Can we not do growth hacking here?

by RALaBarge15 hours ago|

We totally agree.

That's what I've been heads down, HUNGRY, working on, looking for investors and founding engineers pst: https://heymanniceidea.com (disclaimer: I am not associated with heymanniceidea.com)

by smallerize15 hours ago|

HN is owned by a startup accelerator and venture capital firm. They do growth hacking on the front page. And you probably know that since your throwaway account is several years old.

by 15 hours ago|

deleted

by bkyan15 hours ago|

Sorry, what is "heartbeats", exactly?

by gurjeet14 hours ago|

> Today we launched heartbeats in Codex: automations that maintain context inside a single thread over time.

https://x.com/pashmerepat/status/2044836560147984461

by bkyan13 hours ago|

Thanks!