I'm also experimenting with it more and more. Now I'm trying to create a 2D side-scrolling shooter with it, running in the browser. When the project was relatively small, it did a good job. As the codebase and the docs/ files I'm using grow larger, it starts hallucinating, especially once context usage reaches about 50% (Codex w/ gpt5.5). As in, it'll literally forget to update parts of the code.

E.g., I changed the velocity of the player to '200' and of bullets to '300', and it only updated the bullet velocity. Then it told me the player was already 'at the correct value', even though it was still set to 150. Things like that... :)
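The kind of change involved would look something like this (a hypothetical sketch; the constant names are invented for illustration, not taken from the actual game code):

```typescript
// Hypothetical game constants; the model was asked to update both,
// but only changed BULLET_VELOCITY and claimed PLAYER_VELOCITY
// (previously 150) was already correct.
const PLAYER_VELOCITY = 200; // pixels per second
const BULLET_VELOCITY = 300; // pixels per second

console.log(PLAYER_VELOCITY, BULLET_VELOCITY); // → 200 300
```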

reply
For me, unless there is a concrete way of proving work is correct, you can't rely on AI coding. tsz has super strict tests around correctness, performance, and architectural boundaries.
reply
If I understood you correctly, I think I'm less extreme than that. Most code written by humans is also not provably correct. But I'm assuming you mean provably correct like Lean: https://lean-lang.org/, and not just "passes tests".

If you mean 'passes tests', that can be tackled by AI. Although AI writing its own tests and then implementing its own code is definitely not a foolproof strategy.

reply
More or less. The tsz solver is pure enough (it doesn't know about the AST) that it might be possible to formally validate it. But in my case I'm lucky to have tsc as a baseline: anything that produces different output than tsc is a bug.
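The "anything different from tsc is a bug" approach is classic differential testing, which can be sketched like this (the `compileWithTsc`/`compileWithTsz` functions are toy stand-ins invented for illustration; in practice each would invoke the real compiler):

```typescript
// Differential testing against a trusted baseline: run the same input
// through both implementations and treat any divergence as a bug.
type Compile = (source: string) => string;

// Toy stand-ins for illustration only, not the real compiler APIs.
const compileWithTsc: Compile = (src) => src.trim(); // baseline
const compileWithTsz: Compile = (src) => src.trim(); // candidate

// Any divergence from the baseline output is, by definition, a bug.
function differentialCheck(
  src: string,
  baseline: Compile,
  candidate: Compile
): boolean {
  return baseline(src) === candidate(src);
}

console.log(differentialCheck("let x = 1;  ", compileWithTsc, compileWithTsz)); // → true
```

The appeal of this setup is that the test oracle comes for free: you never have to specify correct behavior yourself, only feed both implementations the same inputs and diff the outputs.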
reply
>25k commits in 4 months or about 1 commit every 7 minutes

How do you manage/orchestrate this? I'm genuinely curious.

reply
Multiple computers, each running multiple Claude Code or Codex sessions. It had lots of ups and downs. Now I have a good enough test harness that it's easier to iterate fast.
reply
Do you not run out of things to code?
reply
Code is not the goal. What code does is.
reply