upvote
> You can't "test" comments.

I'm thinking that we're approaching a world where you can both test for comments and test the comments themselves.

reply
Now that would be really interesting: prompt an LLM to find comments that misrepresent the code! I wonder how many false positives that would bring up?
reply
I have a Claude Code skill for adding, deleting and improving comments. It does a decent job at detecting when comments are out of date with the code and updating them. It's not perfect, but it's something.
reply
Literate programming failed because people treated long essays as the source of truth instead of testable artifacts.

Make prose runnable and minimal: focus narrative on intent and invariants, embed tiny examples as doctests or runnable notebooks, enforce them in CI so documentation regressions break the build, and gate agent-edited changes behind execution and property tests like Hypothesis and a CI job that runs pytest --doctest-modules and executes notebooks because agents produce confident-sounding text that will quietly break your API if trusted blindly.

reply
I don't buy that. Writing is taking a bad rap from all this. Writing _is_ a form of more intense reading. Reading on steroids, as they say. If reading is considered good, writing should be considered better.
reply
Writing in that draft style is really only useful because a) you read the results and b) you write an improved version at the end. Drafting forever is not considered "better" because someone (usually you) has to sift through the crap to find the good parts.

This is especially pronounced in the programming workplace, where the most "senior" programmers are asked to stop programming so they can review PRs.

reply
You're right that you can't test comments, but you can test the code they describe. That's what reproducibility bundles do in scientific computing ;; the prose says "we filtered variants with MAF < 0.01", and the bundle includes the exact shell command, environment, and checksums so anyone can verify the prose matches reality. The prose becomes a testable claim rather than a decorative comment. That said, I agree the failure mode of literate programming is prose that drifts from code. The question is whether agents reduce that drift enough to change the calculus.
reply