If a coworker dumped a 5k-line code review on you, you'd tell them to come back when it's broken down into smaller, reviewable chunks. Large dumps of code are basically unreviewable by humans, but it seems like a lot of people have forgotten about that when it comes to LLMs.

by roncesvalles 1 hour ago

You aren't allowed to block PRs for being too large anymore. The objective that every engineer should be 2x/3x/5x more productive can only be achieved if you go totally lax on code reviews.

Because if all your SWEs produce 5x more code, it also means they have to review 5x more code. But LLMs don't really help with code reviews. Then it becomes a Metcalfian paradox unless you just rubberstamp PRs, which is what is expected of you.

by vanuatu 36 minutes ago

it's pretty easy to point your terminal agent at your giant pr and ask it to break it up into small prs

if you're being asked to rubberstamp prs, that's a management skill issue

by trjordan 2 hours ago

I think it's worse than that. At least if I dumped 5k LoC on somebody in 2021, you knew I spent the time to write it, so it's "fair" to ask you to read it. But I didn't write it in 2026, so you shouldn't read it.

I think it's less about "break it down" and more about "let's communicate at the same altitude."

I wrote a (bait-titled) post about it: https://tern.sh/blog/stop-reading-prs/

by fusslo 2 hours ago

113 files +22913 −2423

305 files +15075 −13110

153 files +21934 −8698

125 files +28120 −2398

43 files +11188 −63

118 files +21564 −647

These are the largest (6 of 35) in the past 30 days. added: 190079 removed: 39696 in the last 6 months

from one person.
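
To put the six summaries above on one scale, here is a minimal sketch that parses them and totals each column. The line format (including the U+2212 minus sign) is copied from the comment; the function names are made up for illustration.

```python
import re

# Diff summaries copied from the comment above. The page uses U+2212 for the
# minus sign, so the pattern accepts both that and an ASCII hyphen.
STATS = """\
113 files +22913 −2423
305 files +15075 −13110
153 files +21934 −8698
125 files +28120 −2398
43 files +11188 −63
118 files +21564 −647
"""

LINE_RE = re.compile(r"(\d+) files \+(\d+) [−-](\d+)")


def parse_stats(text):
    """Return one (files, added, removed) tuple per summary line."""
    return [tuple(int(g) for g in m.groups()) for m in LINE_RE.finditer(text)]


def totals(rows):
    """Sum the files, added, and removed columns across all rows."""
    files = sum(r[0] for r in rows)
    added = sum(r[1] for r in rows)
    removed = sum(r[2] for r in rows)
    return files, added, removed


rows = parse_stats(STATS)
files, added, removed = totals(rows)
print(f"{len(rows)} changes: {files} files, +{added} / -{removed}")
# prints "6 changes: 857 files, +120794 / -27339"
```

By this count the six largest changes alone add 120794 lines, a little under two thirds of the 190079 added lines quoted for the six-month window.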

by evdubs 1 hour ago

I hope 99% of that was documentation and testing.

by darth_aardvark 2 hours ago

Breaking up a giant PR can be a tedious, time-consuming hassle, and in the past I could sympathize in practice if someone had a giant PR they didn't have time to decompose once they got it working.

But it's also the exact sort of thing that LLMs are literally perfect for in my experience so there's really no excuse anymore. I've never seen Claude fail to turn a 5k PR into a well-decomposed Graphite stack.

by xmodem 48 minutes ago

Hell, I've hand-written a large PR as a single commit and then asked Claude to break it down for me at least once. But I think doing this task by hand is a tedious, time-consuming hassle not because it inherently has to be, but because the tooling for it has barely changed in the past 15 years.

by win311fwg 2 hours ago

It is not so much forgetting as much as it is acceptance that when welcoming AI into a codebase, the code can no longer matter; that all that matters is that the properties of the system are validated. That isn't a change that comes free, so nobody should be expecting magic, it is a different set of tradeoffs. There is no such thing as a panacea.

by hootz 2 hours ago

I think they expect you to also use an LLM to review, and I bet they are doing exactly that when asked to review someone else's code.

by latentsea 1 hour ago

That gets you 90% of the way there. So it only really works if you accept the cruft and the risks associated with that last 10%. I've been doing this day in and day out for the last few months, and no matter how many automated reviews we run or how good they get, we still can't skip the manual ones.

by acedTrex 53 minutes ago

There's really no diff between a rubber stamp and an LLM review; they both do the same thing.

by cmrdporcupine 2 hours ago

> If a coworker dumped a 5k-line code review on you, you'd tell them to come back when it's broken down into smaller, reviewable chunks.

I would, and all my training at Google told me to do that. But what I found after I left that comfortable box was that somehow this kind of practice is acceptable in the industry at large and you're expected to just Deal With It(tm). 5k lines isn't even high by what I've seen.

Worse, the "code review" tools that people have access to in GitHub make it absolutely and totally unworkable to incrementally improve a change under review. Messy merge commits full of "responding to code review" comments. Threads impossible to follow. Just bad tooling.

So a lot of shops, from what I've seen, are just yeeting it with very shallow reviews.

This is my observation from before agentic AI. LLMs just threw kerosene on that dumpster fire.

by gavinh 58 minutes ago

I agree that reading AI code all day is agonizing. We're relying on code review to develop parts of our mental model of the system that were previously developed through coding. We're having more difficulty comprehending and recalling details of the system. This is probably unsurprising; people recall information they "generated" better than information they read. I am applying some lessons from pedagogy to extend code review. If this resonates with you, I would like to talk.

by mooreds 3 hours ago

Are there any products out there that are capturing the prompts/sessions? I imagine you could do it in an adhoc way, asking Claude to write up a summary of the session as part of the commit message. But is there anything else that's more structured/higher level?
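
For what the ad hoc version might look like, here is a rough sketch that appends a session summary to a commit message as git trailers (the `Key: value` lines at the end of a message that `git interpret-trailers` understands). The trailer keys `AI-Model` and `AI-Session-Summary`, the helper name, and the sample strings are all invented for this example; they are not an established convention.

```python
def with_session_trailers(message, model, summary):
    """Append hypothetical AI-session trailers to a commit message.

    The trailer keys are made up for illustration. Trailer values are
    collapsed onto one line so each stays a simple "Key: value" pair.
    """
    one_line = " ".join(summary.split())
    body = message.rstrip("\n")
    # A blank line separates the trailer block from the rest of the message.
    return f"{body}\n\nAI-Model: {model}\nAI-Session-Summary: {one_line}\n"


msg = with_session_trailers(
    "Split parser into lexer and grammar modules",
    "example-model",
    "Asked the agent to separate tokenizing\nfrom parsing; kept the public API unchanged.",
)
print(msg)
```

This only records what the author chooses to write down, so it is a note to the reviewer rather than a verifiable record of the session.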

by sdesol 1 hour ago

I am working on solving the AI Code Provenance problem and I believe my repos may be the first that provide AI code provenance. See the following example:

https://github.com/gitsense/gsc-cli/blob/main/internal/cli/r...

Notice how the code block header attributes the model. The UUID can be traced to the conversation so everybody can tell exactly how the code came about. For this to work though, you need to use my chat app as it ensures you can't tamper with things if you are truly serious about AI code provenance.

I also have a much more human-focused method, which is part of my CLI tool.

https://github.com/gitsense/gsc-cli

I am currently looking at making pi (https://github.com/earendil-works/pi) support AI code provenance, but for now if you want a more structured way to capture what you have done in an agent session that can be used in code reviews and be carried forward as knowledge that lives inside your repository, I have

gsc lessons

The basic idea is, after you have finished chatting/working with the agent, you would work with it to identify lessons worth carrying forward. You can store your session if you want, but really, the lessons should be something that can help you review code better and to prevent future mistakes.

I have a real working example at

https://github.com/gitsense/smart-ripgrep

This is a fork of the BurntSushi/ripgrep repository. It shows how you can use lessons to learn from past design decisions.

by trjordan 2 hours ago

We're working on it, though it's all early. I'd love feedback: https://tern.sh

First product compares the code to the prompts and highlights places the agent made decisions you weren't involved in: https://tern.sh/docs/tours/

by latentsea 1 hour ago

We just have a hook that runs on git push that instructs Claude to ensure the PR description is up to date. Works well enough for us.
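
A hook like that could be built around something like the sketch below. It is not the commenter's actual setup: it only covers the part git itself defines, namely that a `pre-push` hook receives one line per pushed ref on stdin in the form `<local ref> <local sha> <remote ref> <remote sha>`. The prompt wording and function names are assumptions, and the step that hands the prompt to an agent is left out on purpose.

```python
def parse_push_lines(text):
    """Parse pre-push stdin into (branch, sha) pairs, skipping deletions.

    Each line is '<local ref> <local sha> <remote ref> <remote sha>'.
    When a remote ref is being deleted, git reports an all-zero local sha.
    """
    updates = []
    prefix = "refs/heads/"
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 4:
            continue  # ignore anything that is not a well-formed ref line
        _local_ref, local_sha, remote_ref, _remote_sha = parts
        if set(local_sha) == {"0"}:
            continue  # deletion: there is nothing to describe
        branch = remote_ref[len(prefix):] if remote_ref.startswith(prefix) else remote_ref
        updates.append((branch, local_sha))
    return updates


def build_prompt(branch, sha):
    """Build the instruction for the agent (wording is an assumption)."""
    return (
        f"Branch '{branch}' is being pushed at commit {sha}. "
        "Compare the PR description with the current diff and update the "
        "description if it no longer matches the change."
    )


# Example input in the format git passes to a pre-push hook.
sample = (
    f"refs/heads/feature {'a' * 40} refs/heads/feature {'b' * 40}\n"
    f"(delete) {'0' * 40} refs/heads/old {'c' * 40}\n"
)
for branch, sha in parse_push_lines(sample):
    print(build_prompt(branch, sha))
```

In a real hook the sample string would be replaced by `sys.stdin.read()`, and each prompt would be passed to whatever agent command the team uses.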

by keybored 1 hour ago

Flintstone Engineering is applying Space Age synthetic intelligence (in a metaphorical sense) technology with code generation. Babysitting, version controlling, etc. generated code should be a thing of the past. But that is what GenAI is.

At the very least apply it at a higher level: specification, proofs, anything but generating Rust/Java/C and then letting yourself or an agent babysit it.

by agumonkey 53 minutes ago

the act, eval, adjust loop is probably neurologically important.. reading about things you didn't dive into is really a dread

depending on your industry, you might be able to ship half-slop and then fix some bugs downstream though