Google Engineers Launch "Sashiko" for Agentic AI Code Review of the Linux Kernel

(www.phoronix.com)

75 points

by speckx 4 hours ago |

29 comments

by rwmj 3 hours ago

Better to link to the site itself, or one of the reviews?

For an example of a review (picked pretty much at random) see: https://sashiko.dev/#/patchset/20260318151256.2590375-1-andr...

The original patch series corresponding to that is: https://lkml.org/lkml/2026/3/18/1600

Edit: Here's a simpler and better example of a review: https://sashiko.dev/#/patchset/20260318110848.2779003-1-liju...

I'm very glad they're not spamming the mailing list.

by jeffbee 2 hours ago

That is both really useful and a great example of why they should have stopped writing code in C decades ago. So many kernel bugs have arisen from people adding early returns without thinking about the cleanup functions, a problem that many other language platforms handle automatically on scope exit.
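
A minimal sketch of the scope-exit cleanup described above, written in Python rather than C purely for illustration. The `resource` helper and the `probe` function are hypothetical names, not kernel or Sashiko code; the only point is that the `with` block releases both resources even when the function returns early.

```python
from contextlib import contextmanager

events = []  # records acquire/release order for the demo


@contextmanager
def resource(name):
    """Hypothetical resource that logs acquisition and always releases."""
    events.append(f"acquire {name}")
    try:
        yield name
    finally:
        # Runs on every exit from the with-block, including early returns.
        events.append(f"release {name}")


def probe(fail_early):
    """Hypothetical function with an early-return error path."""
    with resource("lock"), resource("buffer"):
        if fail_early:
            return -1  # early return: both resources are still released
        return 0
```

Calling `probe(True)` leaves `events` with both releases recorded, in reverse order of acquisition. In C the same guarantee has to be written out by hand on every exit path, which is where the forgotten-cleanup bugs come from.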

by overfeed 2 hours ago

Must we do this on every thread about the Linux kernel?

by RobRivera 17 minutes ago

The beatings will continue until morale improves

by nurettin 3 minutes ago

> stopped writing code in C decades ago.

And what were they supposed to use in 2006? Free Pascal? Ada?

by tigen 59 minutes ago

This ought to help with that. https://thephd.dev/c2y-the-defer-technical-specification-its...

by withinrafael 1 hour ago

Looks cool, but this site is a bit difficult for me to grok.

I think the table might be slightly inside-out? The Status column appears to show internal pipeline states ("Pending", "In Review") that really only matter to the system, while Findings are buried in the column on the far right. For example, one reviewed patchset with a critical and a high finding is just casually hanging out below the fold. I couldn't immediately find a way to filter or search for severe findings.

It might help to separate unreviewed patches from reviewed ones, and somehow wire the findings into the visual hierarchy better. Or perhaps I'm just off base and this is targeting a very specific Linux kernel community workflow/mindset.

Just my 1c.

by tonfa 1 hour ago

I think it's just a dashboard, not meant to be used as is.

Reviewers are more likely to instead subscribe to get the review inline, and then potentially incorporate that with their feedback.

by fdghrtbrt 46 minutes ago

> difficult for me to grok

You sound like a troglodyte.

by kleiba 46 minutes ago

> Sashiko was able to find around 53% of bugs

That's cool. Another interesting metric, however, would be the false positive ratio: like, I could just build a bogus system that simply marks everything as a bug and then claim "my system found 100% of all bugs!"

In practice, a bug-finding system's precision matters as much as its recall: if human reviewers get spammed with piles of alleged bug reports by something like Sashiko, most of which turn out not to be bugs at all, that noise ties up resources and could undermine trust in the usefulness of the system.
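
To illustrate the recall-versus-precision point, here is a small sketch with made-up numbers. The patch counts and the two reviewers are assumptions for the example, not Sashiko measurements, and `precision_recall` is a hypothetical helper.

```python
def precision_recall(flagged, real_bugs):
    """Return (precision, recall) for a set of flagged patch ids."""
    flagged, real_bugs = set(flagged), set(real_bugs)
    true_positives = len(flagged & real_bugs)
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / len(real_bugs) if real_bugs else 0.0
    return precision, recall


patches = range(100)   # assumption: 100 reviewed patches
real_bugs = range(10)  # assumption: patches 0..9 actually contain a bug

# Bogus reviewer that marks everything as a bug: perfect recall, poor precision.
flag_all = precision_recall(patches, real_bugs)

# Selective reviewer: flags 5 real bugs plus 3 false alarms.
selective = precision_recall([0, 1, 2, 3, 4, 50, 51, 52], real_bugs)
```

Under these assumptions the flag-everything reviewer scores a recall of 1.0 but a precision of only 0.1, while the selective one trades recall (0.5) for much higher precision (0.625). A headline recall figure on its own cannot distinguish the two.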

by monksy 3 hours ago

I think this is a great and interesting project. However, I hope that they're not doing this to submit patches to the kernel. It would be much better to layer in additional tests to exploit bugs and defects for verification of existence/fixes.

(Also tests can be focused per defect.. which prevents overload)

From some of the changes I'm seeing: This looks like it's doing style and structure changes, which for a codebase this size is going to add drag to existing development. (I'm supportive of cleanups, but doing them on an automated basis is a bad idea.)

I.e. https://sashiko.dev/#/message/20260318170604.10254-1-erdemhu...

by rwmj 3 hours ago

No, it's reviewing patches posted on LKML and offering suggestions. The original patch posted corresponding to your link was this, which was (presumably!) written by a human:

https://lkml.org/lkml/2026/3/9/1631

by bjackman 3 hours ago

Style and structure are not the goal here; the reason people are interested in it is to find bugs.

Having said that, if it can save maintainers time it could be useful. It's worth slowing contribution down if it lets maintainers get more reviews done, since the kernel is bottlenecked much more on maintainer time than on contributor energy.

My experience with using the prototype is that it very rarely comments with "opinions"; it only identifies functional issues. So when you get false positives it's usually of the form "the model doesn't understand the code" or "the model doesn't understand the context" rather than "I'm getting spammed with pointless advice about C programming preferences". This may be a subsystem-specific thing, as different areas of the codebase have different prompts. (It may also be that my coding style happens to align with its "preferences".)

by mika-el 17 minutes ago

The separation between who writes and who reviews is the whole thing. I do the same at a smaller scale: one model writes code, a different model reviews it. Self-review misses things, for the same reason you don't review your own PRs.

by ChrisArchitect 2 hours ago

https://github.com/sashiko-dev/sashiko (https://news.ycombinator.com/item?id=47427996)

by 4fterd4rk 3 hours ago

oh god can we not

by smlacy 3 hours ago

What's your concern?

by htx80nerd 3 hours ago

Have you ever programmed with AI? It needs a lot of hand holding for even simple things sometimes. Forgets basic input, does all kinds of brain dead stuff it should know not to do.

>"good catch - thanks for pointing that out"

by lame-robot-hoax 3 hours ago

Can you clarify how, at all, that’s relevant to the article?

by ablob 2 hours ago

Both the curl and the SQLite project have been overburdened by AI bug reports. Unless the Google engineers take great care to review each potential bug for validity, the same fate might apply here. There has been a lot of news regarding open source projects being stuffed to the brim with low-effort and high-cost merge requests or issues. You just don't see all the work that is caused unless you have to deal with the fallout...

by tonfa 1 hour ago

This project has nothing to do with bug reports... it's an opt-in tool for reviewing proposed changes that kernel developers can decide to use (if they find it useful).

by jamesnorden 2 hours ago

Well, if it doesn't find anything it's just a waste of time at best.

by danielbln 10 minutes ago

Prevention paradox.

by asadm 3 hours ago

i think it's a skill.

by __tidu 3 hours ago

Well, tbf, code review is probably the most useful part of "AI coding": if it catches even a single bug you missed, it's worth it. Plus, false positives would waste dev time but not pollute the kernel.

by quantium1628 2 hours ago

b2b or b2c? feels like it could go either way

by shevy-java 2 hours ago

Now they want to kill the Linux kernel. :(

We've already seen how bug bounty projects were closed by AI spam; I think it was curl? Or some other project I don't remember right now.

I think AI tools should be required, by law, to verify that what they report is actually a true bug rather than some hypothetical, hallucinated context-dependent not-quite-a-real-bug bug.

by tonfa 1 hour ago

It's not forced upon anyone, it's a tool that patch authors or reviewers can use if they want to.

by qainsights 1 hour ago

They would have completely redesigned Google Gerrit.