One PR from Claude. The next PR from you fixing Claude’s mistakes.
The security fixes are the worst because the code looks correct. It's not like a typo you'd catch immediately - it's an auth check that works for 95% of cases but fails on edge cases the model never considered.
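To make it concrete, here's a hypothetical illustration of the shape these bugs take - the function and fields are made up for the example, not from our codebase:

```python
def can_edit(user, doc):
    # Reads correctly: only the owner may edit. Works for ~95% of requests.
    # Edge case: user.id is None for anonymous sessions and doc.owner_id
    # is None for imported/orphaned docs, so None == None lets the
    # request through.
    return user.id == doc.owner_id

def can_edit_fixed(user, doc):
    # Refuse unless both sides are real, matching identities.
    return (user.id is not None
            and doc.owner_id is not None
            and user.id == doc.owner_id)
```

A scanner sees an ordinary equality check there, which is exactly why this class of bug slips past SAST.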
To catch these, we layer a few things:
- Standard SAST (Semgrep, CodeQL) catches the obvious stuff but misses AI-specific patterns
- npm audit / pip-audit for dependency issues, especially non-existent packages the AI hallucinates (rough sketch of the existence check below)
- Custom rules tuned for patterns we keep seeing: overly permissive CORS, missing rate limiting, auth checks that look correct but have subtle logic bugs (toy version of the CORS check below)
- Manual review with a specific checklist for AI-generated code (different from our normal review checklist)
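For the hallucinated-dependency point, a rough sketch of the Python side of the idea: pip-audit covers known-vulnerable versions, so this only asks PyPI whether each requirement name exists at all. It assumes a plain `name==version` requirements file and is illustrative, not our production tooling:

```python
import sys
import urllib.error
import urllib.request

def package_exists(name: str) -> bool:
    """Return True if PyPI knows about this package name at all."""
    try:
        with urllib.request.urlopen(f"https://pypi.org/pypi/{name}/json", timeout=10):
            return True
    except urllib.error.HTTPError as e:
        if e.code == 404:  # no such project on PyPI
            return False
        raise

if __name__ == "__main__":
    for line in open(sys.argv[1]):
        req = line.split("#")[0].strip()  # skip comments and blank lines
        if not req:
            continue
        name = req.split("==")[0].split(">=")[0].strip()
        if not package_exists(name):
            print(f"possible hallucinated dependency: {name}")
```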
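The custom rules themselves are Semgrep YAML, but here's a toy standalone Python version of one of them (the wildcard-CORS-plus-credentials check) to show what gets matched. It only handles the bare `CORS(...)` call form; treat it as a sketch of the pattern, not the real rule:

```python
import ast
import sys

def find_permissive_cors(source: str, filename: str = "<input>"):
    """Yield (lineno, message) for CORS(...) calls that combine a wildcard
    origin with credentials - any origin can then make credentialed calls."""
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id == "CORS"):
            continue
        kwargs = {kw.arg: kw.value for kw in node.keywords if kw.arg}
        origins = kwargs.get("origins")
        creds = kwargs.get("supports_credentials")
        if (isinstance(origins, ast.Constant) and origins.value == "*"
                and isinstance(creds, ast.Constant) and creds.value is True):
            yield node.lineno, "wildcard CORS origin with credentials enabled"

if __name__ == "__main__":
    for path in sys.argv[1:]:
        with open(path) as f:
            for lineno, msg in find_permissive_cors(f.read(), path):
                print(f"{path}:{lineno}: {msg}")
```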
You're right that current tooling has a gap. Traditional scanners assume human-written code patterns. AI code looks structurally different - it tends to be more verbose but misses edge cases in ways humans wouldn't. We've been experimenting with scanning approaches specifically tuned for AI output.
The biggest win has been simple: requiring all AI-generated auth and input-validation code to go through a dedicated security reviewer, not just regular code review.