And I'm sure the rewrite is going to teach me a whole different set of lessons...
Not sure why good coverage wouldn't mitigate risk in a refactor...
My mantra whenever I'm working with AI is that I want it to know what "point b" looks like and be able to tell by itself whether it's gotten there...
If you have a working implementation, it sounds like you have a basis for automated tests to be written... once you have that (assuming that the tests are written to test the interface rather than the implementation), then it should be fairly direct to have an agent extract and decompose...
For example, consider a lint rule that bans Kysely queries on certain tables from existing outside of a specific folder. You'd write a rule like this in an effort to pull reads and writes on a certain domain into one place, hoping you can just hand the lint violations to your AI agent and it would split your queries into service calls as needed.
And at first, it will appear to have Just Worked™. You are feeling the AGI. Right up until you start to review the output carefully. Because there are now little discrepancies in the new queries written (like not distinguishing between calls to the primary vs. the replica, missing the point of a certain LIMIT or ORDER BY clause, failing to appropriately rewrite a condition or SELECT, etc.) You run a few more reviewer agent passes over it, but realize your efforts are entirely in vain... because even if the reviewer agent fixes 10 or 20 or 30 of the issues, you can still never fully trust the output.
As someone with experience in doing this kind of thing before AI, I went back to doing it the old way: using a codemod to rewrite the code automatically using a series of rules. AI can write the codemod, AI can help me evaluate the results, but actually having it apply all of the few hundred changes automatically led to a lack of my ability to trust the output. And I suspect that will continue to be true for some time.
This industry needs a "verification layer" that, as far as I know, it does not have yet. Some part of me hopes that someone will reply to this comment with a counterexample, because I could sorely use one.
A really screwed code base blows out your context window and just starts burning tokens as the AI works out a way to kill -9 itself to escape the hell you're subjecting it to.