It just feels like many google products really, they are capable of really amazing things, it's just that nobody there seem to care. I would guess they are likely optimizing more for internal use than their vast userbase.
So I put together a plan for refactoring it, step by step, with tests, etc. After literally 8 solid days of fighting with Gemini 3 Pro, I still couldn't pull it off.
I gave GPT 5.5 a chance with the same prompt, plans, and repo. I'm not sure how long it took, but when I checked in on it a few hours later it was done. All tests passed, everything exactly how I'd asked, and better (it made some improvements).