This metric highly depends on who uses the AI to do what, where strong emphasis is on "who" and "what".
In my line of work (software developer) the biggest time sinks are meetings where people need to align proposed solutions with the expectations of stakeholders. From that aspect AI won't help much, or at all, so measuring the difference of man hours spent from solution proposal to when it ends up in the test loops with and without AI would yield... very disappointing results.
But for troubleshooting and fixing bugs, or actually implementing solutions once they have been approved? For me, I'm at least 10x'ing myself compared to before I was using AI. Not only in pure time, but also in my ability to reason around observed behaviors and investigating what those observations mean when troubleshooting.
But I also work with people who simply cannot make the AI produce valuable (correct) results. I think if you know exactly what you want and how you want it, AI is a great help. You just tell it to do what you would have done anyway, and it does it quicker than you could. But if you don't know exactly what you want, AI will be outright harmful to your progress.
Another thing I don’t see mentioned is code quality.
Vibe-coded code bases are an excellent example of why LLMs aren’t very good at writing code. It will often correct its own mistakes only to make them again immediately after and Inconsistent pattern use.
Recently Claude has been making some “interesting” code style choices, not inline with the code base it’s currently supposed to be working on.
It's got a fun Zelda-inspired mechanic (I won't say which one), and you'll have to unlock abilities and parts of the world over several quests and modes to "win".
It's also multiplayer.
AI, and especially agentic AI can make you lose situational awareness over a codebase and when you're doing deep work that SUUUUCKS, but it's not useless, you just have to play to it's strengths. Though my favorite hill to die on is telling people not to underestimate it's value as autocomplete. Turns out 40 gigabytes of autocomplete makes for a fucking amazing autocomplete. Try it with llama.vim + qwen coder 30b, it feels like the editor is reading your mind sometimes and the latency is so low.