Do we use the same Android Gemini assistant?
Because the one I use does that, and its object detection is smart enough to feel intuitive. It usually gets it right when I point at something on the screen. And when it doesn't, I can circle the thing or just click again.
Take this Instagram post, for example: it automatically highlighted the entire person, but I wanted to know about the shoes. I then clicked once on the shoes, and it knew exactly what I wanted and gave me the info in about 2 seconds:
This is useful to non-tech-savvy folks, not just to us hackers.