That's relatively easy if you assume simple translation and rotation (smooth camera movement), as opposed to a squiggly path (e.g. from vibration or the camera being knocked). With smooth movement there are only a few motion parameters to find, so you can try candidate values, measure how much sharper the deblurred image gets, and home in on the right ones.
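To make the search idea concrete, here is a minimal 1-D sketch, not a real deblurring pipeline. It assumes a single scanline, a constant-velocity smear (so the blur is a plain box average), and no noise. The function names (`box_blur`, `unblur`, `edge_sparsity`, `estimate_blur_length`) are made up for illustration. One choice of mine that the comment above does not specify: the score is the L2/L1 ratio of the gradient rather than raw gradient energy, because a wrong guess produces ringing that raw edge strength would happily count as "sharp".

```python
from fractions import Fraction
import math


def box_blur(signal, length):
    """Smear a scanline: average each sample with the previous length-1 samples."""
    out = []
    for n in range(len(signal)):
        window = signal[max(0, n - length + 1): n + 1]
        out.append(Fraction(sum(window), length))  # exact arithmetic, no rounding
    return out


def unblur(blurred, length):
    """Invert box_blur exactly for an assumed smear length (noise-free toy only)."""
    out = []
    for n, value in enumerate(blurred):
        previous = out[max(0, n - length + 1): n]
        # length * value is the sum over the window; subtract the known samples.
        out.append(length * value - sum(previous))
    return out


def edge_sparsity(signal):
    """L2/L1 ratio of the gradient: high for a few clean edges, low for blur or ringing."""
    diffs = [b - a for a, b in zip(signal, signal[1:])]
    l1 = float(sum(abs(d) for d in diffs))
    l2 = math.sqrt(float(sum(d * d for d in diffs)))
    return l2 / l1 if l1 else 0.0


def estimate_blur_length(blurred, candidates):
    """Try each candidate smear length and keep the one whose result scores best."""
    return max(candidates, key=lambda length: edge_sparsity(unblur(blurred, length)))


# Hypothetical test scene: one bright bar on a dark background, smeared by 5 samples.
scene = [0] * 10 + [1] * 10 + [0] * 20
blurred = box_blur(scene, 5)
print(estimate_blur_length(blurred, range(1, 9)))  # prints 5
```

A real photo would need a 2-D kernel with an angle as well as a length, and a noise-tolerant deconvolution in place of the exact inverse, but the loop has the same shape: guess the motion, undo it, score the result, keep the best guess.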