You can do it using the more classic technique of photogrammetry. There are also commercial products used by real estate salesmen to produce high-quality "games" where you walk around inside a house, but they're more like Google Street View, where you swoosh between points where a 360-degree photo was taken. All of those will be more faithful to the real place than neurally generating the next frame from previous frames and control input.