> Decode the mono video and use the disparity map to interpolate the view of either eye.
By "disparity map" are you thinking something like a heightmap applied to the scene facing the viewer and then you use that to skew things for each eye?
If so, how would that handle parts of the scene that are occluded/revealed to one eye but not the other?
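For concreteness, here's roughly what I picture, as a minimal sketch (a grayscale image and per-pixel disparity map as hypothetical inputs): shift every pixel sideways in proportion to its disparity, and anything revealed to that eye that the source camera never saw comes out as a hole.

```python
import numpy as np

def warp_to_eye(image, disparity, eye=+1):
    """Forward-warp a single captured view toward one eye using a
    per-pixel disparity map. eye=+1 shifts toward the right eye,
    eye=-1 toward the left. Returns the warped image plus a mask of
    holes: pixels revealed to this eye that the source never saw."""
    h, w = image.shape[:2]
    warped = np.zeros_like(image)
    zbuf = np.full((h, w), -np.inf)       # keep the nearest (largest-disparity) pixel
    for y in range(h):
        for x in range(w):
            d = disparity[y, x]
            xt = x + int(round(eye * d))  # sideways shift proportional to disparity
            if 0 <= xt < w and d > zbuf[y, xt]:
                warped[y, xt] = image[y, x]
                zbuf[y, xt] = d
    holes = ~np.isfinite(zbuf)            # nothing landed here: the disocclusion problem
    return warped, holes
```

The `holes` mask is exactly the occluded/revealed region I'm asking about.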
How does video encoding like H.264 handle parts of a scene that are occluded in one frame, but not occluded in the next frame?
A three-inch difference between two cameras producing simultaneous frames is similar to a three-inch sideways step of one camera between two frames in time.
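(For reference, the codec's answer is basically: predict whatever you can from a reference frame via motion compensation, and code the freshly revealed blocks from scratch as intra blocks plus residual. A toy sketch of that inter/intra decision, with made-up block size, search range, and threshold:)

```python
import numpy as np

def predict_frame(prev, cur, block=16, search=8, intra_threshold=1000):
    """Toy block-based motion compensation in the spirit of H.264 inter
    prediction (heavily simplified). Blocks that find a good match in the
    previous frame are predicted from it; blocks with no usable match --
    e.g. newly revealed content -- fall back to being coded intra."""
    h, w = cur.shape
    pred = np.zeros_like(cur)
    modes = []
    for by in range(0, h - block + 1, block):
        for bx in range(0, w - block + 1, block):
            target = cur[by:by + block, bx:bx + block].astype(int)
            best_sad, best = None, None
            # Search a small window around the block in the previous frame.
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    y0, x0 = by + dy, bx + dx
                    if 0 <= y0 <= h - block and 0 <= x0 <= w - block:
                        cand = prev[y0:y0 + block, x0:x0 + block].astype(int)
                        sad = np.abs(target - cand).sum()
                        if best_sad is None or sad < best_sad:
                            best_sad, best = sad, cand
            if best_sad is not None and best_sad < intra_threshold:
                pred[by:by + block, bx:bx + block] = best    # inter: reuse old pixels
                modes.append("inter")
            else:
                pred[by:by + block, bx:bx + block] = target  # intra: send fresh pixels
                modes.append("intra")
    return pred, modes
```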
True, occlusions would be a problem, but we're talking about fake autostereoscopic 3D here, where most of the stereo rigs used for capture have only a modest baseline. Almost all of the depth perception comes from disparity; occluded regions would still be perfectly visible with the averaging method I described, they'd just sit at the depth plane of the occluder, which is probably a good guess anyway. It's not as if your other eye would receive a correspondence for an occluded region in the real world either.
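As a purely illustrative stand-in (not necessarily the averaging described above), a naive fill for the holes from the earlier warp sketch could look like this; the filled-in pixels simply inherit whatever surface borders the hole on that scanline:

```python
def fill_holes_scanline(warped, holes):
    """Naive stand-in for hole filling: each disoccluded pixel copies the
    nearest filled neighbour to its left on the same scanline, so the
    filled-in region inherits the surface bordering the hole. Works on
    the (warped, holes) pair returned by warp_to_eye above."""
    out = warped.copy()
    h, w = out.shape[:2]
    for y in range(h):
        last_valid = None
        for x in range(w):
            if holes[y, x]:
                if last_valid is not None:
                    out[y, x] = last_valid   # extend the neighbouring surface into the hole
            else:
                last_valid = out[y, x]
    return out
```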
FYI, there's online software[1] to recreate 3D/stereoscopic imagery from the depth-enabled photos taken e.g. by the Moto G5S (which has a dual-camera setup that computes a depth map, but no API to extract/store the image taken by the second camera).
My personal opinion is that true stereoscopic images feel better when there's enough detail; those occlusions do matter. For some imagery it doesn't matter as much though.
By "disparity map" are you thinking something like a heightmap applied to the scene facing the viewer and then you use that to skew things for each eye?
If so, how would that handle parts of the scene that are occluded/revealed to one eye but not the other?