This research investigates the three-dimensional (3-D) perception of a dynamic scene from multisensory image sequences. The results should greatly enhance the design and operation of intelligent robotic systems (IRS) capable of predicting, detecting, and recovering from dynamic failures. The objective is to identify and locate each object present in the workspace of the IRS over a vast dynamic range of 3-D scene conditions. The major problems arise from: i) self-occlusion of 3-D opaque objects; ii) the inherent loss of depth (3-D) information introduced by the video imaging process; iii) the inevitability of secondary occlusions when there is more than one object in the scene. Two observations are necessary to counter self-occlusion and to compensate for the loss of depth information. However, three images are necessary to uniquely solve a similar set of nonlinear equations in the 3-D motion analysis of a monocular image sequence. Thus, the IRS must use three or more image sensors at distinctly different vantage (view) points. The span of these vantage points is called the spatial aperture of the vision system. A major bottleneck lies in the registration of images over the spatial aperture. It can be simplified by: i) spatially fusing the temporal (dynamic) information contained in each image sequence; ii) introducing additional constraints using a laser beam when option (i) is not feasible. A theoretical framework will be developed for the bilateral propagation of geometric constraints over the spatial aperture and the temporal window to enhance the operational capabilities of the vision system. A hybrid range-intensity image-sequence sensor has already been designed to provide the test environment required to validate the results experimentally.
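As context for the three-image claim above, the following is a minimal sketch of the standard rigid-body motion and perspective-projection equations that such a monocular 3-D motion analysis must solve; the symbols R, t, f, and the point indexing are illustrative conventions, not notation taken from this proposal:

    % Rigid-body motion of an object point p_k between frames k and k+1;
    % R is a 3x3 rotation (3 unknown parameters), t a 3-vector translation.
    \mathbf{p}_{k+1} = R\,\mathbf{p}_k + \mathbf{t}

    % Perspective projection with focal length f maps p_k = (x_k, y_k, z_k)
    % to an image point; the depth z_k is not directly observed.
    (u_k, v_k) = \bigl( f x_k / z_k,\; f y_k / z_k \bigr)

Each observed image point contributes two equations while introducing one unknown depth, every frame-to-frame motion adds six motion unknowns, and the overall scale is unobservable from a single monocular sequence; the resulting nonlinear system can therefore admit multiple solutions at small frame counts, which is the counting argument behind the three-image requirement cited above.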