3D vision is the process of reconstructing a scene from images obtained from multiple cameras or from a single camera moving through the scene. However, current 3D vision is limited by three core issues: an emphasis on points rather than on curves and surfaces; sub-optimal integration of information from multiple images; and the need for richer semantic geometric structuring of the scene. This project aims to rectify all three shortcomings by developing new technologies that work with curves and surfaces, integrate information from multiple cameras simultaneously, and use semantic primitives to describe objects in the scene. These developments will complement, and add substantially to, the many existing techniques for 3D pose estimation and reconstruction, especially in textureless scenes such as man-made environments. Applications of this research include robotics, the entertainment industry, archaeology, architecture, urban modeling, and metrology.
This project develops an end-to-end technology for pose estimation and scene reconstruction using the differential geometry of curves and surfaces. First, it develops the technology necessary to use point-tangents, as opposed to just points, in the pose estimation process, a transformative step that reduces the number of necessary correspondences by using information that is already available. It will also identify the minimal problems in this area. Second, the approach allows for the integration of information from many cameras using intuitively defined geometric equations that extend easily beyond three cameras, which is important in applications such as visual odometry. Third, it develops a technology for surface reconstruction from a 3D curve graph acting as a scaffold, while also taking into account both unorganized point reconstructions in textured areas and a novel set of differential photometric constraints in textureless areas. Fourth, it integrates the image-based grouping process with a simultaneous 3D reconstruction process, resulting in a mid-level representation as a topologically connected set of curve fragments and surface fragments, which better matches the requirements of semantic and functional tasks.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.