The upcoming deployment of 5G technology makes it feasible to transmit, with extremely low latency, the vast amounts of data needed for Augmented Reality (AR) and Mixed Reality (MR). This enables transformative applications such as 3D tele-immersive (or "3D facetime") communication, mobile AR apps using captured 4D human content, AI assistants with lifelike and personalized avatars, human-aware robots that can serve or work with humans, and tele-rehabilitation that connects physical therapists with wounded patients far from treatment facilities. All of these AR and MR applications require real-time 4D (space and time) capture and reconstruction of dynamic scenes involving human bodies, faces, body add-ons (such as clothes), and their surrounding environments. Despite prior research on real-time 4D reconstruction, there is still no open-source, robust reconstruction system that can model topological changes in dynamic scenes and track moving surfaces with accuracy and robustness. This project bridges that gap by developing an open-source, robust 4D reconstruction framework that can benefit researchers and developers in the broader scientific community as well as industry alliances, including 5G medical standards, AR and MR game engines, lifelike AI assistants, human-aware robotics, and tele-rehabilitation. The project also provides curriculum development and educational activities for graduate, undergraduate, and K-12 students through summer camps.

The research proposed for this project centers on the elegant modeling of topological changes in dynamic scenes and the robust tracking of moving surfaces, toward the ultimate goal of developing an open-source, robust 4D reconstruction framework for real-time capture of dynamic human scenes. To address topological changes, the volumetric fusion framework and its data structures will be fundamentally redesigned by introducing Non-manifold Volumetric Grids into both the Truncated Signed Distance Field (TSDF) and Embedded Deformation Graph (EDG) representations, allowing volumetric cells to replicate themselves and edges to be broken. This topology-change-aware framework lets the reconstructed mesh geometry update its connectivity on the fly, while a flexible deformation graph updates the connectivity between its nodes throughout the 4D capture process. To address robust surface tracking in 4D dynamic human capture, a Parameterized Animatable Volumetric Model (PAVM) is proposed that combines the benefits of a parametric human body model with those of the volumetric TSDF. The TSDF volumetric grids are built on top of the parameterized human body surfaces, so they can represent add-ons (e.g., clothes) to the human body. The regular volumetric structure of the PAVM makes it easy to integrate into deep neural networks for robust surface tracking, and it also provides semantic modeling capability. The technical feasibility of the 4D reconstruction framework will be validated by developing a Mobile 3D Facetime testbed, which will allow people in remote places to interact with each other in a natural AR fashion.
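The abstract describes these data structures only at a high level. As a purely illustrative aid, the minimal Python sketch below (not taken from the project; the names NonManifoldTSDFGrid, replicate, and cut_stretched_edges are hypothetical) shows one way the two key operations could be realized: TSDF cells keyed by an extra replica index so a cell can duplicate itself when a surface splits, and deformation-graph edges that are severed once they stretch past a threshold.

```python
import numpy as np

class NonManifoldTSDFGrid:
    """Sparse TSDF grid whose cells can replicate to model topology changes.

    Cells are keyed by integer grid coordinates plus a replica index, so two
    copies of the same spatial cell can hold independent signed-distance
    values once a surface splits apart (e.g., a hand lifting off the torso).
    This is an assumed sketch, not the project's actual implementation.
    """

    def __init__(self, voxel_size=0.01, truncation=0.03):
        self.voxel_size = voxel_size
        self.truncation = truncation
        self.cells = {}  # (ix, iy, iz, replica) -> (tsdf_value, weight)

    def integrate(self, point, signed_dist, replica=0, weight=1.0):
        """Fuse one truncated signed-distance observation into a cell."""
        idx = np.floor(np.asarray(point, dtype=float) / self.voxel_size).astype(int)
        key = (*idx.tolist(), replica)
        d = float(np.clip(signed_dist, -self.truncation, self.truncation))
        v, w = self.cells.get(key, (0.0, 0.0))
        self.cells[key] = ((v * w + d * weight) / (w + weight), w + weight)

    def replicate(self, key):
        """Duplicate a cell so the two replicas can evolve independently."""
        new_key = (*key[:3], key[3] + 1)
        while new_key in self.cells:  # find an unused replica index
            new_key = (*key[:3], new_key[3] + 1)
        self.cells[new_key] = self.cells[key]
        return new_key


class EmbeddedDeformationGraph:
    """EDG whose edges can be cut when tracked surfaces separate."""

    def __init__(self, nodes):
        self.nodes = np.asarray(nodes, dtype=float)  # rest-pose node positions
        self.edges = set()

    def connect(self, i, j):
        self.edges.add((min(i, j), max(i, j)))

    def cut_stretched_edges(self, deformed_nodes, stretch_ratio=1.5):
        """Break edges stretched beyond stretch_ratio times their rest length."""
        deformed = np.asarray(deformed_nodes, dtype=float)
        self.edges = {
            (i, j)
            for (i, j) in self.edges
            if np.linalg.norm(deformed[i] - deformed[j])
            <= stretch_ratio * np.linalg.norm(self.nodes[i] - self.nodes[j])
        }


# Toy usage: fuse one observation, replicate its cell at a detected cut,
# and sever a deformation-graph edge that has over-stretched.
grid = NonManifoldTSDFGrid()
grid.integrate([0.05, 0.02, 0.10], signed_dist=0.004)
grid.replicate(next(iter(grid.cells)))

graph = EmbeddedDeformationGraph(nodes=[[0, 0, 0], [0.1, 0, 0]])
graph.connect(0, 1)
graph.cut_stretched_edges(deformed_nodes=[[0, 0, 0], [0.3, 0, 0]])
assert not graph.edges  # edge was broken by the topology change
```

Keying cells by a replica index is one simple way to realize a non-manifold grid; a full framework would also need to re-mesh across replicas and re-bind graph nodes to the duplicated cells, which this sketch omits.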

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Advanced CyberInfrastructure (ACI)
Type: Standard Grant (Standard)
Application #: 2007661
Program Officer: Alan Sussman
Budget Start: 2020-09-01
Budget End: 2023-08-31
Fiscal Year: 2020
Total Cost: $499,997
Name: University of Texas at Dallas
City: Richardson
State: TX
Country: United States
Zip Code: 75080