4D dynamic scene reconstruction and free-viewpoint rendering offer innovative opportunities for creating immersive experiences, spanning virtual reality, telepresence, the metaverse, and 3D animation production. However, they pose significant challenges: lengthy training times, monocular viewpoints, complicated human-environment interactions, and high-quality editing and generation. To tackle these challenges, Jiawei's research studies how to reconstruct, edit, and generate 4D dynamic scenes.
To achieve fast 4D dynamic scene reconstruction, Jiawei proposes a novel fast deformable voxel radiance field that makes dynamic Neural Radiance Fields (NeRF) 100 times faster at modelling dynamic scenes, and he additionally devises a low-cost yet effective static-to-dynamic capture setup for real-world dynamic scenes. To model dynamic scenes with complex human-environment interactions, Jiawei designs a novel 360° free-viewpoint rendering method that reconstructs neural radiance fields for dynamic human-object-scene compositions from a single monocular in-the-wild video. He proposes novel object bones and state-conditional embeddings to tackle the challenges of complex human-object motions and interactions, and he collects a new challenging real-world dataset for evaluation.
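At a high level, a deformable voxel radiance field trades the deep MLP of a dynamic NeRF for explicit voxel grids in a shared canonical space, with a small network bending observation-space points at each time step back into that space; the cheap trilinear lookups are what buy the large speedup. The following is a minimal PyTorch sketch of this idea, not the published implementation: all names, network sizes, and grid resolutions are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DeformableVoxelField(nn.Module):
    """Minimal sketch of a deformable voxel radiance field (illustrative,
    not the published implementation). A small MLP warps observation-space
    points at time t back into a shared canonical space; density and colour
    features are then read from explicit voxel grids by trilinear
    interpolation, which is far cheaper than evaluating a deep NeRF MLP."""

    def __init__(self, grid_res=128, feat_dim=12):
        super().__init__()
        # Canonical voxel grids: one density channel plus colour features.
        self.density = nn.Parameter(torch.zeros(1, 1, grid_res, grid_res, grid_res))
        self.features = nn.Parameter(torch.zeros(1, feat_dim, grid_res, grid_res, grid_res))
        # Deformation MLP: (x, y, z, t) -> displacement into canonical space.
        self.deform = nn.Sequential(
            nn.Linear(4, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, 3),
        )

    def forward(self, xyz, t):
        """xyz: (N, 3) points in [-1, 1]^3; t: (N, 1) normalised time."""
        canonical = xyz + self.deform(torch.cat([xyz, t], dim=-1))
        # grid_sample takes a (1, N, 1, 1, 3) coordinate tensor and
        # returns (1, C, N, 1, 1) trilinearly interpolated values.
        coords = canonical.view(1, -1, 1, 1, 3)
        sigma = F.grid_sample(self.density, coords, align_corners=True)
        feat = F.grid_sample(self.features, coords, align_corners=True)
        return sigma.view(-1, 1), feat.view(feat.shape[1], -1).t()

# Query 1,024 random points at normalised time t = 0.5.
model = DeformableVoxelField()
xyz = torch.rand(1024, 3) * 2 - 1        # points in [-1, 1]^3
t = torch.full((1024, 1), 0.5)           # per-point frame time
sigma, feat = model(xyz, t)              # (1024, 1) density, (1024, 12) features
```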
Recently, Jiawei introduces dynamic NeRF as an innovative video representation for editing human-centric videos with large-scale motion and viewpoint changes. Together with a set of effective 3D-space editing designs, this method enables highly consistent human-centric video editing. To generate 4D dynamic scenes, Jiawei contributes to open-source text-to-video generation methods and works towards injecting 3D awareness into generation models.
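The consistency of this editing approach follows from the representation itself: an edit applied once to the canonical space is shared by every frame that the deformation field maps into it. The toy snippet below illustrates the principle by reusing the hypothetical DeformableVoxelField sketch above; the additive feature shift merely stands in for a real appearance edit (e.g., one driven by a text-to-image prior) and is purely illustrative.

```python
# Edit once in canonical space, then render any frame: every time step
# reads the same edited grids, so the edit is temporally consistent by
# construction (assumes the DeformableVoxelField sketch above is in scope).
model = DeformableVoxelField()
with torch.no_grad():
    model.features += 0.1  # stand-in for a real canonical-space appearance edit

xyz = torch.rand(256, 3) * 2 - 1
for t_val in (0.0, 0.5, 1.0):  # three different video frames
    t = torch.full((256, 1), t_val)
    sigma, feat = model(xyz, t)  # all frames see the same edited canonical space
```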