In recent years, the rapid advancement of AI technology has significantly transformed the way people interact with machines. Progress in perceptual models enables machines to better understand what is happening in the real world. As these technologies mature, there is growing demand for complete avatars that can not only accept various types of user input but also combine user instructions with their own personas to produce speech, movement, and interactions with other elements of the 3D virtual world.
Mingyuan’s work focuses on building an automated avatar system in a 3D world. Specifically, he tackles two challenges. First, multi-modal behavior generation: by providing inputs such as video, text, or audio, users can request that the avatar exhibit corresponding behaviors, including interactions with other avatars and with objects in the environment. Second, personality-based behavior generation: leveraging the capabilities of a large language model, the avatar can autonomously generate behaviors consistent with user-specified personality information.
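To make the two challenges concrete, the sketch below shows one plausible way an avatar system could fuse multi-modal user input with a user-set persona into a single prompt for a behavior-planning LLM. This is a minimal illustration only: the class and function names (`Persona`, `MultiModalRequest`, `build_behavior_prompt`) are hypothetical and not taken from Mingyuan's system, and the raw audio/video modalities are assumed to have already been converted to text (a transcript and a scene caption) by upstream perceptual models.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Persona:
    """User-specified personality information for the avatar (hypothetical)."""
    name: str
    traits: List[str]

@dataclass
class MultiModalRequest:
    """One user request; any subset of modalities may be present.
    Audio and video are assumed pre-processed into text by perception models."""
    text: Optional[str] = None
    audio_transcript: Optional[str] = None
    video_caption: Optional[str] = None

def build_behavior_prompt(persona: Persona, request: MultiModalRequest) -> str:
    """Fuse persona and multi-modal inputs into one prompt for an LLM
    that plans the avatar's next behavior (speech, motion, interaction)."""
    parts = [f"You are {persona.name}. Your traits: {', '.join(persona.traits)}."]
    if request.text:
        parts.append(f"User instruction: {request.text}")
    if request.audio_transcript:
        parts.append(f"Heard audio: {request.audio_transcript}")
    if request.video_caption:
        parts.append(f"Observed scene: {request.video_caption}")
    parts.append("Respond with the avatar's next speech and motion plan.")
    return "\n".join(parts)

persona = Persona(name="Ava", traits=["cheerful", "curious"])
request = MultiModalRequest(
    text="wave to the avatar on your left",
    video_caption="another avatar stands nearby",
)
print(build_behavior_prompt(persona, request))
```

A real system would send the resulting prompt to an LLM and decode its reply into motion and speech commands; the sketch stops at prompt assembly, which is where the persona conditioning and modality fusion described above actually happen.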