Title: Sequence-to-sequence learning for motion prediction and generation
Authors: Wu, Shuang
Keywords: Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision; Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
Issue Date: 2022
Publisher: Nanyang Technological University
Source: Wu, S. (2022). Sequence-to-sequence learning for motion prediction and generation. Doctoral thesis, Nanyang Technological University, Singapore.
Abstract: Computational understanding and modelling of human motion has grown in importance over the last decade, with applications in sports science, animation, robotics, surveillance and autonomous driving. In this thesis, we use the sequence-to-sequence learning paradigm to study motion prediction and motion generation. We first examine multiple articulated pose representation schemes for integrating biomechanical constraints within computational motion models. Our theoretical analysis and empirical studies suggest that the kinematic tree representation with Stiefel manifold parametrizations is the most suitable. In motion prediction, we seek to generate future motion given an observed sequence. To handle long-term dependencies, we design a hierarchical recurrent network that simultaneously models local contexts and global characteristics. This attains better short-term accuracy along with natural motion predictions in the long term. On another front, we look to incorporate control into our prediction models. We employ multiple generative adversarial networks to model individual body parts, allowing for fine-grained control and tuning of the prediction spectrum. Finally, we reconsider motion prediction within the framework of stochastic differential equations, which allows the model weights to be interpreted as the stochastic diffusion matrix and drift parameters. For motion generation, we specifically study generating dance motion conditioned on music input. We introduce an optimal transport objective for evaluating the authenticity of generated dance distributions and a Gromov-Wasserstein objective to match dance with music. These objectives allow our model to synthesize realistic dance motion in harmony with the input music. Furthermore, we consider a dual learning framework to concurrently learn both music-to-dance and dance-to-music generation. By effectively integrating the information from both domains, dual learning boosts the performance of the individual tasks, delivering realistic, genre-consistent dance generations and viable music compositions.
Schools: School of Computer Science and Engineering 
Organisations: Bioinformatics Institute, A*STAR
Rights: This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).
Fulltext Permission: open
Fulltext Availability: With Fulltext
Appears in Collections: SCSE Theses

Files in This Item:
File: Thesis (WU Shuang).pdf
Size: 51.93 MB
Format: Adobe PDF

Download(s): 50 (updated on Sep 26, 2023)

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.