Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/159102
Title: Sequence-to-sequence learning for motion prediction and generation
Authors: Wu, Shuang
Keywords: Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision
Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
Issue Date: 2022
Publisher: Nanyang Technological University
Source: Wu, S. (2022). Sequence-to-sequence learning for motion prediction and generation. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/159102
Abstract: Computational understanding and modelling of human motion has garnered increasing attention over the last decade, with applications spanning sports science, animation, robotics, surveillance and autonomous driving. In this thesis, we adopt the sequence-to-sequence learning paradigm to study motion prediction and motion generation. We first examine multiple articulated pose representation schemes for integrating biomechanical constraints into computational motion models. Our theoretical analysis and empirical studies suggest that the kinematic tree representation with Stiefel manifold parametrizations is most suitable. In motion prediction, we seek to generate future motion given an observed sequence. To handle long-term dependencies, we design a hierarchical recurrent network that simultaneously models local contexts and global characteristics, attaining better short-term accuracy along with natural long-term predictions. On another front, we incorporate control into our prediction models: we employ multiple generative adversarial networks to model individual body parts, allowing fine-grained control and tuning of the prediction spectrum. Finally, we reconsider motion prediction within the framework of stochastic differential equations, which allows model weights to be interpreted as the stochastic diffusion matrix and drift parameters. For motion generation, we specifically study generating dance motion conditioned on music input. We introduce an optimal transport objective for evaluating the authenticity of generated dance distributions and a Gromov-Wasserstein objective to match dance with music.
These objectives allow our model to synthesize realistic dance motion in harmony with the input music. Furthermore, we consider a dual learning framework to concurrently learn both music-to-dance and dance-to-music generation. By effectively integrating information from both domains, dual learning boosts the performance of the individual tasks, delivering realistic genre-consistent dance generations and viable music compositions.
URI: https://hdl.handle.net/10356/159102
Schools: School of Computer Science and Engineering
Organisations: Bioinformatics Institute, A*STAR
Rights: This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).
Fulltext Permission: open
Fulltext Availability: With Fulltext
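The abstract favours a kinematic tree representation with Stiefel manifold parametrizations. As a hedged illustration of what constraining pose parameters to the Stiefel manifold (matrices with orthonormal columns) can mean in practice — this is not the thesis's actual implementation — a network's unconstrained rotation output can be projected onto the manifold via the thin SVD:

```python
import numpy as np

def project_to_stiefel(A: np.ndarray) -> np.ndarray:
    """Project A onto the Stiefel manifold (orthonormal columns).

    For A of shape (n, p), the nearest matrix Q (in Frobenius norm)
    with Q.T @ Q = I_p is Q = U @ V^T, where A = U S V^T is the
    thin singular value decomposition.
    """
    U, _, Vt = np.linalg.svd(A, full_matrices=False)
    return U @ Vt

# Example: a noisy "rotation-like" matrix, as might come from a
# network regressing a joint rotation in a kinematic tree
rng = np.random.default_rng(0)
A = np.eye(3) + 0.1 * rng.standard_normal((3, 3))
Q = project_to_stiefel(A)
print(np.allclose(Q.T @ Q, np.eye(3)))  # → True: columns are orthonormal
```

The projection enforces the biomechanical constraint that joint rotations remain valid orthonormal frames, regardless of what the unconstrained network output looks like.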
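The abstract also introduces an optimal transport objective for comparing generated and real dance distributions. As a minimal sketch of the underlying machinery — entropy-regularized optimal transport computed with Sinkhorn iterations, not the thesis's actual objective or data — the following computes a transport plan between two toy point sets:

```python
import numpy as np

def sinkhorn(C, a, b, eps=0.5, iters=200):
    """Entropy-regularized optimal transport via Sinkhorn iterations.

    C: (n, m) cost matrix; a, b: source/target weights (each sums to 1).
    Returns the transport plan P and the transport cost <P, C>.
    """
    K = np.exp(-C / eps)        # Gibbs kernel
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(iters):      # alternating marginal scaling
        v = b / (K.T @ u)
        u = a / (K @ v)
    P = u[:, None] * K * v[None, :]
    return P, float((P * C).sum())

# Toy example: transport between two 1-D point sets
x = np.array([0.0, 1.0, 2.0])
y = np.array([0.5, 1.5])
C = (x[:, None] - y[None, :]) ** 2   # squared-distance cost
a = np.full(3, 1 / 3)                # uniform source weights
b = np.full(2, 1 / 2)                # uniform target weights
P, cost = sinkhorn(C, a, b)
```

The resulting plan P has marginals a and b, and the cost measures how far apart the two distributions are — the quantity an optimal transport objective would drive down during training.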
Appears in Collections: SCSE Theses
Updated on Sep 26, 2023