Please use this identifier to cite or link to this item:
https://hdl.handle.net/10356/184239
Title: | Advancements in zero-shot learning for skeleton-based action recognition |
Authors: | Peng, Han |
Keywords: | Computer and Information Science |
Issue Date: | 2025 |
Publisher: | Nanyang Technological University |
Source: | Peng, H. (2025). Advancements in zero-shot learning for skeleton-based action recognition. Master's thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/184239 |
Project: | ISM-DISS-04792 |
Abstract: | This paper presents an advanced approach to Zero-Shot Learning (ZSL) for skeleton-based action recognition, aiming to improve the model's ability to recognize unseen actions without extensive labeled training data. Skeleton-based action recognition has gained significant attention due to its robustness to environmental variations and its compact data representation. However, traditional methods often struggle to generalize to new, unseen actions. To address this challenge, we propose a novel framework that integrates semantic embeddings with skeleton features through a disentangled latent space. Our method uses a fine-grained formulation strategy to partition skeleton features into body parts and generates a detailed text description for each part using a Large Language Model (LLM). These descriptions are then aligned with the skeleton features through a cross-modal alignment module, which employs Variational Autoencoders (VAEs) and adversarial training to disentangle semantic-related and semantic-irrelevant components. We further design a multi-stream classifier that uses both global and part-level features for action recognition. Extensive experiments on the NTU RGB+D and NTU RGB+D 120 datasets show that our approach surpasses state-of-the-art methods in both 2-stream and multi-stream ZSL classification settings. Our results highlight the importance of fine-grained feature partitioning and semantic alignment in improving the generalization capability of skeleton-based action recognition models. |
URI: | https://hdl.handle.net/10356/184239 |
Schools: | School of Electrical and Electronic Engineering |
Fulltext Permission: | restricted |
Fulltext Availability: | With Fulltext |
Appears in Collections: | EEE Theses |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Peng Han-Dissertation_signed.pdf (Restricted Access) | | 5.53 MB | Adobe PDF | View/Open |