Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/184283
Title: Deep learning based prediction and planning for autonomous driving
Authors: Huang, Qihang
Keywords: Computer and Information Science
Issue Date: 2025
Publisher: Nanyang Technological University
Source: Huang, Q. (2025). Deep learning based prediction and planning for autonomous driving. Master's thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/184283
Abstract: We propose TrajDiffusion, a diffusion-based decoder model designed for trajectory prediction tasks involving single or multiple agents. Our decoder module has several distinctive features. First, as the input to the diffusion process relies on 20 anchor points sampled from real-world data, our model effectively captures diverse multimodal predictions. Second, by incorporating the DPM-Solver method into the diffusion model, we achieve high-quality predictions within just four inference steps. This significantly alleviates real-time constraints commonly faced when applying diffusion models to trajectory prediction tasks. Third, our approach is flexible, being applicable to both single-agent and multi-agent prediction scenarios. Additionally, we introduce a compact trajectory representation via Whitened Principal Component Analysis (PCA). Unlike standard PCA, which only removes correlations among components, Whitened PCA also normalizes each component to unit variance. This adjustment makes the trajectory representation better aligned with the isotropic Gaussian noise used in diffusion models, reducing the mismatch between the noise distribution and the data distribution. Consequently, it enhances the effectiveness of the denoising process and improves overall prediction performance. Finally, we present an optional constrained sampling function, enabling controlled trajectory generation. By computing differentiable cost functions based on generated trajectories and incorporating rules or physical priors, we guide trajectories toward more physically plausible outcomes. Our proposed TrajDiffusion framework can be readily integrated with existing encoder architectures, resulting in diverse outputs, fast inference speed, comprehensive utilization of encoder features, and excellent performance on the Argoverse 2 dataset.
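The Whitened PCA representation described in the abstract can be illustrated with a minimal NumPy sketch. This is not the thesis implementation: the function names, the synthetic random-walk trajectories, the 60-step horizon, and the choice of 8 components are all assumptions made for illustration. It shows only the stated idea, that projecting onto principal components and then dividing each coefficient by its standard deviation yields a code with roughly identity covariance, matching the isotropic Gaussian noise of a diffusion model.

```python
import numpy as np

def fit_whitened_pca(trajs, n_components):
    """Fit PCA on flattened trajectories of shape (N, T, 2).

    Returns the mean, the top principal directions, and the per-component
    standard deviation used for whitening. (Hypothetical helper.)
    """
    X = trajs.reshape(len(trajs), -1)
    mean = X.mean(axis=0)
    # SVD of the centered data; rows of Vt are the principal directions.
    _, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
    components = Vt[:n_components]
    # Standard deviation of the data along each principal direction.
    scale = S[:n_components] / np.sqrt(len(X) - 1)
    return mean, components, scale

def encode(trajs, mean, components, scale):
    """Project onto the components, then normalize each to unit variance."""
    X = trajs.reshape(len(trajs), -1)
    return (X - mean) @ components.T / scale

def decode(z, mean, components, scale, horizon):
    """Undo the whitening and map coefficients back to (N, horizon, 2)."""
    X = (z * scale) @ components + mean
    return X.reshape(len(z), horizon, 2)

# Synthetic stand-in data (assumption): 500 random-walk trajectories.
rng = np.random.default_rng(0)
trajs = np.cumsum(rng.normal(size=(500, 60, 2)), axis=1)

mean, comps, scale = fit_whitened_pca(trajs, n_components=8)
z = encode(trajs, mean, comps, scale)
cov = np.cov(z, rowvar=False)        # close to the 8x8 identity matrix
recon = decode(z, mean, comps, scale, horizon=60)
```

Without the division by `scale`, the coefficients would be uncorrelated but have very different variances, which is the mismatch with isotropic noise that the abstract refers to.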
URI: https://hdl.handle.net/10356/184283
Schools: School of Electrical and Electronic Engineering 
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: EEE Theses

Files in This Item:
File: Huang Qi Hang-Dissertation.pdf
Description: Restricted Access
Size: 1.68 MB
Format: Adobe PDF

Page view(s): 23 (updated on May 7, 2025)

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.