Please use this identifier to cite or link to this item:
https://hdl.handle.net/10356/182916
Title: | Advancing 3D scene understanding through discriminative and generative learning approaches |
Authors: | Tang, Zhe Jun |
Keywords: | Computer and Information Science |
Issue Date: | 2025 |
Publisher: | Nanyang Technological University |
Source: | Tang, Z. J. (2025). Advancing 3D scene understanding through discriminative and generative learning approaches. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/182916 |
Abstract: | This thesis explores the crucial role of Computer Vision in endowing computers with general intelligence, focusing on developing algorithms that enable machines to perceive and understand their three-dimensional surroundings. The research is divided into two parts: discriminative and generative learning approaches, with three core chapters formulating 3D scene understanding. From a discriminative learning perspective, a novel approach to point cloud segmentation is devised, which is crucial for road scene perception. The proposed method processes point clouds as a whole while retaining local information, achieving high accuracy in segmenting objects from scenes despite the computational challenges of processing large input data. The generative learning approach focuses on generating entire 3D scenes from 2D images. Prior art methods in rendering 3D scenes via volumetric rendering are studied, and an end-to-end learning approach with transformers is proposed as an alternative to physics-based approaches. Novel methods to capture lighting information of scenes, inspired by modern game engines, are devised to improve rendering quality. Further investigation into new rendering methods with rasterisation of 3D Gaussian spheres is conducted, along with a different method for capturing lighting information to enhance rendering quality. The research contributes to the overarching goal of helping computers perceive and interact with the 3D world, offering numerous advantages for downstream applications such as autonomous vehicles, augmented reality, and virtual collaboration. |
URI: | https://hdl.handle.net/10356/182916 |
DOI: | 10.32657/10356/182916 |
Schools: | College of Computing and Data Science |
Rights: | This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). |
Fulltext Permission: | open |
Fulltext Availability: | With Fulltext |
Appears in Collections: | CCDS Theses |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
ZJTang_PhdThesis-DRNTU-2.pdf | | 47.09 MB | Adobe PDF |
Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.