Please use this identifier to cite or link to this item:
https://hdl.handle.net/10356/182732
Title: Efficient perception methods for autonomous driving via bird's-eye-view representation
Authors: Li, Yu Xin
Keywords: Computer and Information Science
Issue Date: 2025
Publisher: Nanyang Technological University
Source: Li, Y. X. (2025). Efficient perception methods for autonomous driving via bird's-eye-view representation. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/182732

Abstract:

The advent of autonomous driving technologies marks a significant leap forward in the evolution of transportation systems, promising to enhance vehicle safety, efficiency and navigation capabilities. Central to these advancements is the development of sophisticated perception systems capable of interpreting complex and dynamic environments. Bird’s-Eye-View (BEV) perception, in particular, has emerged as a pivotal technology due to its ability to amalgamate data from multiple sensors into a coherent top-down view of the vehicle’s surroundings. This thesis addresses the critical challenges associated with BEV perception, particularly the computational demands and efficiency of integrating data from diverse sensor modalities in real-world automotive applications.

The initial study presented in this thesis, BEVENet, challenges the traditional reliance on Vision Transformers (ViTs), which, despite their ability to capture global semantic information, impose significant computational burdens. BEVENet advocates a convolutional neural network (CNN)-based approach, tailored to enhance computational efficiency without compromising the accuracy and speed required for real-time perception in autonomous vehicles. By redesigning the BEV perception framework to use CNNs exclusively, this study achieves substantial reductions in GPU memory usage and computational complexity.
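The thesis itself details the BEVENet architecture; as a hedged illustration of the underlying BEV idea only, the sketch below lifts per-pixel camera features into a top-down grid with plain NumPy. The function name `splat_to_bev` and all parameters are hypothetical and are not taken from the thesis.

```python
import numpy as np

def splat_to_bev(cam_feats, depths, intrinsics_inv, bev_size=(50, 50), cell=1.0):
    """Scatter per-pixel camera features into a top-down BEV grid.

    cam_feats: (H, W, C) image features; depths: (H, W) metric depth per
    pixel; intrinsics_inv: (3, 3) inverse camera intrinsics. Features that
    land in the same BEV cell are summed (simple sum pooling).
    """
    H, W, C = cam_feats.shape
    # Back-project every pixel to a 3D camera-frame point: X = d * K^-1 [u, v, 1]^T
    us, vs = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([us, vs, np.ones_like(us)], axis=-1).reshape(-1, 3)  # (H*W, 3)
    pts = (pix @ intrinsics_inv.T) * depths.reshape(-1, 1)              # (H*W, 3)

    # Quantize ground-plane coordinates (x right, z forward) into BEV cells.
    gx = np.floor(pts[:, 0] / cell).astype(int) + bev_size[1] // 2
    gz = np.floor(pts[:, 2] / cell).astype(int)
    keep = (gx >= 0) & (gx < bev_size[1]) & (gz >= 0) & (gz < bev_size[0])

    # Accumulate features cell by cell (handles collisions correctly).
    bev = np.zeros((*bev_size, C))
    np.add.at(bev, (gz[keep], gx[keep]), cam_feats.reshape(-1, C)[keep])
    return bev
```

In a full pipeline the resulting BEV tensor would then be refined by convolutional layers, which is the CNN-only route the abstract describes.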
The results demonstrate that BEVENet not only matches but often surpasses the performance metrics of state-of-the-art methods, achieving superior inference speeds and reducing computational overhead, thereby making it well-suited for deployment in vehicles with limited computational resources.

Building on the groundwork laid by BEVENet, the second part of this thesis, BEVPruner, advances sensor fusion techniques within the BEV framework. This study introduces a novel approach to data pruning, which strategically processes inputs from multimodal sensors to eliminate redundant data without sacrificing the quality of perception. This content-aware pruning method significantly reduces the computational load by selectively focusing on regions of the environment that are crucial for the perception tasks at hand, thereby optimizing the efficiency of data integration and processing. Experimental results indicate that this approach can reduce model complexity by 35% while maintaining competitive performance with state-of-the-art systems, suggesting a scalable solution for enhancing the computational efficiency of sensor fusion in autonomous vehicles.

The third and final study, QuadBEV, explores the integration of multiple perception tasks into a single, cohesive BEV framework. This multitask learning approach addresses the challenge of operational inefficiency in traditional systems by combining tasks such as 3D object detection, lane detection, map segmentation, and occupancy prediction into one unified system. QuadBEV leverages shared spatial and contextual information across these tasks to minimize redundant computations and optimize overall system performance. A tailored training strategy is employed to manage the unique learning-rate sensitivities and potential conflicts between different task objectives, facilitating a harmonious integration that enhances the overall efficacy and robustness of the perception system.
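The abstract does not specify how the content-aware pruning is implemented; as a hedged sketch of the general technique, the snippet below scores each BEV cell by its feature norm and keeps only the most salient fraction. The function name `prune_bev_cells`, the norm-based scoring, and the 0.65 keep ratio (loosely mirroring the reported ~35% complexity reduction) are all illustrative assumptions, not the thesis's method.

```python
import numpy as np

def prune_bev_cells(bev, keep_ratio=0.65):
    """Content-aware pruning of a (Z, X, C) BEV feature map.

    Each cell is scored by the L2 norm of its feature vector; only the
    top `keep_ratio` fraction of cells is kept, the rest are zeroed so
    that downstream layers can skip them.
    """
    Z, X, C = bev.shape
    scores = np.linalg.norm(bev.reshape(-1, C), axis=1)   # saliency per cell
    k = max(1, int(round(keep_ratio * Z * X)))            # number of cells kept
    keep_idx = np.argsort(scores)[-k:]                    # top-k by saliency

    mask = np.zeros(Z * X, dtype=bool)
    mask[keep_idx] = True
    pruned = np.where(mask.reshape(Z, X, 1), bev, 0.0)
    return pruned, mask.reshape(Z, X)
```

A real system would score cells with a learned module and use sparse operations rather than zero-masking, but the control flow (score, rank, keep the informative regions) is the same.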
The framework’s effectiveness is validated through extensive testing, confirming its capability to operate effectively in real-world scenarios.

In summary, this thesis presents a series of innovative studies that collectively enhance the efficiency, accuracy and practicality of BEV-based perception systems for autonomous driving. Through methodological advancements and rigorous testing, it establishes new benchmarks for the deployment of these technologies in real-world settings, significantly contributing to the field of autonomous vehicle perception and paving the way for future research and development in this critical area of automotive technology.

URI: https://hdl.handle.net/10356/182732
DOI: 10.32657/10356/182732
Schools: College of Computing and Data Science
Rights: This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).
Fulltext Permission: open
Fulltext Availability: With Fulltext
Appears in Collections: CCDS Theses
Files in This Item:
File | Description | Size | Format
---|---|---|---
final_thesis_bev_20250219_pure.pdf | | 20.33 MB | Adobe PDF