Please use this identifier to cite or link to this item:
Title: Smart object counter
Authors: Kang, Xinhui
Keywords: Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision
Issue Date: 2022
Publisher: Nanyang Technological University
Source: Kang, X. (2022). Smart object counter. Final Year Project (FYP), Nanyang Technological University, Singapore.
Project: SCSE21-0244
Abstract: With the world’s urban population increasing drastically over the past decades, overcrowded cities need effective measures in areas such as crowd control, surveillance, and dynamic traffic planning. This project on object counting focuses on crowd counting and vehicle counting. The use case is also extended to microscopic cell counting, which assists medical and biological research and increases throughput. The project aims to apply a density-estimation-based approach with convolutional neural networks to develop an efficient, accurate and robust deep learning model for counting in highly congested scenes. The ShanghaiTech, TRaffic ANd COngestionS (TRANCOS) and P. vivax Malaria datasets are used for training and testing. The project assesses the effectiveness of the Inception_v4 and Inception-Resnet-v1 building blocks. The Inception modules consist of different kernel sizes and can extract multi-scale information, and the skip connections in the ResNet design can alleviate the vanishing gradient problem. The combination of the two allows the model to be both wider and deeper, and hence able to recognise more complex features. This report reviews existing work on object counting, especially work that utilised Inception, ResNet and their variants. It finds that previous works implemented structures similar to Inception, used older versions of Inception, combined Inception with other networks, and/or produced inferior results. Main contributions: using the Inception_v4 and Inception-Resnet-v1 building blocks, this project proposes three new models, which are trained, tested and shown to be robust across different use cases in crowd counting, vehicle counting and microscopic cell counting. The models exhibit low error rates and fast convergence, and can be trained with limited computational resources. The lowest mean absolute error (MAE) achieved is 7.8 for crowd, 1.5 for vehicle and 2.7 for cell counting.
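For readers unfamiliar with the metric, MAE in counting tasks is conventionally the average absolute difference between the predicted and ground-truth count per image. The sketch below illustrates that standard definition only; the function name and the sample counts are hypothetical and are not taken from the report.

```python
def mean_absolute_error(predicted_counts, true_counts):
    """Average absolute difference between predicted and true per-image counts."""
    if len(predicted_counts) != len(true_counts):
        raise ValueError("both lists must have one entry per image")
    if not true_counts:
        raise ValueError("at least one image is required")
    total = sum(abs(p - t) for p, t in zip(predicted_counts, true_counts))
    return total / len(true_counts)


# Hypothetical counts for three images (illustrative values, not project data).
example_mae = mean_absolute_error([10, 20, 33], [12, 20, 30])
```

Because each image contributes its absolute error, overestimates and underestimates do not cancel out.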
The robustness of the models is also tested on scenes outside the training datasets (for instance, Orchard Road pedestrians, the NTU North Spine canteen and LTA live traffic). In addition, this report covers other techniques used: density map generation using a Gaussian kernel and the application of a novel curriculum loss function. To test the usability of the trained model, a demonstrative web application is developed using Flask to retrieve live LTA traffic photos from the API every 60 seconds. The number of vehicles is estimated and displayed in real time. It takes 3-4 seconds to generate the model output for one image, which is well within the 60-second interval.
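Density map generation with a Gaussian kernel, mentioned above, is commonly done by placing a normalised 2D Gaussian at every annotated object location, so that the map sums to the object count. The following is a minimal sketch of that general technique, assuming a fixed kernel width; the function names, the `sigma` value and the sample points are hypothetical, and the report may use a different (for example, adaptive) kernel.

```python
import numpy as np


def gaussian_kernel(sigma, radius):
    """Return a (2*radius+1) x (2*radius+1) Gaussian kernel that sums to 1."""
    axis = np.arange(-radius, radius + 1)
    xx, yy = np.meshgrid(axis, axis)
    kernel = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()


def density_map(shape, points, sigma=4.0):
    """Build a density map by adding one unit-mass Gaussian per (row, col) point.

    sigma is an assumed fixed value chosen for illustration.
    """
    height, width = shape
    radius = max(1, int(3 * sigma))
    kernel = gaussian_kernel(sigma, radius)
    # Pad so that kernels centred near the border can be added without clipping.
    padded = np.zeros((height + 2 * radius, width + 2 * radius))
    for row, col in points:
        r, c = int(round(row)), int(round(col))
        if 0 <= r < height and 0 <= c < width:
            padded[r:r + 2 * radius + 1, c:c + 2 * radius + 1] += kernel
    # Cropping discards mass that fell outside the image, so the sum matches
    # the count exactly only when points are at least `radius` from the border.
    return padded[radius:-radius, radius:-radius]


# Three hypothetical annotations in a 64 x 64 image.
dm = density_map((64, 64), [(20, 20), (40, 30), (32, 50)])
estimated_count = dm.sum()
```

A counting network trained on such maps predicts a density map for a new image, and the estimated count is the sum over its pixels.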
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: SCSE Student Reports (FYP/IA/PA/PI)

Files in This Item:
Description: Restricted Access
Size: 3.73 MB
Format: Adobe PDF

Updated on May 19, 2022


Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.