A hardware accelerated object recognition system using event-based image sensor
Date of Issue: 2015
School of Electrical and Electronic Engineering
The goal of this research is to explore the design and implementation of a hardware-accelerated, highly efficient engine for the categorization of objects using an asynchronous event-based image sensor. The image sensor, namely the Dynamic Vision Sensor (DVS), is equipped with temporal difference processing hardware. It outputs data as a binary event stream, in which 1 stands for a pixel on a moving object and 0 represents a still background pixel. The proposed system fully utilizes the precise timing information in the output of the DVS. The asynchronous nature of this system frees computation and communication from the rigid clock timing of typical systems. Neuromorphic processing is used to extract cortex-like spike-based features through an event-driven MAX-like convolution network.
Real-time, lightweight object recognition systems are becoming increasingly important, in particular in a number of emerging applications: Unmanned Aerial Vehicles (UAVs), where a recognition system can detect and avoid obstacles; e-health applications such as human activity categorization; and automotive applications such as active car collision avoidance systems, to name a few. Much prior research has focused on this problem, mainly using traditional frame-based image sensors running at a high frame rate. Approaches such as background subtraction provide the skeleton of the foreground object, followed by various techniques to represent the features in the image; finally, statistical regression and classification algorithms produce raw recognition results. However, due to algorithm complexity and the large quantity of frame-based image data, these algorithms have to be carried out on powerful computers. This limits wider use of such systems, not to mention their high power consumption, mass and volume. In addition, the output of a conventional frame-based image sensor contains a great deal of redundant data for image processing.
Usually the output data can be represented as a three-layer matrix. The colour information and the redundant background greatly slow down later processing. In fact, the very first step of many image processing algorithms is to remove as much of the background as possible. For example, the Phantom Miro 3 high-speed camera from Vision Research, with a resolution of 1920×1080 and a frame rate of 2570 frames per second, produces 5.3 GB of data per second and consumes around 12 W of power, whereas the second-generation DVS with 2048×2048 resolution outputs only 10 MB of data per second. Unlike conventional frame-based cameras, the event-based silicon retina, or DVS, performs temporal differencing in hardware within the sensor chip to reduce output data.
In this thesis we propose a seamlessly integrated, bio-inspired, event-driven object recognition system. The system is tailored to a frame-free asynchronous Address Event Representation (AER) vision sensor, from which it receives and processes the binary event stream. Each event, in the form (address, time), is sent to a bank of Difference of Gaussian (DoG) filters and convolved in parallel. Leaky integrate-and-fire neuron models are used to model the dynamic membrane potential and firing status. Each neuron competes with others within its receptive field. The extracted spike patterns are then classified by an online sequential extreme learning machine with a lookup table. Using the lookup table, the system can be made virtually fully connected by physically activating only a very small subset of the classification network. Experimental results show that the proposed system trains very quickly while maintaining a competitive level of accuracy.
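The event-driven neuron update described above can be illustrated with a minimal Python sketch. This is not the thesis implementation: the class name `LIFNeuron`, the exponential leak, the time constant `tau`, the threshold, the reset-to-zero rule and the fixed input weight are all assumptions chosen for illustration. In the proposed system the weight would presumably come from the DoG kernel value at the event's offset within the receptive field.

```python
import math


class LIFNeuron:
    """Hypothetical leaky integrate-and-fire neuron updated only when an event arrives."""

    def __init__(self, tau=20e-3, threshold=1.0):
        self.tau = tau              # assumed membrane time constant, in seconds
        self.threshold = threshold  # assumed firing threshold
        self.potential = 0.0        # membrane potential
        self.last_time = None       # timestamp of the previous input event

    def on_event(self, t, weight):
        """Process one input event at time t (seconds); return True if the neuron fires."""
        if self.last_time is not None:
            # Apply the leak for the interval since the last event.
            # No clock ticks are needed between events.
            self.potential *= math.exp(-(t - self.last_time) / self.tau)
        self.last_time = t
        # Integrate the contribution of this event.
        self.potential += weight
        if self.potential >= self.threshold:
            self.potential = 0.0    # assumed reset rule after a spike
            return True
        return False


# Events 1 ms apart accumulate faster than they leak, so the third one fires.
dense = LIFNeuron()
dense_spikes = [dense.on_event(t, 0.4) for t in (0.0, 0.001, 0.002)]

# Events 100 ms apart leak away almost completely between arrivals.
sparse = LIFNeuron()
sparse_spikes = [sparse.on_event(t, 0.4) for t in (0.0, 0.1, 0.2)]
```

Because the leak is computed from the time elapsed between events, the neuron does work only when the sensor reports activity, which is the property that makes precise event timing useful without a global clock.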
The main contributions of this thesis can be summarized in three aspects: (1) a fully event-driven, biologically inspired, hardware-friendly classification system; (2) the seamless integration of online sequential learning into an AER classification system; (3) the use of a lookup table to achieve a virtually fully connected system by physically activating only a very small subset of the classification network.
DRNTU::Engineering::Electrical and electronic engineering::Integrated circuits