Unsupervised face analysis from multi-view
Seyed Mohammad Hassan Anvar
Date of Issue: 2014
School of Electrical and Electronic Engineering
Institute for Infocomm Research (I2R)
Face detection, localization and recognition across multiple poses are among the most challenging topics in computer vision and pattern recognition. During the past decade, several methods have been proposed for face detection and recognition, but they mainly concentrate on frontal faces. Although some recent methods use sliding windows to search the image exhaustively over many scales and views, they are slow, and their performance on multi-view or multi-pose faces is not as good as on frontal faces. Training these methods is also a bottleneck. Most of them require thousands of positive and negative samples from different face poses, and these training images have to be manually cropped, aligned and labeled, which is labor-intensive and expensive. Because varying the face pose in front of a fixed camera has the same effect as changing the camera view of a fixed face pose, the terms "pose" and "view" are used interchangeably. In this research project, a probabilistic approach is proposed for face detection, localization and identification from multiple views. The proposed method does not require supervision during the training stage. The main focus of the project is on multi-view analysis and on unsupervised or automated learning. The proposed approach provides a unified framework to learn a multi-view model for face detection, localization and identification. For face detection and localization across multiple poses, given images of different people in multiple poses, it builds a multi-view face model from a constellation of corresponding face features in the training set. The model is then pruned so that only the most distinctive features are retained, and it is used to detect and localize multiple faces in multiple poses. Two versions of model construction are proposed for face detection and localization.
One requires manually labeling only two control points in a single image of the training set, regardless of the number of images in the training set. The other version is completely automatic: even the control points are estimated automatically, and the system only requires training images that include faces of different people from multiple views. The trade-off is that its accuracy is slightly lower than that of the manually labeled version. A further contribution addresses face identification and localization across multiple poses. The proposed method is completely automatic. Data collection is also performed by the system, using images obtained from a web search. It requires only the name of the queried person as text, which is used as input to the search engine. Using the images returned by the search engine, the system classifies these images and constructs a multi-view face model for the query. With this model, the system is then able to identify the queried person in a digital photo album or gallery. The user does not have to manually tag images or provide images to train the face recognition module. Because the system is completely automatic, it has a wide range of applications, including searching for people in news video, tagging actors and actresses in movies, and targeted advertising for interactive television (IPTV).
DRNTU::Engineering::Electrical and electronic engineering::Electronic systems::Signal processing