Online learning for search and classification
Nguyen, Thanh Tam
Date of Issue: 2013
School of Computer Engineering
Online learning is a common and useful tool for machine learning and data mining. In contrast to batch learning, online learning receives a sequence of training instances and processes only a few of them at a time. By its nature, online learning may process each training instance only once. Online learning algorithms can therefore work on datasets that exceed memory or disk capacity, as well as on streaming data. Moreover, in document classification, online linear learning has been shown to be much more efficient than non-linear learning in terms of training and testing time. As a result, online linear learning has recently become an active research topic. This thesis proposes a research framework that addresses search and classification problems using online linear learning approaches. Specifically, we propose online classification algorithms that work on multiple-view datasets and an online learning-to-rank algorithm that improves the accuracy of a search engine. The main research contributions are as follows. (i) Feature selection: we investigate a number of new supervised term weighting methods to improve the performance of text classification; (ii) Online classification: we propose several online learning algorithms that can be used for topic classification; (iii) Two-view online learning: we propose a two-view online learning algorithm that works on two-view datasets; (iv) Online learning-to-rank: for search engines, we propose an online learning-to-rank algorithm that learns a scoring function to re-rank search results.
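To make the single-pass, instance-at-a-time setting described in the abstract concrete, the following is a minimal sketch of an online linear classifier. It uses the classic perceptron update purely as an illustration; it is not one of the algorithms proposed in the thesis. The names `predict` and `train_online`, the sparse term-weight dictionary representation, and the toy documents are all assumptions made for this example.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical representation: a document is a sparse map from term to weight.
SparseVector = Dict[str, float]


def predict(weights: SparseVector, features: SparseVector) -> int:
    """Return +1 or -1 from the sign of the linear score w . x (ties go to +1)."""
    score = sum(weights.get(term, 0.0) * value for term, value in features.items())
    return 1 if score >= 0 else -1


def train_online(stream: Iterable[Tuple[SparseVector, int]]) -> Tuple[SparseVector, int]:
    """Make one pass over a stream of (features, label) pairs, labels in {+1, -1}.

    Each instance is seen once and then discarded, so the full dataset never
    has to fit in memory. Weights change only when a prediction is wrong
    (the perceptron rule, used here only as an illustrative stand-in).
    """
    weights: SparseVector = {}
    mistakes = 0
    for features, label in stream:
        if predict(weights, features) != label:
            mistakes += 1
            for term, value in features.items():
                weights[term] = weights.get(term, 0.0) + label * value
    return weights, mistakes


# Toy stream of labelled documents (invented for illustration only).
stream = [
    ({"goal": 1.0, "match": 1.0}, 1),
    ({"stock": 1.0, "market": 1.0}, -1),
    ({"goal": 1.0}, 1),
    ({"market": 1.0}, -1),
]
weights, mistakes = train_online(stream)
```

Because `train_online` accepts any iterable, the stream could equally be a generator that reads documents from disk or a network source, which is what allows this style of learner to handle data larger than memory.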
DRNTU::Engineering::Computer science and engineering::Computing methodologies::Document and text processing