Extracting Threshold Conceptual Structures from Web Documents
Date of Issue: 2014
International Conference on Conceptual Structures, ICCS (21st:2014:Iaşi, Romania)
School of Computer Engineering
In this paper we describe an iterative approach, based on formal concept analysis, to refining the information retrieval process. Based on the weights used for ranking documents, we define a weighted formal context. We use a Galois connection to introduce a new type of formal concept that allows us to work with specific thresholds when searching for words in Web documents. By increasing the threshold, we obtain smaller lattices with more relevant concepts, thus improving the retrieval of more specific items. We use techniques for processing large data sets in parallel to generate sequences of Galois lattices, overcoming the time complexity of building a lattice for an entire large context.
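The idea of a threshold applied to a weighted formal context can be illustrated with a minimal Python sketch. It is not the construction defined in the paper: it assumes the simplest reading, in which a (document, term) pair is kept when its weight is at least the threshold, and the concepts of the resulting binary context are computed with the standard derivation operators. The toy weights and all function names (`threshold_context`, `common_terms`, `docs_with_terms`, `concepts`) are hypothetical, and the exhaustive enumeration is exponential in the number of documents, so it is suitable only for very small examples.

```python
from itertools import combinations


def threshold_context(weights, threshold):
    """Binarize a weighted context: keep terms whose weight is >= threshold.

    Assumption: this cut is an illustration, not the paper's definition.
    """
    return {
        doc: {term for term, w in terms.items() if w >= threshold}
        for doc, terms in weights.items()
    }


def common_terms(context, docs, all_terms):
    """Derivation operator: terms shared by every document in `docs`."""
    shared = set(all_terms)
    for doc in docs:
        shared &= context[doc]
    return shared


def docs_with_terms(context, terms):
    """Derivation operator: documents that contain every term in `terms`."""
    return {doc for doc, doc_terms in context.items() if terms <= doc_terms}


def concepts(context):
    """Enumerate all formal concepts by closing every subset of documents."""
    all_terms = set().union(*context.values())
    found = set()
    docs = sorted(context)
    for size in range(len(docs) + 1):
        for subset in combinations(docs, size):
            intent = common_terms(context, subset, all_terms)
            extent = docs_with_terms(context, intent)
            found.add((frozenset(extent), frozenset(intent)))
    return found


# Hypothetical term weights for three documents.
weights = {
    "d1": {"lattice": 0.9, "concept": 0.6, "web": 0.2},
    "d2": {"lattice": 0.7, "concept": 0.3, "web": 0.8},
    "d3": {"lattice": 0.1, "concept": 0.8, "web": 0.5},
}

low = concepts(threshold_context(weights, 0.5))
high = concepts(threshold_context(weights, 0.75))
print(len(low), len(high))
```

On this toy data the higher threshold yields fewer concepts, in line with the behaviour the abstract describes. Under a plain binarization this is not guaranteed for arbitrary weights, so the example should be read as an illustration only.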
© 2014 Springer International Publishing Switzerland. This is the author created version of a work that has been peer reviewed and accepted for publication by Proceedings of the 21st International Conference on Conceptual Structures, Lecture Notes in Computer Science, Springer. It incorporates referee’s comments but changes resulting from the publishing process, such as copyediting, structural formatting, may not be reflected in this document. The published version is available at: [http://dx.doi.org/10.1007/978-3-319-08389-6_12].