On discovering concept entities from web sites

dc.contributor.author Yin, Ming
dc.contributor.author Goh, Dion Hoe-Lian
dc.contributor.author Lim, Ee Peng
dc.date.accessioned 2009-10-02T01:34:21Z
dc.date.available 2009-10-02T01:34:21Z
dc.date.copyright 2005
dc.date.issued 2009-10-02T01:34:21Z
dc.identifier.citation Yin, M., Goh, D., & Lim, E. P. (2005). On discovering concept entities from web sites. Proceedings of the International Conference on Computational Science and its Applications 2005, ICCSA 2005 (May 9-12, Singapore), Lecture Notes in Computer Science 3481, 1177-1186.
dc.identifier.uri http://hdl.handle.net/10220/6122
dc.description.abstract A web site usually contains a large number of concept entities, each consisting of one or more web pages connected by hyperlinks. In order to discover these concept entities for more expressive web site queries and other applications, the web unit mining problem has been proposed. Web unit mining aims to determine web pages that constitute a concept entity and classify concept entities into categories. Nevertheless, the performance of an existing web unit mining algorithm, iWUM, suffers as it may create more than one web unit (incomplete web units) from a single concept entity. This paper presents a new web unit mining algorithm, kWUM, which incorporates site-specific knowledge to discover and handle incomplete web units by merging them together and assigning correct labels. Experiments show that the overall accuracy has been significantly improved.
dc.format.extent 12 p.
dc.language.iso en
dc.rights The original publication is available at www.springerlink.com.
dc.subject DRNTU::Engineering::Computer science and engineering::Computer systems organization::Computer-communication networks
dc.title On discovering concept entities from web sites
dc.type Conference Paper
dc.contributor.conference International Conference on Computational Science and its Applications (5th : 2005 : Singapore)
dc.contributor.school Wee Kim Wee School of Communication and Information
dc.identifier.doi http://dx.doi.org/10.1007/11424826_125
dc.description.version Accepted version

Files in this item

File Size Format
2005-concept-entities-iccsa.pdf 265.4 KB PDF

Total views

Item Views
On discovering concept entities from web sites 403

Total downloads

File Downloads
2005-concept-entities-iccsa.pdf 277

Top country downloads

Country Downloads
China 119
United States of America 76
Singapore 38
Russian Federation 10
France 7

Top city downloads

City Downloads
Beijing 84
Singapore 38
Mountain View 34
Bellevue 7
Redmond 5