Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/178350
Title: A simple and efficient approach to unsupervised instance matching and its application to linked data of power plants
Authors: Eibeck, Andreas; Zhang, Shaocong; Lim, Mei Qi; Kraft, Markus
Keywords: Engineering
Issue Date: 2024
Source: Eibeck, A., Zhang, S., Lim, M. Q. & Kraft, M. (2024). A simple and efficient approach to unsupervised instance matching and its application to linked data of power plants. Journal of Web Semantics, 80, 100815. https://dx.doi.org/10.1016/j.websem.2024.100815
Project: CREATE 
Journal: Journal of Web Semantics
Abstract: Knowledge graphs store and link semantically annotated data about real-world entities from a variety of domains and on a large scale. The World Avatar is based on a dynamic decentralised knowledge graph and on semantic technologies to realise complex cross-domain scenarios. Accurate computational results for such scenarios require the availability of complete, high-quality data. This work focuses on instance matching, one of the subtasks of automatically populating the knowledge graph with data from a wide spectrum of external sources. Instance matching compares two data sets and seeks to identify instances (data, records) referring to the same real-world entity. We introduce AutoCal, a new instance matcher which does not require labelled data and runs out of the box for a wide range of domains without tuning method-specific parameters. AutoCal achieves results competitive with recently proposed unsupervised matchers from the field of Machine Learning. We also select an unsupervised state-of-the-art matcher from the field of Deep Learning for a thorough comparison. Our results show that neither AutoCal nor the state-of-the-art matcher is superior regarding matching quality, while AutoCal has only moderate hardware requirements and runs 2.7 to 60 times faster. In summary, AutoCal is particularly well suited to be used in an automated environment. We present its prototypical integration into the World Avatar and apply AutoCal to the domain of power plants, which is relevant for practical environmental scenarios of the World Avatar.
URI: https://hdl.handle.net/10356/178350
ISSN: 1570-8268
DOI: 10.1016/j.websem.2024.100815
Schools: School of Chemical and Biomedical Engineering 
Organisations: Cambridge Centre for Advanced Research and Education in Singapore
Rights: © 2024 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
Fulltext Permission: open
Fulltext Availability: With Fulltext
Appears in Collections: SCBE Journal Articles

Files in This Item:
File: 1-s2.0-S1570826824000015-main.pdf
Size: 1.61 MB
Format: Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.