dc.contributor.author: Li, Chenliang
dc.contributor.author: Sun, Aixin
dc.contributor.author: Datta, Anwitaman
dc.date.accessioned: 2013-07-25T08:19:36Z
dc.date.available: 2013-07-25T08:19:36Z
dc.date.copyright: 2012 [en_US]
dc.date.issued: 2012
dc.identifier.citation: Li, C., Sun, A., & Datta, A. (2012). Twevent: Segment-based event detection from tweets. Proceedings of the 21st ACM international conference on Information and knowledge management. [en_US]
dc.identifier.uri: http://hdl.handle.net/10220/12305
dc.description.abstract: Event detection from tweets is an important task to understand the current events/topics attracting a large number of common users. However, the unique characteristics of tweets (e.g. short and noisy content, diverse and fast changing topics, and large data volume) make event detection a challenging task. Most existing techniques proposed for well written documents (e.g. news articles) cannot be directly adopted. In this paper, we propose a segment-based event detection system for tweets, called Twevent. Twevent first detects bursty tweet segments as event segments and then clusters the event segments into events considering both their frequency distribution and content similarity. More specifically, each tweet is split into non-overlapping segments (i.e. phrases possibly refer to named entities or semantically meaningful information units). The bursty segments are identified within a fixed time window based on their frequency patterns, and each bursty segment is described by the set of tweets containing the segment published within that time window. The similarity between a pair of bursty segments is computed using their associated tweets. After clustering bursty segments into candidate events, Wikipedia is exploited to identify the realistic events and to derive the most newsworthy segments to describe the identified events. We evaluate Twevent and compare it with the state-of-the-art method using 4.3 million tweets published by Singapore-based users in June 2010. In our experiments, Twevent outperforms the state-of-the-art method by a large margin in terms of both precision and recall. More importantly, the events detected by Twevent can be easily interpreted with little background knowledge because of the newsworthy segments. We also show that Twevent is efficient and scalable, leading to a desirable solution for event detection from tweets. [en_US]
dc.language.iso: en [en_US]
dc.rights: © 2012 ACM. [en_US]
dc.subject: DRNTU::Engineering::Computer science and engineering
dc.title: Twevent : segment-based event detection from tweets [en_US]
dc.type: Conference Paper
dc.contributor.conference: International conference on Information and knowledge management (21st : 2012 : Maui, USA) [en_US]
dc.contributor.school: School of Computer Engineering [en_US]
dc.identifier.doi: http://dx.doi.org/10.1145/2396761.2396785


Files in this item

There are no files associated with this item.