Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/97953
Title: Twevent : segment-based event detection from tweets
Authors: Li, Chenliang
Sun, Aixin
Datta, Anwitaman
Keywords: DRNTU::Engineering::Computer science and engineering
Issue Date: 2012
Source: Li, C., Sun, A., & Datta, A. (2012). Twevent: Segment-based event detection from tweets. Proceedings of the 21st ACM international conference on Information and knowledge management.
Abstract: Event detection from tweets is an important task for understanding the current events and topics that attract a large number of common users. However, the unique characteristics of tweets (e.g., short and noisy content, diverse and fast-changing topics, and large data volume) make event detection a challenging task. Most existing techniques, proposed for well-written documents (e.g., news articles), cannot be directly adopted. In this paper, we propose a segment-based event detection system for tweets, called Twevent. Twevent first detects bursty tweet segments as event segments and then clusters the event segments into events, considering both their frequency distribution and content similarity. More specifically, each tweet is split into non-overlapping segments (i.e., phrases that possibly refer to named entities or semantically meaningful information units). The bursty segments are identified within a fixed time window based on their frequency patterns, and each bursty segment is described by the set of tweets that contain the segment and were published within that time window. The similarity between a pair of bursty segments is computed using their associated tweets. After clustering bursty segments into candidate events, Wikipedia is exploited to identify the realistic events and to derive the most newsworthy segments to describe the identified events. We evaluate Twevent and compare it with the state-of-the-art method using 4.3 million tweets published by Singapore-based users in June 2010. In our experiments, Twevent outperforms the state-of-the-art method by a large margin in terms of both precision and recall. More importantly, the events detected by Twevent can be easily interpreted with little background knowledge because of the newsworthy segments. We also show that Twevent is efficient and scalable, making it a desirable solution for event detection from tweets.
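Illustration (not part of the original record): the abstract's two middle steps, flagging bursty segments within a time window and comparing two bursty segments through their associated tweets, can be sketched roughly as below. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the binomial z-score test, the `z_threshold` parameter, and the use of Jaccard overlap as the similarity measure are all hypothetical stand-ins chosen for brevity, and the tweets are assumed to be already split into segments.

```python
import math
from collections import defaultdict


def bursty_segments(window_tweets, baseline_prob, z_threshold=2.0):
    """Return {segment: set of tweet indices} for segments that look bursty.

    window_tweets: list of sets; each set holds the segments of one tweet
                   published in the current time window (assumed input).
    baseline_prob: dict mapping a segment to the assumed long-run probability
                   that a tweet contains it (hypothetical background model).
    A segment is flagged when its observed count exceeds the binomial
    expectation by at least z_threshold standard deviations.
    """
    n = len(window_tweets)
    tweets_by_segment = defaultdict(set)
    for tweet_id, segments in enumerate(window_tweets):
        for seg in segments:
            tweets_by_segment[seg].add(tweet_id)

    bursty = {}
    for seg, ids in tweets_by_segment.items():
        if seg not in baseline_prob:
            continue  # no background estimate, so burstiness is undefined here
        p = baseline_prob[seg]
        mean = n * p
        std = math.sqrt(n * p * (1.0 - p))
        if std > 0 and (len(ids) - mean) / std >= z_threshold:
            bursty[seg] = ids
    return bursty


def segment_similarity(tweets_a, tweets_b):
    """Jaccard overlap of the tweet sets describing two segments
    (a simple stand-in for a content-based similarity)."""
    union = tweets_a | tweets_b
    return len(tweets_a & tweets_b) / len(union) if union else 0.0


# Toy window of 10 tweets, already segmented (made-up data).
window = [
    {"world cup"}, {"world cup"},
    {"world cup", "south africa"}, {"world cup", "south africa"},
    {"world cup", "south africa", "lunch"}, {"world cup", "south africa"},
    {"lunch"}, {"traffic"}, {"traffic"}, {"weather"},
]
baseline = {"world cup": 0.05, "south africa": 0.05, "lunch": 0.2, "traffic": 0.2}

events = bursty_segments(window, baseline)
similarity = segment_similarity(events["world cup"], events["south africa"])
```

Segments with high pairwise similarity would then be grouped into candidate events; the clustering and the Wikipedia-based filtering mentioned in the abstract are left out of this sketch.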
URI: https://hdl.handle.net/10356/97953
http://hdl.handle.net/10220/12305
DOI: http://dx.doi.org/10.1145/2396761.2396785
Rights: © 2012 ACM.
Fulltext Permission: none
Fulltext Availability: No Fulltext
Appears in Collections: SCSE Conference Papers

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.