Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/44678
Title: Twitter data processing using hadoop
Authors: Khuc, Anh Tuan.
Keywords: DRNTU::Engineering::Computer science and engineering
Issue Date: 2011
Abstract: Twitter has grown very quickly since it was established. It is a useful tool for online users to share information, thoughts, interests and activities with their friends. As of 2010, there were 200 million active users on Twitter, and it has become a rich source of information that truthfully reflects what is happening, and even subjective opinions about these events. The data on Twitter is valuable to researchers, market analysts and companies. On the other hand, it is not easy to process the huge amount of data available on Twitter and filter useful information out of it. The objective of this project is to develop a fast and scalable application to collect, store and process data on Twitter. The application is written mainly in Java, using Hadoop’s MapReduce software framework, which enables it to run on a cluster of hundreds of computers. By the end of the project, the application is able to find Singapore-based users on Twitter, collect their tweets, analyze them and produce some statistics in the form of web pages. Later in this paper, some possible improvements are also discussed.
URI: http://hdl.handle.net/10356/44678
Schools: School of Computer Engineering 
Rights: Nanyang Technological University
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: SCSE Student Reports (FYP/IA/PA/PI)

Files in This Item:
File: SCE0098.pdf (Restricted Access)
Size: 539.92 kB
Format: Adobe PDF


Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.