Please use this identifier to cite or link to this item:
https://hdl.handle.net/10356/44678
Title: | Twitter data processing using hadoop |
Authors: | Khuc, Anh Tuan. |
Keywords: | DRNTU::Engineering::Computer science and engineering |
Issue Date: | 2011 |
Abstract: | Twitter has grown very rapidly since it was established. It is a useful tool for online users to share their information, thoughts, interests and activities with their friends. As of 2010, there were 200 million active users on Twitter, and it has become a rich source of information that truthfully reflects what is happening, and even subjective opinions about these events. The data on Twitter is valuable for researchers, market analysts and companies. On the other hand, it is not easy to process and filter out useful information from the huge amount of data available on Twitter. The objective of this project is to develop a fast and scalable application to collect, store and process data on Twitter. The application is written mainly in Java, using Hadoop’s Map-Reduce software framework, which enables it to run on a cluster of hundreds of computers. By the end of the project, the application is able to find Singapore-based users on Twitter, collect their tweets, analyze them and produce some statistics in the form of web pages. Later in this paper, some possibilities for improvement are also discussed. |
URI: | http://hdl.handle.net/10356/44678 |
Schools: | School of Computer Engineering |
Rights: | Nanyang Technological University |
Fulltext Permission: | restricted |
Fulltext Availability: | With Fulltext |
Appears in Collections: | SCSE Student Reports (FYP/IA/PA/PI) |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
SCE0098.pdf | Restricted Access | 539.92 kB | Adobe PDF | |
Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.