Full metadata record
DC Field | Value | Language
dc.contributor.author | Goh, Ming Rui. |
dc.description.abstract | As reliance on computer systems increases, so do the complexity of those systems and the size of their data. To maintain the efficiency of systems and enhance their scalability, different optimization techniques can be employed. This project looks into the locality of reference of applications, in the hope of optimizing performance by keeping data in faster memory such as caches. It uses the Linux blktrace and blkparse utilities, which capture the block input/output traces of different software applications. The analysis is performed on the Hadoop framework, which connects computer systems so that they can execute tasks in parallel. The preliminary stage of the analysis dealt with familiarization with the blktrace and blkparse utilities. Since blktrace captures every block input/output trace that occurs in the system during a given period, it is essential to filter out all but the traces relevant to the analysis. In the process of analyzing the data, several different approaches were taken to retrieve and represent the results with increasing accuracy. Because the size of a file was inconsistent with the block input/output read for it, different file systems were also analyzed to verify this observation. The results show that the current method of filtering the block input/output traces of a specific program includes overheads that make the trace larger than the original file. Analysis of the wordcount function of Hadoop shows that its file accesses exhibit spatial locality. Most subsequent block accesses are relatively fast, in the range of 1-4 milliseconds. The analysis of the Database Test Suite-2 shows that MySQL has a random access pattern in its block I/O accesses. | en_US
dc.format.extent | 91 p. | en_US
dc.rights | Nanyang Technological University |
dc.subject | DRNTU::Engineering::Computer science and engineering::Computer systems organization::Performance of systems | en_US
dc.title | Collecting and analyzing I/O patterns for data intensive applications | en_US
dc.type | Final Year Project (FYP) | en_US
dc.contributor.school | School of Computer Engineering | en_US
dc.description.degree | Bachelor of Engineering (Computer Science) | en_US
dc.contributor.supervisor2 | He Bingsheng | en_US
item.fulltext | With Fulltext | -
Appears in Collections: SCSE Student Reports (FYP/IA/PA/PI)
Files in This Item:
File | Description | Size | Format
Restricted Access | | 2.38 MB | Adobe PDF

Download(s): 5 (updated on Nov 30, 2020)

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.