Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/70507
Full metadata record
DC Field | Value | Language
dc.contributor.author | Nandi, Shuvam |
dc.date.accessioned | 2017-04-26T03:21:20Z |
dc.date.available | 2017-04-26T03:21:20Z |
dc.date.issued | 2017 |
dc.identifier.uri | http://hdl.handle.net/10356/70507 |
dc.description.abstract | The decline of Moore's law has led to a fundamental shift in the design of microprocessor architectures. Devices with parallel processing architectures, such as GPUs, FPGAs and DSPs, which were initially used only for dedicated tasks, are now gaining popularity as accelerators for more general-purpose computations. These devices achieve their performance by massively parallelising tasks across their compute units. CUDA and OpenCL are two application programming interface (API) models used to program parallel devices. The long-term objective of this project is the design of a hypothetical network of multiple processors capable of running applications in parallel. OpenCL is used to facilitate performance comparison because it is a cross-compatible framework across multiple heterogeneous platforms. The first half of this report examines the performance of several computing devices. A simple matrix multiplication kernel was executed with different mappings of the kernel onto the devices. This was followed by profiling a more complex application that recognises handwritten digits from the MNIST database. Performance in GOPS (giga-operations per second) was computed from the measured execution times and from an analysis of the number of computations performed in the application. The second half of this project investigates free ISAs for implementing a processor as the core unit of the hypothetical engine. RISC-V was selected and studied because it provides several extensions to its base integer instruction set, thereby supporting computationally intensive tasks. An existing processor implementation is examined, followed by the development of a new implementation based on RV32IM. | en_US
dc.format.extent | 87 p. | en_US
dc.language.iso | en | en_US
dc.rights | Nanyang Technological University |
dc.subject | DRNTU::Engineering::Computer science and engineering::Computer systems organization::Processor architectures | en_US
dc.subject | DRNTU::Engineering::Computer science and engineering::Hardware::Register-transfer-level implementation | en_US
dc.subject | DRNTU::Engineering::Computer science and engineering::Computing methodologies::Pattern recognition | en_US
dc.subject | DRNTU::Engineering::Computer science and engineering::Computer systems organization::Performance of systems | en_US
dc.title | Understanding and profiling a convolutional neural network application on different computing platforms using OpenCL | en_US
dc.type | Final Year Project (FYP) | en_US
dc.contributor.supervisor | Douglas Leslie Maskell | en_US
dc.contributor.school | School of Computer Science and Engineering | en_US
dc.description.degree | Bachelor of Engineering (Computer Engineering) | en_US
item.grantfulltext | restricted | -
item.fulltext | With Fulltext | -
Appears in Collections: SCSE Student Reports (FYP/IA/PA/PI)
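
The abstract states that performance in GOPS was derived from execution timings and a count of the computations performed. The following is a minimal sketch of that arithmetic for a naive matrix multiplication, not the report's own method: it assumes each inner-loop step counts as two operations (one multiply and one add), and the function names, matrix size and timing are hypothetical values chosen for illustration.

```python
def matmul_op_count(m: int, n: int, k: int) -> int:
    """Operation count for a naive (m x k) by (k x n) matrix product.

    Assumption: each inner-loop step counts as one multiply plus one add.
    """
    return 2 * m * n * k


def gops(op_count: int, seconds: float) -> float:
    """Throughput in giga-operations per second."""
    if seconds <= 0:
        raise ValueError("execution time must be positive")
    return op_count / seconds / 1e9


# Hypothetical figures, not taken from the report:
# a 1024 x 1024 square product with a measured kernel time of 50 ms.
ops = matmul_op_count(1024, 1024, 1024)
throughput = gops(ops, 0.050)
print(f"{ops} operations, {throughput:.2f} GOPS")
```

In an OpenCL setting the `seconds` argument would typically come from event profiling timestamps or host-side timers; which of these the report used is not stated in this record.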
Files in This Item:
File | Description | Size | Format
Nandi Shuvam - Amended Final Year Project Report.pdf (Restricted Access) | Final Year Project Report - Nandi Shuvam | 4.51 MB | Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.