Title: Distributed machine learning on public clouds
Authors: Lim, Ernest Woon Teng
Keywords: DRNTU::Engineering::Computer science and engineering
Issue Date: 2019
Abstract: Machine learning (ML) aims to construct predictive models from example input data. Conventional ML systems such as Caffe can achieve acceptable model training times on a single machine when dealing with a moderate amount of data. However, they may not be able to cope with very large training data sets, such as ImageNet and the Yahoo News Feed, which can contain hundreds of millions of records. Several distributed ML systems have been proposed to reduce model training time. However, the behavior of these systems on heterogeneous infrastructures such as public clouds, e.g., Amazon EC2, Google GCE, or Windows Azure, has not been thoroughly investigated. In this project, we examine the performance of popular distributed ML systems, such as Distributed TensorFlow and Horovod, on Amazon Web Services.
Rights: Nanyang Technological University
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: SCSE Student Reports (FYP/IA/PA/PI)

Files in This Item:
File: Restricted Access, 2.93 MB, Adobe PDF


Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.