Please use this identifier to cite or link to this item:
Title: Optimization for efficient data communication in distributed machine training system
Authors: Gan, Hsien Yan
Keywords: DRNTU::Engineering::Computer science and engineering
Issue Date: 2017
Abstract: The rising trend of deep learning has caused the complexity and scale of machine learning models to grow rapidly, but this growth is limited by hardware processing speed. To address the issue, several machine learning frameworks support distributed training across multiple nodes. Compared with inter-process communication, data exchange between nodes is relatively slow, with high latency and high overhead. When the network link is shared among multiple nodes, the limited bandwidth becomes an even greater bottleneck. This project minimizes the data flow between nodes by adding a data filter and Snappy compression: the filter removes unnecessary data transfers, while Snappy compresses the remaining data to reduce bandwidth consumption. This implementation reduces the data flow to 8 percent and the training time to 76 percent of the original. Because of the low bandwidth requirement, distributed training across different geographical areas and on hardware such as mobile laptops becomes possible.
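The filter-then-compress pipeline described in the abstract can be sketched roughly as follows. This is an illustrative reconstruction, not the report's actual code: the function names and the magnitude threshold are assumptions, and Python's standard-library `zlib` stands in for the Snappy library named in the abstract (Snappy requires a third-party binding such as `python-snappy`, but exposes a similar compress/decompress interface).

```python
import ast
import zlib  # stand-in for Snappy; the project used the Snappy codec


def filter_and_compress(update, threshold=1e-3):
    """Drop near-zero values from a model update (the 'data filter'),
    then compress the surviving (index, value) pairs before sending
    them over the network. Threshold and names are illustrative."""
    filtered = [(i, v) for i, v in enumerate(update) if abs(v) >= threshold]
    payload = repr(filtered).encode("utf-8")
    return zlib.compress(payload)


def decompress_update(blob):
    """Inverse step on the receiving node: decompress and recover
    the sparse (index, value) pairs."""
    return ast.literal_eval(zlib.decompress(blob).decode("utf-8"))
```

Filtering before compressing matters here: removing redundant entries first shrinks the payload the codec sees, so the two stages compound to give the large reduction in inter-node traffic the abstract reports.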
Rights: Nanyang Technological University
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections:SCSE Student Reports (FYP/IA/PA/PI)

Files in This Item:
File: final report.pdf (Restricted Access), 2.84 MB, Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.