Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/144846
Title: Differentially private deep learning for time series data
Authors: Dwitami, Inggriany
Keywords: Science::Mathematics::Statistics; Science::Mathematics::Discrete mathematics::Cryptography
Issue Date: 2020
Publisher: Nanyang Technological University
Abstract: Machine learning applications based on neural networks are becoming increasingly widespread. This raises questions about data subjects' privacy, as some of the data sets used may contain sensitive information. To address this, a new concept of privacy was formalized, namely Differential Privacy, which gave rise to various implementations that satisfy this privacy criterion in the context of machine learning, with Differentially Private Stochastic Gradient Descent (DP-SGD) being one of the most prominent. DP-SGD fulfills the privacy criterion by clipping per-example gradients and adding noise on top of the usual SGD algorithm. For the purposes of this experiment, time series data was chosen over image data because it is less computationally demanding. The UCR archive and the Medical Information Mart for Intensive Care (MIMIC-III) database are examples of publicly available data sets containing time series data that can be formulated as time series classification problems. The UCR archive encompasses a wide variety of subjects, while the MIMIC-III database focuses on Electronic Health Records (EHRs). For the latter, protecting patients' privacy is critical, hence making it suitable for differentially private training. In this paper, experiments were conducted on the UCR archive and the MIMIC-III database to evaluate the effect of DP-SGD on model performance, focusing particularly on Long Short-Term Memory (LSTM) and Fully Convolutional Neural Network (FCN) models. The results show that, in general, models trained without the differentially private optimizer tend to outperform those trained with it, which is expected since data utility is traded off for privacy. However, the difference in performance is sometimes small or even insignificant. Furthermore, the noise added by DP-SGD can also act as a regularizer that prevents overfitting. This paper recommends future work to further generalize these results, including providing publicly available benchmark data sets, incorporating more models, and comparing various differentially private frameworks.
URI: https://hdl.handle.net/10356/144846
Schools: School of Physical and Mathematical Sciences
Organisations: Institute for Infocomm Research (I2R), A*STAR
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
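The abstract describes DP-SGD as ordinary SGD augmented with per-example gradient clipping and noise addition. Below is a minimal sketch of one such update step, following the general DP-SGD recipe (Abadi et al., 2016); the clip norm, noise multiplier, and learning rate are illustrative assumptions, not values taken from the report.

```python
import numpy as np

def dp_sgd_step(params, per_example_grads, lr=0.01,
                l2_norm_clip=1.0, noise_multiplier=1.1, rng=None):
    """One DP-SGD update: clip each example's gradient to an L2 norm of
    at most l2_norm_clip, sum the clipped gradients, add Gaussian noise
    with std = noise_multiplier * l2_norm_clip, then average and step.
    All hyperparameter values here are illustrative assumptions."""
    if rng is None:
        rng = np.random.default_rng(0)
    # Scale each per-example gradient down so its L2 norm is <= l2_norm_clip.
    clipped = [g / max(1.0, np.linalg.norm(g) / l2_norm_clip)
               for g in per_example_grads]
    # Sum, then add noise calibrated to the clip norm (the sensitivity).
    noisy_sum = np.sum(clipped, axis=0) + rng.normal(
        0.0, noise_multiplier * l2_norm_clip, size=params.shape)
    # Average over the batch and take a standard gradient step.
    return params - lr * noisy_sum / len(per_example_grads)

# Toy usage: four per-example gradients for a three-parameter model.
params = np.zeros(3)
grads = [np.array([0.5, -1.2, 2.0]), np.array([3.0, 0.1, -0.4]),
         np.array([-0.2, 0.8, 0.3]), np.array([1.5, -0.5, 0.9])]
params = dp_sgd_step(params, grads)
```

In practice, libraries such as TensorFlow Privacy and Opacus provide DP optimizers that wrap this mechanism and additionally track the cumulative privacy budget spent during training.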
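The experiments focus on LSTM and FCN classifiers. As a sketch of the FCN side, below is a minimal Keras model in the three-block Conv1D configuration commonly used for univariate time series classification; the filter counts, kernel sizes, and input dimensions are illustrative assumptions and not necessarily the architecture used in the report.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_fcn(series_length, n_classes):
    """Illustrative FCN for univariate time series classification:
    three Conv1D blocks (conv -> batch norm -> ReLU) followed by
    global average pooling and a softmax output."""
    inputs = layers.Input(shape=(series_length, 1))
    x = inputs
    for filters, kernel in [(128, 8), (256, 5), (128, 3)]:
        x = layers.Conv1D(filters, kernel, padding="same")(x)
        x = layers.BatchNormalization()(x)
        x = layers.Activation("relu")(x)
    x = layers.GlobalAveragePooling1D()(x)
    outputs = layers.Dense(n_classes, activation="softmax")(x)
    return models.Model(inputs, outputs)

# Hypothetical usage for a binary classification task on length-96 series.
model = build_fcn(series_length=96, n_classes=2)
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```

To reproduce the report's differentially private setting, the `"adam"` optimizer above would be swapped for a DP optimizer such as the one sketched earlier.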
Appears in Collections: SPMS Student Reports (FYP/IA/PA/PI)
Files in This Item:

File | Description | Size | Format
---|---|---|---
FYP_U1740540J.pdf (Restricted Access) | FYP | 11.08 MB | Adobe PDF
Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.