Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/137910
Full metadata record
dc.contributor.author: Kan, Shawn Jung Tze (en_US)
dc.date.accessioned: 2020-04-18T03:35:27Z
dc.date.available: 2020-04-18T03:35:27Z
dc.date.issued: 2020
dc.identifier.uri: https://hdl.handle.net/10356/137910
dc.description.abstract: There is a need to better understand how generalization works in deep learning models. The goal of this paper is to provide a clearer view into the black box that is a neural network. This is done by using information theory to compute the flow of information within the network. The proposed framework uses an indicator that computes the mutual information of all hidden layers within the deep learning model. The indicator represents the predictive capability of the neural network, and its evolution over training provides a further level of analysis of the network's generalization capability. By using information theory, we can express the flow of information within a previously opaque black box. The framework provides the capability to analyse a deep learning model: it is a conceptual platform where users can perform analysis using the functions provided, including computing mutual information, applying the indicator, and visualization. Experiments were conducted using different methods of computing mutual information to study their effects on deep learning models. We find that our indicator-based method overcomes the shortcomings of the non-linear information bottleneck objective function. Our method computes the average mutual information of all hidden layers, which produces a better estimate than the objective function, which computes the mutual information of only a single intermediate representation. One advantage of the proposed framework is that it is not restricted to a particular kind of neural network. Furthermore, it works with probability distribution functions, which means the framework does not rely on the presence of the actual dataset. The focus of this paper is the use, rather than the computation, of mutual information. Therefore, methods such as the non-linear information bottleneck, or a neural network trained to estimate the mutual information of given datasets, can be used to compute mutual information.
The framework also provides a general solution for observing the learning process of a neural network, and it is available at a public repository. (en_US)
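The indicator described in the abstract, averaging the mutual information over all hidden layers, could be sketched as follows. This is an illustrative reconstruction, not the report's implementation: the histogram-binning estimator, the function names, and the treatment of each layer as a one-dimensional activation summary paired with the labels are all assumptions made here for brevity.

```python
import numpy as np

def mutual_information(x, y, bins=10):
    """Plug-in estimate of I(X; Y) in nats, via 2-D histogram binning.

    A simple stand-in estimator; the report's framework allows other
    estimators (e.g. non-linear information bottleneck, or a trained
    neural estimator) to be substituted.
    """
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()              # empirical joint distribution
    px = pxy.sum(axis=1, keepdims=True)    # marginal of X (column vector)
    py = pxy.sum(axis=0, keepdims=True)    # marginal of Y (row vector)
    nz = pxy > 0                           # avoid log(0) on empty cells
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def layer_indicator(hidden_activations, labels, bins=10):
    """Indicator sketch: average mutual information over all hidden layers,
    rather than the MI of a single intermediate representation."""
    mis = [mutual_information(h, labels, bins) for h in hidden_activations]
    return sum(mis) / len(mis)
```

Under this sketch, a layer whose activations track the labels contributes a large term to the average, while an uninformative layer contributes close to zero, so the indicator summarizes the predictive capability of the network as a whole.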
dc.language.iso: en (en_US)
dc.publisher: Nanyang Technological University (en_US)
dc.relation: SCSE19-0092 (en_US)
dc.subject: Engineering::Computer science and engineering::Information systems (en_US)
dc.title: Using mutual information to evaluate the generalization capability of deep learning neural networks (en_US)
dc.type: Final Year Project (FYP) (en_US)
dc.contributor.supervisor: Althea Liang (en_US)
dc.contributor.school: School of Computer Science and Engineering (en_US)
dc.description.degree: Bachelor of Engineering (Computer Science) (en_US)
dc.contributor.supervisoremail: qhliang@ntu.edu.sg (en_US)
item.grantfulltext: restricted
item.fulltext: With Fulltext
Appears in Collections: SCSE Student Reports (FYP/IA/PA/PI)
Files in This Item:
Shawn Kan U1721495G Final Report.pdf (Restricted Access, 907.42 kB, Adobe PDF)

Page view(s): 199 (updated on Jan 30, 2023)
Download(s): 14 (updated on Jan 30, 2023)


Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.