Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/153200
Full metadata record
DC Field | Value | Language
dc.contributor.author | Watcharasupat, Karn N. | en_US
dc.date.accessioned | 2021-11-16T02:07:17Z | -
dc.date.available | 2021-11-16T02:07:17Z | -
dc.date.issued | 2021 | -
dc.identifier.citation | Watcharasupat, K. N. (2021). Controllable music : supervised learning of disentangled representations for music generation. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/153200 | en_US
dc.identifier.uri | https://hdl.handle.net/10356/153200 | -
dc.description.abstract | Controllability, despite being a much-desired property of a generative model, remains an ill-defined concept that is difficult to measure. In the context of neural music generation, a controllable system often implies an intuitive interaction between human agents and the neural model, allowing the relatively opaque neural model to be steered by a human in a semantically understandable manner. In this work, we tackle controllable music generation in the raw audio domain, which has received significantly less attention than the symbolic domain. Specifically, we focus on controlling multiple continuous, potentially interdependent timbral attributes of a musical note using a variational autoencoder (VAE) framework, along with the groundwork research needed to support this goal. This work consists of three main parts. The first formulates the concept of controllability and how to evaluate the latent manifold of a deep generative model in the presence of multiple interdependent attributes. The second develops a composite latent space architecture for VAEs that allows the encoding of interdependent attributes while retaining an easily sampled, disentangled prior. Proof-of-concept work for the second part was performed on several standard vision disentanglement-learning datasets. Finally, the last part applies the composite latent space model to music generation in the raw audio domain and discusses the evaluation of the model against the criteria defined in the first part of this project. Given the relatively uncharted nature of controllable generation in the raw audio domain, this project provides foundational work for the evaluation of controllable generation as a whole, and a promising proof of concept for musical audio generation with timbral control using variational autoencoders. | en_US
dc.language.iso | en | en_US
dc.publisher | Nanyang Technological University | en_US
dc.relation | CY3001-211 | en_US
dc.subject | Engineering::Electrical and electronic engineering::Electronic systems::Signal processing | en_US
dc.subject | Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence | en_US
dc.title | Controllable music : supervised learning of disentangled representations for music generation | en_US
dc.type | Final Year Project (FYP) | en_US
dc.contributor.supervisor | Gan Woon Seng | en_US
dc.contributor.school | School of Electrical and Electronic Engineering | en_US
dc.description.degree | Bachelor of Engineering (Electrical and Electronic Engineering) | en_US
dc.contributor.organization | Center for Music Technology, Georgia Institute of Technology | en_US
dc.contributor.research | Centre for Information Sciences and Systems | en_US
dc.contributor.supervisor2 | Alexander Lerch | en_US
dc.contributor.supervisoremail | alexander.lerch@gatech.edu; EWSGAN@ntu.edu.sg | en_US
item.grantfulltext | restricted | -
item.fulltext | With Fulltext | -
Appears in Collections: EEE Student Reports (FYP/IA/PA/PI)
Files in This Item:
File | Description | Size | Format
OFYP_Final_Report.pdf (Restricted Access) | Final Report | 4.68 MB | Adobe PDF
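
The abstract describes a composite latent space architecture for a VAE, in which supervised, attribute-specific latent dimensions coexist with an easily sampled, disentangled prior. Purely as an illustration of that idea, below is a minimal PyTorch sketch of one way such a model and its training loss could be organized; the class name, dimensions, loss weights, and the particular supervised regularization term are assumptions made for the example, not the architecture actually used in the report.

# Illustrative sketch only, NOT the report's actual implementation: a VAE whose
# latent space is split into a few supervised, attribute-specific coordinates
# plus an unsupervised residual subspace. All names, dimensions, and loss
# weights below are hypothetical assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CompositeLatentVAE(nn.Module):
    def __init__(self, input_dim=1024, n_attrs=3, residual_dim=13, hidden=256):
        super().__init__()
        # Assumed layout: the first n_attrs latent coordinates are supervised
        # (one per timbral attribute); the remaining coordinates are free.
        self.n_attrs = n_attrs
        self.latent_dim = n_attrs + residual_dim
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * self.latent_dim),  # mean and log-variance
        )
        self.decoder = nn.Sequential(
            nn.Linear(self.latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, input_dim),
        )

    def forward(self, x):
        mu, logvar = self.encoder(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        return self.decoder(z), mu, logvar, z

def loss_fn(model, x, attrs, beta=1.0, gamma=10.0):
    """ELBO terms plus a supervised term tying each attribute coordinate to its
    (normalized) attribute label, encouraging a disentangled latent layout."""
    x_hat, mu, logvar, z = model(x)
    recon = F.mse_loss(x_hat, x)
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    supervised = sum(
        F.mse_loss(z[:, k], attrs[:, k]) for k in range(model.n_attrs)
    )
    return recon + beta * kl + gamma * supervised

Under this assumed layout, the first n_attrs coordinates of z can be set directly at generation time to steer the corresponding timbral attributes, while the residual coordinates are sampled from the standard normal prior.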