Please use this identifier to cite or link to this item:
https://hdl.handle.net/10356/176030
Title: | Music visualization with deep learning |
Authors: | Kumar, Neel |
Keywords: | Computer and Information Science |
Issue Date: | 2024 |
Publisher: | Nanyang Technological University |
Source: | Kumar, N. (2024). Music visualization with deep learning. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/176030 |
Abstract: | Music visualization offers a unique way to experience music beyond listening alone. While dynamic visualizations are the status quo, our research found that static visualizations can also convey complex musical concepts. Moreover, with advances in artificial intelligence and deep learning making it easier than ever to generate imagery through technologies such as DALL·E and Stable Diffusion, this study investigates their potential for generating static abstract visualizations of music, aiming to represent higher-level features such as mode, timbre, and symbolism. Leveraging recent technical advances, particularly transformer-based neural networks, this study explores a novel approach that combines music and natural language processing to create visual signatures reflecting the essence and emotional content of musical compositions. The findings demonstrate the model's capability to produce visually compelling and aesthetically pleasing representations of music, highlighting the underutilized potential of static visualizations in capturing complex musical attributes, and identify scope for future improvement. Finally, the effectiveness of this approach was evaluated to test the hypothesis and the usefulness of the results. Several practical applications for such visualizations were identified, including enhancements to live and recorded performances, educational tools, therapeutic aids, and artistic entertainment, among others. While the results show promise, they underscore the need for refinement and further exploration to fully unlock the potential of this technology.
Ultimately, the ability of this technology to create cross-modal understanding, capturing both general patterns and nuanced details, will determine its effectiveness in reshaping the intersection of audio and visual experiences. |
URI: | https://hdl.handle.net/10356/176030 |
Schools: | School of Computer Science and Engineering |
Fulltext Permission: | restricted |
Fulltext Availability: | With Fulltext |
Appears in Collections: | SCSE Student Reports (FYP/IA/PA/PI) |
Files in This Item:
File | Description | Size | Format
---|---|---|---
NeelKumar_MusicVisualization.pdf (Restricted Access) | FYP Poster | 558.25 kB | Adobe PDF
MusicVisualization_NeelKumar_finalReport.pdf (Restricted Access) | FYP Report | 1.89 MB | Adobe PDF
Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.