Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/54391
Title: Real-time sociofeedback from audio and video signals
Authors: Zhao, Xiaozhi.
Keywords: DRNTU::Engineering::Computer science and engineering
Issue Date: 2013
Abstract: The paper concentrates on speaker diarization of meeting recordings. Speaker diarization answers the question of “who spoke when”: it identifies who is speaking in the audio and when each person speaks. It has two main steps, speaker segmentation and clustering. Segmentation finds the speaker change points in the audio; clustering determines the number of speakers and when each of them speaks. We adopt the BIC (Bayesian information criterion) algorithm and three typical ICA (independent component analysis) algorithms as the experimental methods. BIC is used only for speaker segmentation, so its output is not labeled with speaker identities. ICA is combined with speaker activity detection to implement speaker diarization. We compare the performance of the two approaches in speaker segmentation. BIC performs slightly better than the ICA algorithms: its accuracy reaches 84.45%, whereas the error rates of the ICA algorithms AMUSE, JADE and FOBI are 27.4%, 18.5% and 19.6%, respectively.
URI: http://hdl.handle.net/10356/54391
Schools: School of Electrical and Electronic Engineering 
Rights: Nanyang Technological University
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: EEE Student Reports (FYP/IA/PA/PI)
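
Illustrative note on the abstract: it names BIC as the segmentation method but does not give a formula. As a minimal sketch, assuming the commonly used ΔBIC test with one full-covariance Gaussian per segment, change-point detection inside a single analysis window could look like the code below. The report's actual features, window sizes, and penalty weight are not stated in this record, so the function names and the `margin` and `penalty_weight` parameters are hypothetical.

```python
import numpy as np


def delta_bic(features, split, penalty_weight=1.0):
    """Delta-BIC for splitting `features` (frames x dims) at frame `split`.

    Positive values favour two Gaussians (a speaker change) over one.
    Assumption: one full-covariance Gaussian per segment.
    """
    n, d = features.shape

    def logdet_cov(x):
        # Small ridge keeps the covariance well conditioned (illustrative choice).
        cov = np.atleast_2d(np.cov(x, rowvar=False)) + 1e-6 * np.eye(d)
        return np.linalg.slogdet(cov)[1]

    left, right = features[:split], features[split:]
    gain = 0.5 * (n * logdet_cov(features)
                  - len(left) * logdet_cov(left)
                  - len(right) * logdet_cov(right))
    # Extra parameters of the two-model hypothesis: one mean + one covariance.
    n_params = d + d * (d + 1) / 2
    penalty = 0.5 * penalty_weight * n_params * np.log(n)
    return gain - penalty


def find_change_point(features, margin=20, penalty_weight=1.0):
    """Return (frame index, score) of the best split, or (None, score) if none passes."""
    n = features.shape[0]
    candidates = range(margin, n - margin)
    scores = [delta_bic(features, i, penalty_weight) for i in candidates]
    best = int(np.argmax(scores))
    if scores[best] > 0:
        return candidates[best], float(scores[best])
    return None, float(scores[best])
```

In practice the input would be per-frame acoustic features (for example MFCCs), and the test would be repeated over a sliding or growing window. As in the abstract, this yields only unlabeled change points; assigning segments to speakers is left to a separate clustering step.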

Files in This Item:
File: eW4259-122.pdf
Description: Restricted Access
Size: 1.78 MB
Format: Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.