Title: Make the computer talk
Authors: Chan, Tai Tat
Keywords: DRNTU::Engineering::Electrical and electronic engineering
Issue Date: 2016
Abstract: Speech synthesis is an area of artificial intelligence technology in which the computer is able to talk. Text-to-Speech (TTS) synthesis is the part of speech synthesis technology that converts text to speech through methods such as articulatory, formant and concatenative synthesis. Concatenative synthesis is one of the most popular TTS methods because it produces a more human-like sound. Pre-recorded speech segments are concatenated, and the pitch and duration of the output are modified according to its suprasegmental features. Suprasegmental features represent the emotions and the meaning between words and sentences. With the help of a grammar model, the grammatical structure of a sentence can be determined, which is a great aid in applying suprasegmental features to speech signals. Finally, modifying the pitch and duration of a speech signal belongs to the field of speech processing. There are many different algorithms for pitch-mark detection; an algorithm based on fundamental frequency and enveloping is developed and discussed. The PSOLA method and the intelligibility of its output are discussed, and a simple algorithm to improve the intelligibility of a speech signal undergoing pitch and duration modification is also developed and discussed.
Rights: Nanyang Technological University
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: EEE Student Reports (FYP/IA/PA/PI)

Files in This Item:
File Description Size Format
Restricted Access 10.57 MB Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.