REAL

Phonetic Analysis and Automatic Prediction of Vowel Duration in Hungarian Spontaneous Speech

Beke, András and Gósy, Mária (2014) Phonetic Analysis and Automatic Prediction of Vowel Duration in Hungarian Spontaneous Speech. INTELLIGENT DECISION TECHNOLOGIES : An International Journal, 8 (4). pp. 301-314. ISSN 1872-4981

Abstract

A large number of phonetic and phonological studies have analyzed segmental durations, focusing on the factors and interactions that determine them. The results often play an important role in language technology applications, for example in text-to-speech synthesis (TTS) and automatic speech recognition (ASR), and are widely used in infocommunication. Speech sound duration depends on various factors such as phonetic quality, phonological context, phonological position in the word or in the utterance, speech style, etc. The multifactorial dependence of vowel duration is more complex in languages where vowel length is a distinctive feature, as in Hungarian. The main goal of the present research was to analyze the physical durations of pairs of vowels in spontaneous speech that exhibit a phonological length opposition. In addition, we intended to develop an algorithm for the automatic classification of short and long vowels occurring in spontaneous speech. On the basis of these findings, we intended to predict vowel durations automatically using three different methods. The measured data confirmed our hypothesis that phonologically short and long vowels differ significantly in their physical durations in spontaneous speech. The results of the automatic vowel length classification also supported this finding. The third aspect of our investigation was to use different supervised learning methods to predict vowel duration, based on different feature vectors consisting of characteristic and spectral features. The best result was obtained when the combined features were used with a feed-forward neural network (FFNN): the correlation between the target and the predicted vowel duration was 0.79, while the root-mean-square error (RMSE) was 25 ms. On the one hand, the results confirm the complexity of the features affecting vowel duration; on the other hand, they indicate the temporal complexity of segments in spontaneous speech, as has been reported for Lithuanian, Czech, Hindi, Telugu and Korean.
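
The abstract reports two evaluation figures, the correlation between target and predicted durations and the RMSE in milliseconds. The following is a minimal sketch of how such figures are commonly computed, assuming the correlation is Pearson's r (the abstract does not say which coefficient was used). The function names and the duration values are hypothetical illustrations, not data or code from the paper.

```python
import math


def pearson_r(target, predicted):
    """Pearson correlation between two equal-length sequences (assumed metric)."""
    n = len(target)
    mean_t = sum(target) / n
    mean_p = sum(predicted) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(target, predicted))
    var_t = sum((t - mean_t) ** 2 for t in target)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_t * var_p)


def rmse(target, predicted):
    """Root-mean-square error, in the same unit as the inputs (here ms)."""
    n = len(target)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(target, predicted)) / n)


# Hypothetical vowel durations in milliseconds, for illustration only.
target_ms = [60.0, 85.0, 110.0, 140.0]
predicted_ms = [70.0, 80.0, 100.0, 150.0]

print(f"r = {pearson_r(target_ms, predicted_ms):.2f}")
print(f"RMSE = {rmse(target_ms, predicted_ms):.1f} ms")
```

The two measures are complementary: the correlation shows how well the predictions track the relative ordering and spread of the durations, while the RMSE expresses the typical size of the error in milliseconds.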

Item Type: Article
Subjects: P Language and Literature / nyelvészet és irodalom > P0 Philology. Linguistics / filológia, nyelvészet
P Language and Literature / nyelvészet és irodalom > PH Finno-Ugrian, Basque languages and literatures / finnugor és baszk nyelvek és irodalom > PH04 Hungarian language and literature / magyar nyelv és irodalom
Depositing User: Dávid Timár
Date Deposited: 22 Jan 2014 11:36
Last Modified: 22 May 2016 18:38
URI: http://real.mtak.hu/id/eprint/9060
