REAL

Speaker Adaptation Experiments with Limited Data for End-to-End Text-To-Speech Synthesis using Tacotron2

Mandeel, Ali Raheem and Al-Radhi, Mohammed Salah and Csapó, Tamás Gábor (2022) Speaker Adaptation Experiments with Limited Data for End-to-End Text-To-Speech Synthesis using Tacotron2. INFOCOMMUNICATIONS JOURNAL, 14 (3). pp. 55-62. ISSN 2061-2079


Abstract

Speech synthesis aims to generate human-like speech from text. With today's end-to-end systems, highly natural synthesized speech can be achieved if a large enough dataset is available from the target speaker. Often, however, it is necessary to adapt to a target speaker for whom only a few training samples are available. Speaker adaptation with limited data can be a difficult problem because of the very small number of training samples. A limited speaker dataset can cause issues such as the irregular allocation of linguistic tokens (i.e., some speech sounds are left out of the synthesized speech). To build lightweight systems, it is crucial to determine the minimum number of data samples and training epochs needed to reach reasonable quality. We conducted detailed experiments on speaker-adaptive text-to-speech (TTS) synthesis with four target speakers to show the performance of the end-to-end Tacotron2 model and the WaveGlow neural vocoder on an English dataset with varying amounts of training data and training lengths. According to our objective and subjective evaluations, the Tacotron2 model exhibits good performance in terms of speech quality and similarity for unseen target speakers with 100 sentences of data (text-audio pairs) and a relatively low training time.

Item Type: Article
Subjects: Q Science / természettudomány > QA Mathematics / matematika > QA75 Electronic computers. Computer science / számítástechnika, számítógéptudomány
SWORD Depositor: MTMT SWORD
Depositing User: MTMT SWORD
Date Deposited: 02 Dec 2022 09:33
Last Modified: 02 Dec 2022 09:33
URI: http://real.mtak.hu/id/eprint/154219
