Trencsényi, Réka Eszter and Czap, László (2025) Artikulációs beszédszintézis megvalósítása dinamikus ultrahangfelvételek alapján [Implementation of articulatory speech synthesis based on dynamic ultrasound recordings]. BESZÉDTUDOMÁNY / SPEECH SCIENCE, 5 (1). pp. 90-116. ISSN 2732-3773
Text: 17316-Cikkszovege-76170-1-10-20250415.pdf - Published Version (1MB)
Abstract
Starting from 2D dynamic ultrasound sources recording the movement of the vocal organs and the speech signal of the speaker in a simultaneous and synchronised manner, we produce machine speech by means of artificial intelligence. As visual objects, we use tongue and palate contours fitted automatically to the anatomic boundaries of the ultrasound images, and for training, we extract geometric information from these contours, as the change of their shape fundamentally describes the movement of the vocal organs during articulation. The geometric data consist of radial distances between the tongue and palate contours and coefficients of the discrete cosine transform of the curves, respectively. Relying on this dataset, parameters connected to the acoustic content of the speech signal are trained by the network. These parameters can be interpreted in the framework of the acoustic tube model of the vocal tract, and according to this, reflection coefficients and areas of the articulation channel are to be trained. In this study, sentences are synthesised using linear predictive coding and the acoustic tube model.
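The abstract mentions two concrete representations: DCT coefficients of the tongue/palate contours, and reflection coefficients of the acoustic tube model. A minimal sketch of both, with entirely hypothetical sample data (the contour and area values below are not from the paper), might look like this; the reflection coefficients follow the standard Kelly-Lochbaum relation between adjacent tube-section areas:

```python
import numpy as np
from scipy.fft import dct, idct

# Hypothetical tongue contour: radial distances sampled at 64 fixed angles
# (stands in for a contour fitted to an ultrasound frame).
rng = np.random.default_rng(0)
angles = np.linspace(0.0, np.pi, 64)
contour = 2.0 + 0.5 * np.sin(angles) + 0.05 * rng.standard_normal(64)

# Compact DCT representation: keep only the first K coefficients,
# which capture the smooth overall shape of the curve.
K = 8
coeffs = dct(contour, norm="ortho")
compact = np.zeros_like(coeffs)
compact[:K] = coeffs[:K]
reconstructed = idct(compact, norm="ortho")

# Reflection coefficients from tube-section areas (Kelly-Lochbaum):
#   k_i = (A_{i+1} - A_i) / (A_{i+1} + A_i)
# Example areas are illustrative only.
areas = np.array([1.0, 1.5, 2.2, 1.8, 0.9])
k = (areas[1:] - areas[:-1]) / (areas[1:] + areas[:-1])
```

For a smooth contour, the truncated DCT reconstruction stays close to the original curve, and each reflection coefficient lies strictly inside (-1, 1), as required for a stable lattice synthesis filter.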
| Item Type: | Article |
|---|---|
| Subjects: | P Language and Literature / nyelvészet és irodalom > P0 Philology. Linguistics / filológia, nyelvészet |
| SWORD Depositor: | MTMT SWORD |
| Depositing User: | MTMT SWORD |
| Date Deposited: | 08 Jan 2026 09:14 |
| Last Modified: | 08 Jan 2026 09:14 |
| URI: | https://real.mtak.hu/id/eprint/231692 |