REAL

Regeneration of Ultrasound Tongue Images using Tongue Position Values towards Articulatory-to-Acoustic Mapping

Ibrahimov, Ibrahim and Gosztolya, Gábor and Zainkó, Csaba (2025) Regeneration of Ultrasound Tongue Images using Tongue Position Values towards Articulatory-to-Acoustic Mapping. In: 3rd Workshop on Intelligent Infocommunication Networks, Systems and Services. Budapest University of Technology and Economics, Budapest, pp. 33-38. ISBN 9789634219828

Abstract

Silent Speech Interfaces (SSI) facilitate communication for individuals unable to speak aloud by leveraging articulatory data. Ultrasound Tongue Imaging (UTI) is a widely adopted, non-invasive method for capturing tongue movements in real time, making it valuable for Articulation-to-Speech (ATS) synthesis. However, challenges such as probe misalignment, incomplete tongue images, and inter-speaker variability can affect data quality. To address these issues, recent advancements in deep learning, including neural vocoders and Tacotron2-based models, have improved ATS synthesis by enhancing naturalness and intelligibility. This study explores an alternative approach by regenerating ultrasound tongue images from tongue position values using DeepLabCut, aiming to refine articulatory-to-acoustic mapping (AAM). Two different regeneration methods are investigated, and their effectiveness is assessed through visual and objective evaluations, including Structural Similarity (SSIM) and Mean Squared Error (MSE) metrics. While the regenerated images exhibit high SSIM scores, discrepancies in articulatory detail affect AAM performance, highlighting the need for further refinement. These findings contribute to the development of more robust UTI-based speech synthesis systems, with potential applications in assistive communication and articulatory training.

Item Type: Book Section
Uncontrolled Keywords: regeneration, ultrasound tongue imaging, DeepLabCut, articulatory-to-acoustic mapping
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
T Technology > TK Electrical engineering. Electronics. Nuclear engineering
SWORD Depositor: MTMT SWORD
Depositing User: MTMT SWORD
Date Deposited: 04 Feb 2026 15:15
Last Modified: 04 Feb 2026 15:15
URI: https://real.mtak.hu/id/eprint/233307
