Changes in the results of voice biometric systems using different technologies in case of different speech tasks and voice sample lengths

Fejes, Attila and Sztahó, Dávid (2025) Changes in the results of voice biometric systems using different technologies in case of different speech tasks and voice sample lengths. BESZÉDTUDOMÁNY / SPEECH SCIENCE, 5 (2). pp. 132-153. ISSN 2732-3773

17937-Cikk szövege-82514-2-10-20251013.pdf - Published Version

Abstract

During forensic speaker comparison, the audio forensics expert appointed to perform the investigation works with audio recordings of different types and durations. Differences in speech type and sample duration affect the probability data. To evaluate biometric identification results, the probability value of the data obtained must be determined so that the expert’s report is accurate and can be interpreted by other actors in the public proceedings. In the present study, the speech samples of 78 speakers from the forensic voice sample database were compared within the framework of the FORENSICSpeech research project (Beke et al., 2020). The samples include three different types of speech: spontaneous, read, and narration speech. The recording of the samples was repeated after an average of two weeks, and the audio files were then automatically cut into segments of 20, 40, 60, 80, 100, and 120 seconds. The aim of this study is to show how different speech styles and durations affect voice biometric identification results. Results show that EER (Equal Error Rate), FRR (False Reject Rate), and Cllr (log-likelihood-ratio cost) values decrease with increasing duration; however, in the 20–120-second range, the change is not continuous. The lowest EER, FRR, Cllr, and Cllr-min values occur in the case of spontaneous speech, followed by narration, while the speech samples of information exchange give the highest Cllr values. Overall, the data show that the more advanced i-vector method tends to provide more efficient person identification with lower error rates.
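
For readers unfamiliar with the metrics named in the abstract, the following is a minimal sketch of how EER and Cllr are commonly computed from comparison scores. It is not the authors' implementation: the function names `cllr` and `eer`, the use of natural-log likelihood ratios as input, and the simple threshold sweep are all assumptions made for illustration.

```python
import math


def cllr(target_llrs, nontarget_llrs):
    """Log-likelihood-ratio cost (Cllr).

    Assumes inputs are natural-log likelihood ratios:
    target_llrs from same-speaker comparisons,
    nontarget_llrs from different-speaker comparisons.
    """
    # Penalty for same-speaker trials: log2(1 + 1/LR)
    target_cost = sum(math.log2(1 + math.exp(-x)) for x in target_llrs)
    target_cost /= len(target_llrs)
    # Penalty for different-speaker trials: log2(1 + LR)
    nontarget_cost = sum(math.log2(1 + math.exp(x)) for x in nontarget_llrs)
    nontarget_cost /= len(nontarget_llrs)
    return 0.5 * (target_cost + nontarget_cost)


def eer(target_scores, nontarget_scores):
    """Approximate Equal Error Rate by sweeping observed scores as thresholds.

    A trial is accepted when its score is >= the threshold. The function
    returns the mean of FAR and FRR at the threshold where they are closest.
    """
    best_gap, best_rate = None, None
    for thr in sorted(set(target_scores) | set(nontarget_scores)):
        # False Reject Rate: same-speaker trials scored below the threshold
        frr = sum(s < thr for s in target_scores) / len(target_scores)
        # False Accept Rate: different-speaker trials at or above the threshold
        far = sum(s >= thr for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_rate = gap, (far + frr) / 2
    return best_rate
```

Under these assumptions, a system that always outputs a likelihood ratio of 1 (no information) has a Cllr of 1, and lower values of both metrics indicate better performance, which is the sense in which the abstract speaks of values decreasing with longer samples.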

Item Type: Article
Subjects: P Language and Literature / nyelvészet és irodalom > P0 Philology. Linguistics / filológia, nyelvészet
SWORD Depositor: MTMT SWORD
Depositing User: MTMT SWORD
Date Deposited: 15 Oct 2025 07:46
Last Modified: 15 Oct 2025 07:46
URI: https://real.mtak.hu/id/eprint/226252