REAL

Comparison of missing value imputation tools for machine learning models based on product development case studies

Rácz, Anita and Gere, Attila (2025) Comparison of missing value imputation tools for machine learning models based on product development case studies. LWT-FOOD SCIENCE AND TECHNOLOGY, 221. No.-117585. ISSN 0023-6438

LWT_paper.pdf - Published Version
Available under License Creative Commons Attribution Non-commercial No Derivatives.

Abstract

Datasets with missing values occur frequently in product development projects for many reasons. However, several classical and machine learning based tools are available for imputing missing values instead of simply deleting valuable data. We compared these imputation algorithms on eight case studies with missing value ratios ranging from 0 to 0.5. The case studies comprised real-world (n = 4) and generated (n = 4) data sets. The machine learning models were developed with gradient boosting algorithms in a consistent way, and 25 different performance parameters were used for the statistical comparison. A model was created for each combination of missing value ratio and imputation method. Factorial ANOVA confirmed the superiority of the kNN algorithm as an imputation method for the real-world data sets, while the Bayes and lasso algorithms proved best for the generated data sets. The differences are less prominent when 1% of the values are missing. The random forest algorithm (in the mice package), on the other hand, is not recommended for the imputation of missing values.
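
As an illustration of the two steps the abstract describes first (removing values at a chosen ratio, then imputing them with a kNN method), a minimal sketch in plain Python might look as follows. It is not the authors' implementation: the function names mask_values, distance and knn_impute, the default k = 3, and the fallback to a column mean are all assumptions made for this example. The downstream gradient boosting models and the 25 performance parameters are not covered.

```python
import math
import random


def mask_values(rows, ratio, seed=0):
    """Return a copy of `rows` with a fraction `ratio` of the cells set to None."""
    rng = random.Random(seed)
    cells = [(i, j) for i in range(len(rows)) for j in range(len(rows[0]))]
    hidden = set(rng.sample(cells, int(round(ratio * len(cells)))))
    return [[None if (i, j) in hidden else v for j, v in enumerate(row)]
            for i, row in enumerate(rows)]


def distance(a, b):
    """Euclidean distance over columns observed in both rows,
    scaled up to compensate for the columns that had to be skipped."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf  # no common column: the rows cannot be compared
    squared = sum((x - y) ** 2 for x, y in shared)
    return math.sqrt(squared * len(a) / len(shared))


def knn_impute(rows, k=3):
    """Fill each None with the mean of that column over the k nearest rows
    that have the column observed (hypothetical helper, assumed k = 3)."""
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            donors = sorted(
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            )
            nearest = [v for d, v in donors[:k] if d < math.inf]
            if not nearest:
                # Assumed fallback: mean of the column over all observed values.
                nearest = [o[j] for o in rows if o[j] is not None]
            if nearest:
                filled[i][j] = sum(nearest) / len(nearest)
    return filled
```

In a comparison like the one summarised above, one would call mask_values once per missing value ratio, apply each imputation method to the masked copy, and then train and score the same model on every imputed data set.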

Item Type: Article
Uncontrolled Keywords: Missing value, Imputation, Machine learning, Food, Gradient boosting, Classification
Subjects: H Social Sciences / társadalomtudományok > HA Statistics / statisztika
Q Science / természettudomány > Q1 Science (General) / természettudomány általában
SWORD Depositor: MTMT SWORD
Depositing User: MTMT SWORD
Date Deposited: 22 Sep 2025 12:07
Last Modified: 22 Sep 2025 12:07
URI: https://real.mtak.hu/id/eprint/224851
