Rácz, Anita and Gere, Attila (2025) Comparison of missing value imputation tools for machine learning models based on product development case studies. LWT-FOOD SCIENCE AND TECHNOLOGY, 221. No.-117585. ISSN 0023-6438
Text: LWT_paper.pdf (5MB) - Published Version. Available under License Creative Commons Attribution Non-commercial No Derivatives.
Abstract
Datasets with missing values occur frequently in product development projects for many reasons. However, several classical and machine learning based tools are available for imputing missing values instead of simply deleting valuable data. We compared these imputation algorithms on eight case studies with missing value ratios ranging from 0 to 0.5. The case studies comprised real-world (n = 4) and generated (n = 4) data sets. The machine learning models were developed with gradient boosting algorithms in a consistent way, and 25 different performance parameters were used for the statistical comparison. Models were created for each missing value ratio and each imputation method. Factorial ANOVA verified the superiority of the kNN algorithm as an imputation method for the real-world data sets, while the Bayes and lasso algorithms proved best for the generated data sets. It must be noted that the differences are less prominent in the case of 1% missing values. The random forest algorithm (in the mice package), on the other hand, is not recommended for the imputation of missing values.
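As a rough illustration of the kNN imputation idea named in the abstract, the sketch below masks a chosen ratio of cells and fills each gap with the mean of that column over the nearest rows. It is a minimal, standard-library-only sketch and not the authors' implementation: the function names `mask_values` and `knn_impute`, the distance definition (root mean squared difference over jointly observed features), the column-mean fallback, and the toy data are all assumptions made for this example.

```python
import math
import random


def mask_values(rows, ratio, seed=0):
    """Return a copy of rows with round(ratio * n_cells) cells set to None."""
    rng = random.Random(seed)
    cells = [(i, j) for i in range(len(rows)) for j in range(len(rows[0]))]
    n_missing = int(round(ratio * len(cells)))
    masked = [list(r) for r in rows]
    for i, j in rng.sample(cells, n_missing):
        masked[i][j] = None
    return masked


def _distance(a, b):
    """Root mean squared difference over features observed in both rows."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf  # no common feature, so the rows cannot be compared
    return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))


def knn_impute(rows, k=3):
    """Fill each None with the mean of its column over the k nearest donor rows."""
    imputed = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Donors are other rows that actually observe column j.
            donors = sorted(
                (_distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            )
            nearest = [v for d, v in donors[:k] if d != math.inf]
            if not nearest:
                # Assumed fallback: plain column mean when no row is comparable.
                nearest = [r[j] for r in rows if r[j] is not None]
            if nearest:
                imputed[i][j] = sum(nearest) / len(nearest)
    return imputed


# Toy data (hypothetical): two tight clusters, one missing cell in the first row.
data = [[1.0, 2.0], [1.1, 2.1], [5.0, 6.0], [5.1, 6.1]]
holes = [[1.0, None], [1.1, 2.1], [5.0, 6.0], [5.1, 6.1]]
filled = knn_impute(holes, k=1)      # the gap is filled from the closest row
masked = mask_values(data, ratio=0.5)  # half of the eight cells become None
```

In a study like the one described, the imputed table would then be passed to a gradient boosting model, and the procedure repeated for each missing value ratio and imputation method; that part is left out here to keep the sketch dependency-free.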
| Item Type: | Article |
|---|---|
| Uncontrolled Keywords: | Missing value, Imputation, Machine learning, Food, Gradient boosting, Classification |
| Subjects: | H Social Sciences / társadalomtudományok > HA Statistics / statisztika; Q Science / természettudomány > Q1 Science (General) / természettudomány általában |
| SWORD Depositor: | MTMT SWORD |
| Depositing User: | MTMT SWORD |
| Date Deposited: | 22 Sep 2025 12:07 |
| Last Modified: | 22 Sep 2025 12:07 |
| URI: | https://real.mtak.hu/id/eprint/224851 |