REAL

Does cross-validation work in telling rankings apart?

Sziklai, Balázs and Baranyi, Máté and Héberger, Károly (2024) Does cross-validation work in telling rankings apart? CENTRAL EUROPEAN JOURNAL OF OPERATIONS RESEARCH. ISSN 1435-246X (In Press)

s10100-024-00932-1.pdf - Published Version
Available under License Creative Commons Attribution.


Abstract

Although cross-validation (CV) is a standard technique in machine learning and data science, its efficacy remains largely unexplored in ranking environments. When evaluating the significance of differences, cross-validation is typically coupled with statistical testing, such as the Dietterich, Alpaydin, or Wilcoxon test. In this paper, we evaluate the power and false positive error rate of the Dietterich, Alpaydin, and Wilcoxon statistical tests combined with cross-validation, each operating with folds ranging from 5 to 10, resulting in a total of 18 variants. Our testing setup utilizes a ranking framework similar to the Sum of Ranking Differences (SRD) statistical procedure: we assume the existence of a reference ranking, and distances are measured in the $L_1$-norm. We test the methods under artificial scenarios as well as on real data borrowed from sports and chemistry. The choice of the optimal CV test method depends on preferences related to the minimization of type I and type II errors, the size of the input, and anticipated patterns in the data. Among the investigated input sizes, the Wilcoxon method with eight folds proved to be the most effective, although its performance in type I situations is subpar. While the Dietterich and Alpaydin methods excel in type I situations, they perform poorly in type II scenarios. The inadequate performance of these tests raises questions about their efficacy outside of ranking environments too.
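
To make the setup in the abstract concrete, the following is a minimal sketch of an SRD-style comparison, not the authors' implementation. It assumes that each solution is a list of scores over the same objects, that the distance is the sum of absolute rank differences from a reference ranking, and that each fold is left out in turn with the distance recomputed on the remaining objects. The function names (`ranks`, `l1_rank_distance`, `fold_distances`), the random fold assignment, and the toy data are all hypothetical.

```python
import random

def ranks(values):
    """Rank values from 1 (smallest) upward; ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def l1_rank_distance(scores, reference):
    """Sum of absolute rank differences between a solution and the reference."""
    return sum(abs(a - b) for a, b in zip(ranks(scores), ranks(reference)))

def fold_distances(scores, reference, k, seed=0):
    """Leave out each of k folds in turn and return the k resulting distances.

    Assumption: folds are drawn at random; the same seed yields the same
    folds for every solution, so the returned values are paired.
    """
    idx = list(range(len(reference)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::k] for f in range(k)]
    distances = []
    for fold in folds:
        left_out = set(fold)
        keep = [i for i in range(len(reference)) if i not in left_out]
        distances.append(
            l1_rank_distance([scores[i] for i in keep],
                             [reference[i] for i in keep])
        )
    return distances

# Toy data (hypothetical): ten objects and two solutions to compare.
reference = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
method_a = [1, 2, 3, 4, 5, 6, 7, 8, 10, 9]   # one adjacent swap
method_b = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]   # fully reversed

full_a = l1_rank_distance(method_a, reference)
full_b = l1_rank_distance(method_b, reference)
folds_a = fold_distances(method_a, reference, k=5)
folds_b = fold_distances(method_b, reference, k=5)
```

The paired per-fold values `folds_a` and `folds_b` are the kind of input a paired test would then compare, for example a Wilcoxon signed-rank test such as `scipy.stats.wilcoxon`. Which test and how many folds to use is exactly the question the paper studies.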

Item Type: Article
Uncontrolled Keywords: k-fold cross-validation · Rankings · Sum of ranking differences · Wilcoxon test · Alpaydin test · Leave-many-out · Multi-criteria decision-making
Subjects: H Social Sciences / társadalomtudományok > HA Statistics / statisztika
SWORD Depositor: MTMT SWORD
Depositing User: MTMT SWORD
Date Deposited: 18 Sep 2024 14:10
Last Modified: 18 Sep 2024 14:10
URI: https://real.mtak.hu/id/eprint/205187
