REAL

Copula-Based Anomaly Scoring and Localization for Large-Scale, High-Dimensional Continuous Data

Horváth, Gábor and Kovács, Edith Alice and Molontay, Roland and Nováczki, Szabolcs (2020) Copula-Based Anomaly Scoring and Localization for Large-Scale, High-Dimensional Continuous Data. ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 11 (3). ISSN 2157-6904

1912.02166.pdf
Available under License Creative Commons Attribution.


Abstract

The anomaly detection method presented in this paper has a distinctive feature: it not only indicates whether an observation is anomalous but also identifies what makes an anomalous observation unusual. Hence, it helps localize the cause of the anomaly. The proposed approach is model-based; it relies on the multivariate probability distribution associated with the observations. Since rare events lie in the tails of probability distributions, we use copula functions, which are able to model fat-tailed distributions well. The presented procedure scales well; it can cope with a large number of high-dimensional samples. Furthermore, our procedure can also cope with missing values, which occur frequently in high-dimensional data sets. In the second part of the paper, we demonstrate the usability of the method through a case study in which we analyze a large data set consisting of the performance counters of a real mobile telecommunication network. Since such networks are complex systems, the signs of sub-optimal operation can remain hidden for a potentially long time. With the proposed procedure, many such hidden issues can be isolated and indicated to the network operator.
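The idea of scoring an observation and then localizing the cause can be illustrated with a small sketch. This is not the paper's algorithm: the abstract does not specify the copula construction, so the sketch assumes a plain Gaussian copula as a stand-in (which, unlike the copulas the abstract alludes to, does not capture tail dependence). All function names (`normal_score`, `fit`, `score`) and the synthetic data are hypothetical and chosen for illustration only; the code uses only the Python standard library.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def normal_score(column, value):
    """Map value to a standard-normal score via the empirical CDF of column."""
    n = len(column)
    rank = sum(1 for x in column if x <= value)
    # Clamp the rank so the probability stays strictly inside (0, 1).
    u = min(max(rank, 0.5), n - 0.5) / n
    return STD_NORMAL.inv_cdf(u)


def correlation(rows):
    """Pearson correlation matrix of the columns of rows."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    cov = [[sum((r[i] - mean[i]) * (r[j] - mean[j]) for r in rows) / n
            for j in range(d)] for i in range(d)]
    sd = [math.sqrt(cov[i][i]) for i in range(d)]
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(d)] for i in range(d)]


def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting (adequate for small matrices)."""
    d = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(d)]
           for i, row in enumerate(matrix)]
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(d):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[d:] for row in aug]


def fit(train):
    """Assumed Gaussian copula: keep the marginal samples and the precision matrix."""
    columns = [list(c) for c in zip(*train)]
    z_rows = [[normal_score(columns[j], row[j]) for j in range(len(columns))]
              for row in train]
    return columns, invert(correlation(z_rows))


def score(model, x):
    """Return (anomaly score, per-feature localization scores) for observation x."""
    columns, precision = model
    d = len(x)
    z = [normal_score(columns[j], x[j]) for j in range(d)]
    pz = [sum(precision[i][j] * z[j] for j in range(d)) for i in range(d)]
    # Negative log copula density up to an additive constant: 0.5 * z^T (P - I) z.
    total = 0.5 * (sum(a * b for a, b in zip(z, pz)) - sum(a * a for a in z))
    # Standardized residual of each feature conditioned on all the others.
    parts = [abs(pz[i]) / math.sqrt(precision[i][i]) for i in range(d)]
    return total, parts


# Synthetic data: three features that all follow one latent signal.
rng = random.Random(0)
train = []
for _ in range(500):
    t = rng.gauss(0.0, 1.0)
    train.append([t + rng.gauss(0.0, 0.3) for _ in range(3)])

model = fit(train)
typical_total, _ = score(model, [0.5, 0.5, 0.5])
# Each value is unremarkable on its own, but the third disagrees with the rest.
odd_total, odd_parts = score(model, [1.0, 1.0, -1.0])
suspect = max(range(3), key=lambda i: odd_parts[i])
print(typical_total, odd_total, suspect)
```

In this sketch the overall score flags the observation, while the per-feature conditional residuals point at the feature that is most inconsistent with the others. Under the Gaussian assumption, missing values could be handled by dropping the corresponding rows and columns of the correlation matrix before inversion, though that step is omitted here.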

Item Type: Article
Subjects: Q Science / természettudomány > QA Mathematics / matematika
SWORD Depositor: MTMT SWORD
Depositing User: MTMT SWORD
Date Deposited: 29 Jan 2024 14:17
Last Modified: 29 Jan 2024 14:18
URI: http://real.mtak.hu/id/eprint/186581
