Fundamental limitations of online supervised learning in dynamic control loops

Pintye, István and Kovács, József and Lovas, Róbert (2025) Fundamental limitations of online supervised learning in dynamic control loops. In: Proceedings of the International Conference on Formal Methods and Foundations of Artificial Intelligence. Eszterházy Károly Katolikus Egyetem Líceum Kiadó, Eger, pp. 161-173. ISBN 9789634963035

Abstract

In conventional supervised learning of neural networks, training samples are selected either randomly or in a predefined order, assuming independence across samples. This paper diverges from that setting by embedding the learning process within a dynamic control system. Specifically, we consider a discrete-time control system whose output is given by a nonlinear mapping that dynamically adjusts the number of virtual machines (VMs) based on workload characteristics such as CPU, memory, and network usage. The system’s output is determined by a neural network that estimates the deviation from a target utilization profile. In online supervised learning embedded in feedback control, data generation is shaped by model performance, leading to a narrowing of the observed input distribution over time. This self-induced sampling bias may reduce model robustness, stability, and adaptability. We demonstrate that simple periodic perturbations to the VM allocation process act as an effective form of regularization, improving learning robustness without relying on external reward or replay mechanisms. Unlike traditional approaches using fixed training sets, in our formulation the system operates online: at each time step, multiple candidate control inputs u[k] ∈ U are evaluated, each yielding a predicted output y[k] = f(x[k]). At each step, the controller selects the action that minimizes the predicted deviation from the desired reference, which then determines the next state x[k + 1] and yields the next training sample for the neural network. As a result, the learning trajectory is not predetermined but is dynamically created by the controller’s actions, which in turn depend on the network’s current predictions. We show how this closed-loop interaction between prediction and sample selection influences learning stability, convergence, and input-space coverage in an online setting.
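The closed-loop sampling process described in the abstract can be sketched as follows. This is a minimal illustrative toy, not the authors' implementation: the model, the dynamics `plant`, the label function `true_deviation`, the candidate set `U`, and all parameter values are assumptions made for the sketch. It shows the key structure — greedy action selection against the model's own predictions, with an optional periodic random perturbation that widens input-space coverage.

```python
import random

class DeviationModel:
    """Toy linear model standing in for the paper's neural network f,
    which predicts the deviation from a target utilization profile."""
    def __init__(self, dim, lr=0.01):
        self.w = [0.0] * dim
        self.lr = lr

    def predict(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x))

    def update(self, x, target):
        # One online SGD step on the squared prediction error.
        err = self.predict(x) - target
        for i, xi in enumerate(x):
            self.w[i] -= self.lr * err * xi

def plant(x, u):
    # Hypothetical discrete-time dynamics: next workload state after
    # adjusting the VM count by u (illustrative only).
    return [0.9 * xi + 0.1 * u for xi in x]

def true_deviation(x, target=1.0):
    # Hypothetical ground-truth deviation from the target utilization.
    return sum(x) / len(x) - target

def run_loop(steps=200, U=(-1, 0, 1), perturb_every=10, seed=0):
    rng = random.Random(seed)
    model = DeviationModel(dim=3)
    x = [rng.random() for _ in range(3)]
    for k in range(steps):
        if perturb_every and k % perturb_every == 0:
            # Periodic perturbation: occasionally take a random action,
            # the regularization mechanism the paper studies.
            u = rng.choice(U)
        else:
            # Greedy selection: candidate u[k] minimizing the model's
            # own predicted deviation |f(x[k+1])|.
            u = min(U, key=lambda ui: abs(model.predict(plant(x, ui))))
        x = plant(x, u)        # controller action determines x[k+1]
        y = true_deviation(x)  # observed deviation becomes the label
        model.update(x, y)     # online supervised update on (x[k+1], y)
    return model, x

model, x = run_loop()
```

Setting `perturb_every=0` in this sketch removes the perturbation and leaves the data stream entirely model-driven, which is exactly the self-induced narrowing of the input distribution the abstract discusses.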

Item Type: Book Section
Additional Information: International Conference on Formal Methods and Foundations of Artificial Intelligence, Eger, June 5–7, 2025
Uncontrolled Keywords: online learning, closed-loop control, neural networks, cloud resource allocation, distributional shift, adaptive systems
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Depositing User: Tibor Gál
Date Deposited: 30 Oct 2025 13:22
Last Modified: 30 Oct 2025 14:35
URI: https://real.mtak.hu/id/eprint/227755
