Hollanders, Romain and Gerencsér, Balázs and Delvenne, Jean-Charles and Jungers, Raphaël M. (2016) Improved bound on the worst case complexity of Policy Iteration. Operations Research Letters, 44 (2). pp. 267-272. ISSN 0167-6377
Text: 1_s2.0_S0167637716000201_main_u.pdf - Published Version (422kB). Restricted to Repository staff only.
Text: 1410.7583.pdf (212kB)
Abstract
Solving Markov Decision Processes is a recurrent task in engineering which can be performed efficiently in practice using the Policy Iteration algorithm. Regarding its complexity, both lower and upper bounds are known to be exponential (but far apart) in the size of the problem. In this work, we provide the first improvement over the now-standard upper bound from Mansour and Singh (1999). We also show that this bound is tight for a natural relaxation of the problem.
Item Type: | Article |
---|---|
Uncontrolled Keywords: | Acyclic Unique Sink Orientation; Factored Markov decision process; Complexity; Policy iteration |
Subjects: | Q Science > QA Mathematics |
SWORD Depositor: | MTMT SWORD |
Depositing User: | MTMT SWORD |
Date Deposited: | 03 Jan 2017 17:26 |
Last Modified: | 03 Jan 2017 17:26 |
URI: | http://real.mtak.hu/id/eprint/44200 |