An average-value-at-risk criterion for Markov decision processes with unbounded costs

doi:10.1007/s11464-021-0944-3

Front. Math. China, 2022, Vol. 17, Issue 4: 673-687. https://doi.org/10.1007/s11464-021-0944-3

RESEARCH ARTICLE

Qiuli LIU¹, Wai-Ki CHING², Junyu ZHANG³, Hongchu WANG¹

¹. School of Mathematical Sciences, South China Normal University, Guangzhou 510631, China
². Advanced Modeling and Applied Computing Laboratory, Department of Mathematics, The University of Hong Kong, Hong Kong, China
³. School of Mathematics, Sun Yat-Sen University, Guangzhou 510275, China


Abstract

We study Markov decision processes under the average-value-at-risk criterion. The state space and the action space are Borel spaces, the costs are allowed to be unbounded from above, and the discount factors are state-action dependent. Under suitable conditions, we establish the existence of optimal deterministic stationary policies. Furthermore, we apply our main results to a cash-balance model.
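
As a rough illustration of the criterion named in the abstract (and not the paper's own construction), the average-value-at-risk of a cost X at level τ in (0, 1) can be written in the Rockafellar-Uryasev form AVaR_τ(X) = min over s of { s + E[(X − s)⁺] / (1 − τ) }. The sketch below evaluates this formula on a finite sample of costs. The function name `empirical_avar` and the equal weighting of samples are assumptions made for the example; in the setting of the abstract, X would be a total discounted cost generated by a policy.

```python
def empirical_avar(costs, tau):
    """Empirical AVaR at level tau of equally weighted cost samples.

    Hypothetical helper for illustration only. It evaluates
        s + mean(max(x - s, 0)) / (1 - tau)
    at every sample point s and returns the smallest value. The objective
    is convex and piecewise linear in s with breakpoints at the samples,
    so for 0 < tau < 1 its minimum is attained at one of them.
    """
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    xs = list(costs)
    if not xs:
        raise ValueError("costs must be non-empty")
    n = len(xs)

    def objective(s):
        # Expected excess of the cost over the threshold s.
        excess = sum(max(x - s, 0.0) for x in xs) / n
        return s + excess / (1.0 - tau)

    return min(objective(s) for s in xs)
```

For the ten equally likely costs 1, ..., 10 and τ = 0.8, the function returns approximately 9.5, which is the mean of the two largest costs. This brute-force evaluation takes quadratic time in the sample size; it is meant to make the formula concrete, not to be efficient.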

Keywords: Markov decision processes; average-value-at-risk (AVaR); state-action dependent discount factors; optimal policy

Corresponding Author(s): Junyu ZHANG

Issue Date: 19 December 2022

Cite this article:

Qiuli LIU, Wai-Ki CHING, Junyu ZHANG, et al. An average-value-at-risk criterion for Markov decision processes with unbounded costs[J]. Front. Math. China, 2022, 17(4): 673-687.

URL:

https://academic.hep.com.cn/fmc/EN/10.1007/s11464-021-0944-3
https://academic.hep.com.cn/fmc/EN/Y2022/V17/I4/673

