Frontiers of Mathematics in China

ISSN 1673-3452

ISSN 1673-3576(Online)

CN 11-5739/O1

Postal Subscription Code 80-964

2018 Impact Factor: 0.565

Front. Math. China    2022, Vol. 17 Issue (4) : 673-687    https://doi.org/10.1007/s11464-021-0944-3
RESEARCH ARTICLE
An average-value-at-risk criterion for Markov decision processes with unbounded costs
Qiuli LIU1, Wai-Ki CHING2, Junyu ZHANG3, Hongchu WANG1
1. School of Mathematical Sciences, South China Normal University, Guangzhou 510631, China
2. Advanced Modeling and Applied Computing Laboratory, Department of Mathematics, The University of Hong Kong, Hong Kong, China
3. School of Mathematics, Sun Yat-Sen University, Guangzhou 510275, China
Abstract

We study Markov decision processes under the average-value-at-risk (AVaR) criterion. The state and action spaces are Borel spaces, the costs are allowed to be unbounded from above, and the discount factors are state-action dependent. Under suitable conditions, we establish the existence of optimal deterministic stationary policies. We then apply our main results to a cash-balance model.
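
For readers unfamiliar with the criterion, the following is a minimal sketch of AVaR for a finite sample of losses. It assumes the standard Rockafellar-Uryasev representation AVaR_tau(X) = min_s { s + E[(X - s)^+] / (1 - tau) }; the function name `avar` and the sample data are hypothetical. The paper applies the criterion to the discounted cost of a Markov decision process, which this sketch does not model.

```python
def avar(losses, tau):
    """Empirical AVaR at level tau for a finite sample of losses.

    Assumes the representation min over s of s + E[(X - s)^+] / (1 - tau).
    This is an illustration, not the paper's construction.
    """
    if not 0.0 <= tau < 1.0:
        raise ValueError("tau must lie in [0, 1)")
    xs = list(losses)
    if not xs:
        raise ValueError("losses must be non-empty")
    n = len(xs)

    def objective(s):
        # s plus the scaled mean excess of the losses over the threshold s
        excess = sum(max(x - s, 0.0) for x in xs)
        return s + excess / (n * (1.0 - tau))

    # The objective is convex and piecewise linear with kinks at the sample
    # points, so its minimum is attained at one of them.
    return min(objective(s) for s in xs)
```

With losses `[1, 2, 3, 4]` and `tau = 0.5`, the function returns 3.5, the mean of the worst half of the sample. With `tau = 0` it reduces to the plain sample mean, 2.5.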

Keywords: Markov decision processes, average-value-at-risk (AVaR), state-action dependent discount factors, optimal policy
Corresponding Author(s): Junyu ZHANG   
Issue Date: 19 December 2022
 Cite this article:   
Qiuli LIU, Wai-Ki CHING, Junyu ZHANG, et al. An average-value-at-risk criterion for Markov decision processes with unbounded costs[J]. Front. Math. China, 2022, 17(4): 673-687.
 URL:  
https://academic.hep.com.cn/fmc/EN/10.1007/s11464-021-0944-3
https://academic.hep.com.cn/fmc/EN/Y2022/V17/I4/673