A stable actor-critic algorithm for solving robotic tasks with multiple constraints
Peiyao ZHAO, Fei ZHU, Quan LIU, Xinghong LING
School of Computer Science and Technology, Soochow University, Suzhou 215006, China
Corresponding Author(s): Fei ZHU
Just Accepted Date: 07 September 2022
Issue Date: 05 December 2022