Frontiers of Computer Science

ISSN 2095-2228

ISSN 2095-2236(Online)

CN 10-1014/TP

Postal Subscription Code 80-970

2018 Impact Factor: 1.129

Front Comput Sci, 2013, Vol. 7, Issue 5: 754-766. https://doi.org/10.1007/s11704-013-2291-3
REVIEW ARTICLE
Reinforcement learning models for scheduling in wireless networks
Kok-Lim Alvin YAU1, Kae Hsiang KWONG2, Chong SHEN3
1. Faculty of Science and Technology, Sunway University, Selangor 46150, Malaysia; 2. R&D Department, Recovision, Selangor 47650, Malaysia; 3. College of Information Science and Technology, Hainan University, Haikou 570228, China
Abstract

The dynamicity of available resources and network conditions, such as channel capacity and traffic characteristics, has posed major challenges to scheduling in wireless networks. Reinforcement learning (RL) enables wireless nodes to observe their respective operating environments, learn, and make optimal or near-optimal scheduling decisions. Learning, the main intrinsic characteristic of RL, enables wireless nodes to adapt over time to most forms of dynamicity in the operating environment. This paper presents an extensive review of the application of traditional and enhanced RL approaches to various types of scheduling schemes in wireless networks, namely packet, sleep-wake, and task schedulers, as well as the advantages and performance enhancements brought about by RL. It also presents how various challenges associated with scheduling schemes have been approached using RL. Finally, we discuss open issues related to RL-based scheduling schemes in wireless networks in order to explore new research directions in this area. The discussion is presented in a tutorial manner in order to establish a foundation for further research in this field.

Keywords: reinforcement learning, scheduling, wireless networks
Corresponding Author(s): YAU Kok-Lim Alvin, Email: koklimy@sunway.edu.my
Issue Date: 01 October 2013
 Cite this article:   
Kok-Lim Alvin YAU, Kae Hsiang KWONG, Chong SHEN. Reinforcement learning models for scheduling in wireless networks[J]. Front Comput Sci, 2013, 7(5): 754-766.
 URL:  
https://academic.hep.com.cn/fcs/EN/10.1007/s11704-013-2291-3
https://academic.hep.com.cn/fcs/EN/Y2013/V7/I5/754