Please wait a minute...
Frontiers of Computer Science

ISSN 2095-2228

ISSN 2095-2236(Online)

CN 10-1014/TP

Postal Subscription Code 80-970

2018 Impact Factor: 1.129

Front. Comput. Sci.    2018, Vol. 12 Issue (5) : 939-949    https://doi.org/10.1007/s11704-017-6315-2
RESEARCH ARTICLE
Achieving data-driven actionability by combining learning and planning
Qiang LV(), Yixin CHEN, Zhaorong LI, Zhicheng CUI, Ling CHEN, Xing ZHANG, Haihua SHEN()
1. College of Information Engineering, Yangzhou University, Yangzhou 225127, China
2. Department of Computer Science and Engineering,Washington University in St. Louis, St. Louis MO 63130, USA
3. School of Management, Fudan University, Shanghai 200433, China
4. School of Computer and Control Engineering, University of Chinese Academy of Science, Beijing 100049, China
 Download: PDF(437 KB)  
 Export: BibTeX | EndNote | Reference Manager | ProCite | RefWorks
Abstract

A main focus of machine learning research has been improving the generalization accuracy and efficiency of prediction models. However, what emerges as missing in many applications is actionability, i.e., the ability to turn prediction results into actions. Existing effort in deriving such actionable knowledge is few and limited to simple action models while in many real applications those models are often more complex and harder to extract an optimal solution. In this paper, we propose a novel approach that achieves actionability by combining learning with planning, two core areas of AI. In particular, we propose a framework to extract actionable knowledge from random forest, one of the most widely used and best off-the-shelf classifiers. We formulate the actionability problem to a sub-optimal action planning (SOAP) problem, which is to find a plan to alter certain features of a given input so that the random forest would yield a desirable output, while minimizing the total costs of actions. Technically, the SOAP problem is formulated in the SAS+ planning formalism, and solved using a Max-SAT based approach. Our experimental results demonstrate the effectiveness and efficiency of the proposed approach on a personal credit dataset and other benchmarks. Our work represents a new application of automated planning on an emerging and challenging machine learning paradigm.

Keywords actionable knowledge extraction      machine learning      planning      random forest     
Corresponding Author(s): Qiang LV,Haihua SHEN   
Just Accepted Date: 05 January 2017   Online First Date: 06 March 2018    Issue Date: 21 September 2018
 Cite this article:   
Qiang LV,Yixin CHEN,Zhaorong LI, et al. Achieving data-driven actionability by combining learning and planning[J]. Front. Comput. Sci., 2018, 12(5): 939-949.
 URL:  
https://academic.hep.com.cn/fcs/EN/10.1007/s11704-017-6315-2
https://academic.hep.com.cn/fcs/EN/Y2018/V12/I5/939
1 Mitchell T M. Machine learning and data mining. Communications of the ACM, 1999, 42(11): 30–36
https://doi.org/10.1145/319382.319388
2 Bailey T C, Chen Y X, Mao Y, Lu C Y, Hackmann G, Micek S T, Heard K M, Faulkner K M, Kollef M H. A trial of a real-time alert for clinical deterioration in patients hospitalized on general medical wards. Journal of Hospital Medicine, 2013, 8: 236–242
https://doi.org/10.1002/jhm.2009
3 Johnson R A, Gong R, Greatorex-Voith S, Anand A, Fritzler A. A datadriven framework for identifying high school students at risk of not graduating on time. Bloomberg Data for Good Exchange, 2015
4 Liu B, Hsu W. Post-analysis of learned rules. In: Proceedings of the AAAI Conference on Artificial Intelligence. 1996, 828–834
5 Liu B, Hsu W, Ma Y M. Pruning and summarizing the discovered associations. In: Proceedings of ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 1999, 125–134
https://doi.org/10.1145/312129.312216
6 Cao L B, Zhang C Q. Domain-driven, actionable knowledge discovery. IEEE Intelligent Systems, 2007, 22(4): 78–88
https://doi.org/10.1109/MIS.2007.67
7 Cao L B, Zhao Y C, Zhang H F, Luo D, Zhang C Q, Park E K. Flexible frameworks for actionable knowledge discovery. IEEE Transactions on Knowledge and Data Engineering, 2010, 22(9): 1299–1312
https://doi.org/10.1109/TKDE.2009.143
8 DeSarbo W S, Ramaswamy V. Crisp: customer response based iterative segmentation procedures for response modeling in direct marketing. Journal of Direct Marketing, 1994, 8(3): 7–20
https://doi.org/10.1002/dir.4000080304
9 Levin N, Zahavi J. Segmentation analysis with managerial judgment. Journal of Direct Marketing, 1996, 10(3): 28–47
https://doi.org/10.1002/(SICI)1522-7138(199622)10:3<28::AID-DIR3>3.0.CO;2-#
10 Moro S, Cortez P, Rita P. A data-driven approach to predict the success of bank telemarketing. Decision Support Systems, 2014, 62: 22–31
https://doi.org/10.1016/j.dss.2014.03.001
11 Hilderman R J, Hamilton H J. Applying objective interestingness measures in data mining systems. In: Proceedings of European Conference of Principles of Data Mining and Knowledge Discovery. 2000, 432–439
https://doi.org/10.1007/3-540-45372-5_47
12 Cao L B, Luo D, Zhang C Q. Knowledge actionability: satisfying technical and business interestingness. International Journal of Business Intelligence and Data Mining, 2007, 2(4): 496–514
https://doi.org/10.1504/IJBIDM.2007.016385
13 Cortez P, Embrechts M J. Using sensitivity analysis and visualization techniques to open black box data mining models. Information Sciences, 2013, 225: 1–17
https://doi.org/10.1016/j.ins.2012.10.039
14 Szegedy C, Zaremba W, Sutskever I, Bruna J, Erhan D, Goodfellow I, Fergus R. Intriguing properties of neural networks. In: Proceedings of the International Conference on Learning Representations. 2014
15 Yang Q, Yin J, Ling C, Chen T. Postprocessing decision trees to extract actionable knowledge. In: Proceedings of the 3rd IEEE International Conference on Data Mining. 2003, 685–688
https://doi.org/10.1109/ICDM.2003.1251008
16 Yang Q, Yin J, Ling C, Pan R. Extracting actionable knowledge from decision trees. IEEE Transactions on Knowledge and Data Engineering, 2007, 19(1): 43–56
https://doi.org/10.1109/TKDE.2007.250584
17 Cui Z C, Chen W L, He Y J, Chen Y X. Optimal action extraction for random forests and boosted trees. In: Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2015, 179–188
https://doi.org/10.1145/2783258.2783281
18 Friedman J, Hastie T, Tibshirani R. The Elements of Statistical Learning, Vol 1. New York: Springer-Verlag, 2001
19 Shotton J, Sharp T, Kipman A, Fitzgibbon A, Finocchio M, Blake A, Cook M, Moore R. Real-time human pose recognition in parts from single depth images. Communications of the ACM, 2013, 56(1): 116–124
https://doi.org/10.1145/2398356.2398381
20 Viola P, Jones M J. Robust real-time face detection. International Journal of Computer Vision, 2004, 57(2): 137–154
https://doi.org/10.1023/B:VISI.0000013087.49260.fb
21 Mohan A, Chen Z, Weinberger K. Web-search ranking with initialized gradient boosted regression trees. Journal of Machine Learning Research, 2011, 14: 77–89
22 Lu Q, Cui Z C, Chen Y X, Chen X P. Extracting optimal actionable plans from additive tree models. Frontiers of Computer Science, 2017, 11(1): 160–173
https://doi.org/10.1007/s11704-016-5273-4
23 Freund Y, Schapire R E. A decision-theoretic generalization of online learning and an application to boosting. Journal of Computer and System Sciences, 1997, 55: 119–139
https://doi.org/10.1006/jcss.1997.1504
24 Friedman J H. Greedy function approximation: a gradient boosting machine. The Annals of Statistics, 2001, 29: 1189–1232
https://doi.org/10.1214/aos/1013203451
25 Breiman L. Random forests. Machine Learning, 2001, 45(1): 5–32
https://doi.org/10.1023/A:1010933404324
26 Fox M, Long D. PDDL2.1: an extension to PDDL for expressing temporal planning domains. Journal of Artificial Intelligence Research, 2003, 20: 61–124
27 Bäckström C, Nebel B. Complexity results for SAS+ planning. Computational Intelligence, 1995, 11(4): 625–655
https://doi.org/10.1111/j.1467-8640.1995.tb00052.x
28 Jonsson P, Bäckström C. State-variable planning under structural restrictions: algorithms and complexity. Artificial Intelligence, 1998, 100(1–2): 125–176
https://doi.org/10.1016/S0004-3702(98)00003-4
29 Helmert M. The fast downward planning system. Journal of Artificial Intelligence Research, 2006, 26: 191–246
30 Kautz H A, Selman B. Planning as satisfiability. In: Proceedings of European Conference on Artificial Intelligence. 1992, 359–363
31 Blum A, Furst M L. Fast planning through planning graph analysis. Artificial Intelligence, 1997, 90(1–2): 281–300
https://doi.org/10.1016/S0004-3702(96)00047-1
32 Lu Q, Huang R Y, Chen Y X, Xu Y, Zhang W X, Chen G L. A SATbased approach to cost-sensitive temporally expressive planning. ACM Transactions on Intelligent Systems and Technology, 2014, 5(1): 18
33 Huang R Y, Chen Y X, Zhang W X. A novel transition based encoding scheme for planning as satisfiability. In: Proceedings of the AAAI Conference on Artificial Intelligence. 2010, 89–94
34 Huang R Y, Chen Y X, Zhang W X. SAS+ planning as satisfiability. Journal of Artificial Intelligence Research, 2012, 43: 293–328
35 Balyo T, Chrpa L, Kilani A. On different strategies for eliminating redundant actions from plans. In: Proceedings of the 7th Annual Symposium on Combinatorial Search. 2014, 10–18
[1] Xia-an BI, Yiming XIE, Hao WU, Luyun XU. Identification of differential brain regions in MCI progression via clustering-evolutionary weighted SVM ensemble algorithm[J]. Front. Comput. Sci., 2021, 15(6): 156903-.
[2] Yan-Ping SUN, Min-Ling ZHANG. Compositional metric learning for multi-label classification[J]. Front. Comput. Sci., 2021, 15(5): 155320-.
[3] Jian SUN, Pu-Feng DU. Predicting protein subchloroplast locations: the 10th anniversary[J]. Front. Comput. Sci., 2021, 15(2): 152901-.
[4] Jianpeng HU, Linpeng HUANG, Tianqi SUN, Ying FAN, Wenqiang HU, Hao ZHONG. Proactive planning of bandwidth resource using simulation-based what-if predictions forWeb services in the cloud[J]. Front. Comput. Sci., 2021, 15(1): 151201-.
[5] Syed Farooq ALI, Muhammad Aamir KHAN, Ahmed Sohail ASLAM. Fingerprint matching, spoof and liveness detection: classification and literature review[J]. Front. Comput. Sci., 2021, 15(1): 151310-.
[6] Huiping LIU, Cheqing JIN, Aoying ZHOU. Popular route planning with travel cost estimation from trajectories[J]. Front. Comput. Sci., 2020, 14(1): 191-207.
[7] Xu-Ying LIU, Sheng-Tao WANG, Min-Ling ZHANG. Transfer synthetic over-sampling for class-imbalance learning with limited minority class data[J]. Front. Comput. Sci., 2019, 13(5): 996-1009.
[8] Yu-Feng LI, De-Ming LIANG. Safe semi-supervised learning: a brief introduction[J]. Front. Comput. Sci., 2019, 13(4): 669-676.
[9] Wenhao ZHENG, Hongyu ZHOU, Ming LI, Jianxin WU. CodeAttention: translating source code to comments by exploiting the code constructs[J]. Front. Comput. Sci., 2019, 13(3): 565-578.
[10] Hao SHAO. Query by diverse committee in transfer active learning[J]. Front. Comput. Sci., 2019, 13(2): 280-291.
[11] Qingying SUN, Zhongqing WANG, Shoushan LI, Qiaoming ZHU, Guodong ZHOU. Stance detection via sentiment information and neural network model[J]. Front. Comput. Sci., 2019, 13(1): 127-138.
[12] Ruochen HUANG, Xin WEI, Liang ZHOU, Chaoping LV, Hao MENG, Jiefeng JIN. A survey of data-driven approach on multimedia QoE evaluation[J]. Front. Comput. Sci., 2018, 12(6): 1060-1075.
[13] Ashish Kumar DWIVEDI, Anand TIRKEY, Santanu Kumar RATH. Software design pattern mining using classification-based techniques[J]. Front. Comput. Sci., 2018, 12(5): 908-922.
[14] Bo SUN, Haiyan CHEN, Jiandong WANG, Hua XIE. Evolutionary under-sampling based bagging ensemble method for imbalanced data classification[J]. Front. Comput. Sci., 2018, 12(2): 331-350.
[15] Min-Ling ZHANG, Yu-Kun LI, Xu-Ying LIU, Xin GENG. Binary relevance for multi-label learning: an overview[J]. Front. Comput. Sci., 2018, 12(2): 191-202.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed