Achieving data-driven actionability by combining learning and planning

Front. Comput. Sci., 2018, Vol. 12, Issue 5: 939-949
https://doi.org/10.1007/s11704-017-6315-2

RESEARCH ARTICLE

Qiang LV, Yixin CHEN, Zhaorong LI, Zhicheng CUI, Ling CHEN, Xing ZHANG, Haihua SHEN

¹. College of Information Engineering, Yangzhou University, Yangzhou 225127, China
². Department of Computer Science and Engineering, Washington University in St. Louis, St. Louis MO 63130, USA
³. School of Management, Fudan University, Shanghai 200433, China
⁴. School of Computer and Control Engineering, University of Chinese Academy of Sciences, Beijing 100049, China

Abstract

A main focus of machine learning research has been improving the generalization accuracy and efficiency of prediction models. What is missing in many applications, however, is actionability, i.e., the ability to turn prediction results into actions. Existing efforts to derive such actionable knowledge are few and limited to simple action models, whereas in many real applications the action models are more complex and an optimal solution is harder to extract. In this paper, we propose a novel approach that achieves actionability by combining learning with planning, two core areas of AI. In particular, we propose a framework to extract actionable knowledge from random forests, one of the most widely used and best off-the-shelf classifiers. We formulate the actionability problem as a sub-optimal action planning (SOAP) problem: find a plan that alters certain features of a given input so that the random forest yields a desirable output, while minimizing the total cost of the actions. Technically, the SOAP problem is formulated in the SAS+ planning formalism and solved using a Max-SAT-based approach. Our experimental results demonstrate the effectiveness and efficiency of the proposed approach on a personal credit dataset and other benchmarks. Our work represents a new application of automated planning to an emerging and challenging machine learning paradigm.
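To make the problem statement concrete, the following is a minimal sketch of the underlying search, not the SAS+/Max-SAT method the abstract describes. It assumes a hypothetical hand-written three-tree "forest", made-up feature domains, and made-up per-feature change costs, and it simply enumerates every target state to find the cheapest set of feature changes that pushes the forest's average score past a threshold. All names (`tree_a`, `DOMAINS`, `COSTS`, `cheapest_actions`) are illustrative and do not come from the paper.

```python
from itertools import product

# Hypothetical toy "forest": each tree maps a feature dict to the
# probability of the desirable class. These trees are hand-written
# for illustration, not learned from any dataset.
def tree_a(x):
    return 0.9 if x["income"] == "high" else 0.2

def tree_b(x):
    if x["debt"] == "low":
        return 0.8
    return 0.6 if x["income"] == "high" else 0.1

def tree_c(x):
    return 0.7 if x["has_job"] else 0.3

FOREST = [tree_a, tree_b, tree_c]

def forest_score(x):
    """Average the per-tree probabilities, as a random forest would."""
    return sum(tree(x) for tree in FOREST) / len(FOREST)

# Assumed discrete feature domains and the cost of setting a feature
# to a given value (only charged when the value actually changes).
DOMAINS = {
    "income": ["low", "high"],
    "debt": ["low", "high"],
    "has_job": [False, True],
}
COSTS = {
    ("income", "high"): 6.0, ("income", "low"): 0.0,
    ("debt", "low"): 2.0, ("debt", "high"): 0.0,
    ("has_job", True): 3.0, ("has_job", False): 0.0,
}

def cheapest_actions(x0, threshold=0.5):
    """Exhaustively search target states; return (cost, actions, target)
    for the cheapest state whose forest score reaches the threshold,
    or None if no state qualifies."""
    names = list(DOMAINS)
    best = None
    for values in product(*(DOMAINS[n] for n in names)):
        target = dict(zip(names, values))
        if forest_score(target) < threshold:
            continue
        # An "action" here is just a (feature, new_value) change.
        actions = [(n, target[n]) for n in names if target[n] != x0[n]]
        cost = sum(COSTS[a] for a in actions)
        if best is None or cost < best[0]:
            best = (cost, actions, target)
    return best

if __name__ == "__main__":
    start = {"income": "low", "debt": "high", "has_job": False}
    print(cheapest_actions(start))
```

Exhaustive enumeration grows exponentially with the number of features, and this sketch ignores action preconditions and ordering. Those limitations are the kind of difficulty that motivates encoding the problem for a planner and a Max-SAT solver instead.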

Keywords: actionable knowledge extraction, machine learning, planning, random forest

Corresponding Author(s): Qiang LV, Haihua SHEN

Just Accepted Date: 05 January 2017; Online First Date: 06 March 2018; Issue Date: 21 September 2018

Cite this article:

Qiang LV, Yixin CHEN, Zhaorong LI, et al. Achieving data-driven actionability by combining learning and planning[J]. Front. Comput. Sci., 2018, 12(5): 939-949.

URL:

https://academic.hep.com.cn/fcs/EN/10.1007/s11704-017-6315-2
https://academic.hep.com.cn/fcs/EN/Y2018/V12/I5/939
