Extracting optimal actionable plans from additive tree models
Qiang LU 1,3, Zhicheng CUI 2, Yixin CHEN 2, Xiaoping CHEN 3
1. College of Information Engineering, Yangzhou University, Yangzhou 225127, China
2. Department of Computer Science and Engineering, Washington University in St. Louis, St. Louis MO 63130, USA
3. School of Computer Science and Technology, University of Science and Technology of China, Hefei 230026, China

Abstract Although machine learning has made substantial progress toward high generalization accuracy and efficiency, little work addresses how to derive meaningful decision-making actions from the resulting models. Yet in many applications, such as advertising, recommendation systems, social networks, customer relationship management, and clinical prediction, users need not only accurate predictions but also suggestions for actions that achieve a desirable goal (e.g., high ad hit rates) or avert an undesirable predicted result (e.g., clinical deterioration). Existing work on extracting such actionability is scarce and limited to simple models such as a decision tree. The dilemma is that models with high accuracy are often more complex, which makes actionability harder to extract from them. In this paper, we propose an effective method for extracting actionable knowledge from additive tree models (ATMs), one of the most widely used and best off-the-shelf classifiers. We rigorously formulate the optimal actionable planning (OAP) problem for a given ATM: extract an actionable plan for a given input so that the input achieves a desirable output while the net profit is maximized. Based on a state space graph formulation, we first propose an optimal heuristic search method that aims to find an optimal solution. We then present a sub-optimal heuristic search with an admissible and consistent heuristic function, which remarkably improves the efficiency of the algorithm. Experimental results on several real datasets from the personal credit and banking domain demonstrate the effectiveness and efficiency of the proposed algorithms.
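
To make the state space formulation concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a hypothetical two-tree model, hypothetical features (`income`, `debt`), made-up per-action costs, and a simplified objective: find the cheapest sequence of attribute changes that pushes the model output above a target, using plain uniform-cost search (no heuristic) rather than the paper's net-profit objective and heuristic functions.

```python
import heapq
from itertools import count

# Hypothetical toy model (an assumption, not from the paper): each tree is
# either a leaf score (float) or a tuple (feature, threshold, left, right).
TREES = [
    ("income", 2, 0.2, ("debt", 1, 0.9, 0.4)),
    ("debt", 1, 0.8, 0.3),
]
WEIGHTS = [0.5, 0.5]

# Hypothetical cost of changing each feature, and the values it may take.
COSTS = {"income": 3.0, "debt": 1.0}
DOMAIN = {"income": [1, 2, 3], "debt": [0, 1, 2]}


def tree_score(tree, x):
    """Walk one tree: go left when x[feature] < threshold, else right."""
    while not isinstance(tree, float):
        feature, threshold, left, right = tree
        tree = left if x[feature] < threshold else right
    return tree


def atm_score(x):
    """Additive tree model output: weighted sum of the tree scores."""
    return sum(w * tree_score(t, x) for w, t in zip(WEIGHTS, TREES))


def cheapest_plan(start, target):
    """Uniform-cost search for the cheapest plan with atm_score >= target.

    Each state is a full attribute assignment; each action sets one
    attribute to a different value at a fixed cost. Returns (cost, plan),
    or None when no reachable state meets the target.
    """
    tie = count()  # tie-breaker so the heap never compares dicts
    frontier = [(0.0, next(tie), start, [])]
    seen = set()
    while frontier:
        cost, _, state, plan = heapq.heappop(frontier)
        key = tuple(sorted(state.items()))
        if key in seen:
            continue
        seen.add(key)
        if atm_score(state) >= target:
            return cost, plan
        for feature, values in DOMAIN.items():
            for value in values:
                if value != state[feature]:
                    nxt = dict(state, **{feature: value})
                    heapq.heappush(
                        frontier,
                        (cost + COSTS[feature], next(tie), nxt,
                         plan + [(feature, value)]),
                    )
    return None


if __name__ == "__main__":
    print(cheapest_plan({"income": 1, "debt": 2}, target=0.8))
```

In this toy setting the start state scores 0.25, and the only states that reach 0.8 require both lowering `debt` to 0 and raising `income` to at least 2. Replacing the zero heuristic implicit in uniform-cost search with an admissible estimate of the remaining cost would turn the same loop into A*, which is the general family of search the abstract refers to.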

Keywords: actionable knowledge extraction, machine learning, additive tree models, state space search

Corresponding Author(s): Qiang LU

Just Accepted Date: 18 January 2016
Online First Date: 25 July 2016
Issue Date: 11 January 2017