Frontiers of Computer Science

ISSN 2095-2228

ISSN 2095-2236(Online)

CN 10-1014/TP

Postal Subscription Code 80-970

2018 Impact Factor: 1.129

Front. Comput. Sci.    2014, Vol. 8 Issue (2) : 255-264    https://doi.org/10.1007/s11704-014-3038-5
RESEARCH ARTICLE
Naive Bayes for value difference metric
Chaoqun LI1,*(),Liangxiao JIANG2,Hongwei LI1
1. Department of Mathematics, China University of Geosciences,Wuhan 430074, China
2. Department of Computer Science, China University of Geosciences,Wuhan 430074, China
Abstract

The value difference metric (VDM) is one of the best-known and most widely used distance functions for nominal attributes. This work applies the instance-weighting technique to improve VDM, proposing an instance-weighted value difference metric (IWVDM). Unlike prior work, IWVDM uses naive Bayes (NB) to find weights for training instances. Because early work has shown a close relationship between VDM and NB, techniques developed for NB can be carried over to VDM. The weight of a training instance x belonging to class c is assigned according to the difference between the conditional probability P^(c|x) estimated by NB and the true conditional probability P(c|x), and the weight is adjusted iteratively. Compared with previous work, IWVDM reduces the time complexity of finding the weights while simultaneously improving the performance of VDM. Experimental results on 36 UCI datasets validate the effectiveness of IWVDM.
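To make the abstract's two ingredients concrete, the sketch below implements the standard VDM (the distance between two nominal values is the summed gap between their class-conditional probabilities, d(x1, x2) = Σ_i Σ_c |P(c|x1_i) − P(c|x2_i)|^q) together with a weight-aware frequency table, so that instance weights change the estimated P(c|v) as IWVDM requires. The weight-update rule shown (`update_weights_nb`, nudging each weight by the gap between a target probability of 1 for the true class and NB's estimate) is an assumption for illustration — the abstract only says weights are set from the difference between P^(c|x) and P(c|x) and adjusted iteratively; the paper's exact rule and the helper name are hypothetical.

```python
from collections import defaultdict

def vdm_tables(X, y, weights=None):
    """Per-attribute class-conditional probabilities P(c | value).
    Each training instance contributes its weight to the frequency
    counts; uniform weights recover the plain VDM estimates."""
    if weights is None:
        weights = [1.0] * len(X)
    n_attrs = len(X[0])
    counts = [defaultdict(lambda: defaultdict(float)) for _ in range(n_attrs)]
    for row, c, w in zip(X, y, weights):
        for i, v in enumerate(row):
            counts[i][v][c] += w
    classes = sorted(set(y))
    probs = []
    for table in counts:
        p = {}
        for v, by_class in table.items():
            total = sum(by_class.values())
            p[v] = {c: by_class[c] / total for c in classes}
        probs.append(p)
    return probs, classes

def vdm_distance(x1, x2, probs, classes, q=2):
    """VDM(x1, x2) = sum_i sum_c |P(c | x1_i) - P(c | x2_i)|^q."""
    d = 0.0
    for i, (a, b) in enumerate(zip(x1, x2)):
        pa, pb = probs[i].get(a, {}), probs[i].get(b, {})
        d += sum(abs(pa.get(c, 0.0) - pb.get(c, 0.0)) ** q for c in classes)
    return d

def update_weights_nb(weights, X, y, predict_proba, rate=1.0):
    """One iteration of the abstract's idea (assumed form): raise the
    weight of instances whose true class NB under-estimates, taking the
    'true' P(c|x) to be 1 for the labelled class."""
    for j, (row, c) in enumerate(zip(X, y)):
        weights[j] += rate * (1.0 - predict_proba(row, c))
    return weights

# Toy data: attribute 0 predicts the class perfectly, attribute 1 is noise.
X = [["red", "small"], ["red", "large"], ["blue", "small"], ["blue", "large"]]
y = ["yes", "yes", "no", "no"]
probs, classes = vdm_tables(X, y)
d_noise = vdm_distance(["red", "small"], ["red", "large"], probs, classes)  # 0.0
d_class = vdm_distance(["red", "small"], ["blue", "small"], probs, classes)  # 2.0
```

Note how VDM ignores the difference on the noise attribute (both values give P(c|v) = 0.5 for each class) but penalises the difference on the class-predictive one, which is exactly what makes it preferable to overlap-based metrics for nominal data.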

Keywords: value difference metric; instance weighting; naive Bayes; distance-based learning algorithms
Corresponding Author(s): Chaoqun LI   
Issue Date: 24 June 2014
 Cite this article:   
Chaoqun LI,Liangxiao JIANG,Hongwei LI. Naive Bayes for value difference metric[J]. Front. Comput. Sci., 2014, 8(2): 255-264.
 URL:  
https://academic.hep.com.cn/fcs/EN/10.1007/s11704-014-3038-5
https://academic.hep.com.cn/fcs/EN/Y2014/V8/I2/255