Frontiers of Electrical and Electronic Engineering

ISSN 2095-2732

ISSN 2095-2740(Online)

CN 10-1028/TM

Front Elect Electr Eng Chin, 2011, 6(2): 256-274    https://doi.org/10.1007/s11460-011-0150-2
RESEARCH ARTICLE
Parameterizations make different model selections: Empirical findings from factor analysis
Shikui TU, Lei XU()
Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, China
Abstract

How parameterizations affect model selection performance is an issue that has been ignored or seldom studied, since traditional model selection criteria, such as Akaike’s information criterion (AIC), Schwarz’s Bayesian information criterion (BIC), and the difference of negative log-likelihood (DNLL), perform equivalently on different parameterizations that share equivalent likelihood functions. For factor analysis (FA), in addition to one traditional parameterization (denoted FA-a for short), it was previously found that there is another parameterization (denoted FA-b) and that Bayesian Ying-Yang (BYY) harmony learning achieves different model selection performance on FA-a and FA-b. This paper investigates a family of FA parameterizations with equivalent likelihood functions, in which each member (denoted FA-r) is characterized by an integer r, with FA-a at one end (r = 0) and FA-b at the other (r at its upper bound). In addition to comparing BYY learning with AIC, BIC, and DNLL, we also implement variational Bayes (VB). Several empirical findings have been obtained via extensive experiments. First, both BYY and VB perform clearly better on FA-b than on FA-a, and this superiority of FA-b is reliable and robust. Second, both BYY and VB outperform AIC, BIC, and DNLL, while BYY further outperforms VB considerably, especially on FA-b. Moreover, when FA-a is replaced by FA-b, the gain obtained by BYY is clearly higher than that obtained by VB, whereas AIC, BIC, and DNLL gain nothing. Third, this paper also demonstrates how each part of the priors incrementally and jointly improves the performance, and further shows that using VB to optimize the hyperparameters of the priors deteriorates the performance, whereas using BYY for this purpose improves it further.
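As a small illustration of the parameterization-invariance the abstract refers to, the following sketch selects the number of factors for FA by BIC. This is not the paper's code: the synthetic data and the use of scikit-learn's maximum-likelihood FactorAnalysis are our assumptions. Because AIC and BIC depend only on the likelihood value and the free-parameter count, any FA parameterization with an equivalent likelihood yields the same selection.

```python
# Sketch (hypothetical setup, not the paper's experiments): fit FA by
# maximum likelihood for a range of hidden dimensions m and pick the m
# that minimizes BIC = k*ln(n) - 2*ln(L).
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
n, d, true_m = 1000, 10, 3
A = rng.normal(size=(d, true_m))                 # loading matrix
z = rng.normal(size=(n, true_m))                 # latent factors
X = z @ A.T + 0.5 * rng.normal(size=(n, d))      # noisy observations

def bic_for(m):
    fa = FactorAnalysis(n_components=m, random_state=0).fit(X)
    loglik = fa.score(X) * n                     # score() is per-sample
    # Free parameters: loadings plus diagonal noise variances, minus
    # the rotational indeterminacy of the factor space.
    k = d * m + d - m * (m - 1) // 2
    return k * np.log(n) - 2.0 * loglik

candidates = list(range(1, 7))
bics = [bic_for(m) for m in candidates]
best_m = candidates[int(np.argmin(bics))]
print(best_m)
```

Replacing the penalty `k * np.log(n)` with `2 * k` gives AIC instead; either way the criterion never sees the parameterization itself, only the likelihood and k, which is why criteria of this kind cannot distinguish FA-a from FA-b.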

Keywords: model selection; factor analysis; parameterizations; maximum likelihood; variational Bayes; Bayesian Ying-Yang learning
Corresponding Author(s): Lei XU, Email: lxu@cse.cuhk.edu.hk
Issue Date: 05 June 2011
 Cite this article:   
Shikui TU, Lei XU. Parameterizations make different model selections: Empirical findings from factor analysis[J]. Front Elect Electr Eng Chin, 2011, 6(2): 256-274.
 URL:  
https://academic.hep.com.cn/fee/EN/10.1007/s11460-011-0150-2