|
|
Approximation of kernel matrices by circulant
matrices and its application in kernel selection methods |
Guohui SONG1,Yuesheng XU2, 3, |
1.Department of Mathematics,
Syracuse University, Syracuse, NY 13244, USA; 2.Department of Mathematics,
Syracuse University, Syracuse, NY 13244, USA;School of Mathematics
and Computational Sciences, Sun Yat-sen University, Guangzhou 510275,
China; 3.2010-03-15 9:33:37; |
|
|
Abstract This paper focuses on developing fast numerical algorithms for selection of a kernel optimal for a given training data set. The optimal kernel is obtained by minimizing a cost functional over a prescribed set of kernels. The cost functional is defined in terms of a positive semi-definite matrix determined completely by a given kernel and the given sampled input data. Fast computational algorithms are developed by approximating the positive semi-definite matrix by a related circulant matrix so that the fast Fourier transform can apply to achieve a linear or quasi-linear computational complexity for finding the optimal kernel. We establish convergence of the approximation method. Numerical examples are presented to demonstrate the approximation accuracy and computational efficiency of the proposed methods.
|
Keywords
Optimal kernel
reproducing kernel
reproducing kernel Hilbert space
learning with kernel
circulant matrix
B-spline kernel
|
Issue Date: 05 March 2010
|
|
|
Argyriou A, Micchelli C A, Pontil M, Ying Y. A spectralregularization framework for multi-task structure learning. NIPS, 2008, 20: 25―32
|
|
Aronszajn N. Theoryof reproducing kernels. Trans Amer MathSoc, 1950, 68: 337―404
doi: 10.2307/1990404
|
|
Bhatia R. MatrixAnalysis. New York: Springer, 1997
|
|
Bochner S. Lectureson Fourier Integrals. Princeton: Princeton University Press, 1959
|
|
de Boor C. APractical Guide to Splines. New York: Springer-Verlag, 1978
|
|
Bousquet O, Herrmann D J L. On the complexity of learningthe kernel matrix. NIPS, 2003, 15: 399―406
|
|
Chapelle O, Vapnik V, Bousquet O, Mukherjee S. Choosing multiple parameters for support vector machines. Machine Learning, 2002, 46: 131―159
doi: 10.1023/A:1012450327387
|
|
Cormen T H, Leiserson C E, Rivest R L, Stein C. Introductionto Algorithms. Boston: MIT Press, 2001
|
|
Cristianini N, Shawe-Taylor J, Elisseeff A, Kandola J. On kernel-target alignment. NIPS, 2002, 14: 367―373
|
|
Cucker F, Smale S. On the mathematical foundationsof learning. Bull Amer Math Soc, 2002, 39: 1―49
doi: 10.1090/S0273-0979-01-00923-5
|
|
Davis P J. Circulant Matrices. New York: John Wiley & SonsInc, 1979
|
|
Demko S, Moss W, Smith P. Decay rates for inverses of band matrices. Math Comput, 1984, 43: 491―499
doi: 10.2307/2008290
|
|
Drucker H, Wu D, Vapnik V N. Support vector machines for span categorization. IEEE Trans Neural Networks, 1999, 10: 1048―1054
doi: 10.1109/72.788645
|
|
Durrett R. Probability:Theory and Examples. Belmont: Duxbury Press, 1996
|
|
Gasquet C, Witomski P. Fourier Analysis and Applications. New York: Springer-Verlag, 1999
|
|
Gelfand I, Raikov D, Shilov G. Commutative Normed Rings. Bronx: Chelsea Publishing Company, New York, 1964
|
|
Gray R M. Toeplitz and Circulant Matrices: A Review. Boston: Now Publishers Inc, 2006
|
|
Grenander U, Szegö G. Toeplitz Forms and TheirApplications. Berkeley and Los Angeles: University of Calif Press, 1958
|
|
Horn R A, Johnson C R. Matrix Analysis. Cambridge: Cambridge University Press, 1986
|
|
Kimeldorf G, Wahba G. Some results on Tchebycheffianspline functions. J Math Anal Appl, 1971, 33: 82―95
doi: 10.1016/0022-247X(71)90184-3
|
|
Lanckriet G R G, Cristianini N, Bartlett P, El Ghaoui L, Jordan M I. Learning the kernel matrixwith semi-definite programming. J MachLearn Res, 2004, 5: 27―72
|
|
Lin Y, Brown L D. Statistical properties ofthe method of regularization with periodic Gaussian reproducing kernel. Ann Statis, 2004, 32: 1723―1743
doi: 10.1214/009053604000000454
|
|
Micchelli C A, Pontil M. Learning the kernel functionvia regularization. J Mach Learn Res, 2005, 6: 1099―1125
|
|
Micchelli C A, Pontil M. On learning vector-valuedfunctions. Neural Comput, 2005, 17: 177―204
doi: 10.1162/0899766052530802
|
|
Müller K-R, Smola A J, Rätsch G, Schölkopf B, Kohlmorgen J, Vapnik V N. Predicting time series with support vector machines. Lecture Notes in Computer Science, 1997, 1327: 999―1004
doi: 10.1007/BFb0020283
|
|
Schölkopf B, Smola A J. Learning with Kernels: SupportVector Machines, Regularization, Optimization, and Beyond. Cambridge: MITPress, 2004
|
|
Serre T, Wolf L, Bileschi S, Riesenhuber M, Poggio T. Robust object recognitionwith cortex-like mechanisms. IEEE TransPattern Analysis and Machine Intelligence, 2007, 29: 411―426
doi: 10.1109/TPAMI.2007.56
|
|
Shawe-Taylor J, Cristianini N. Kernel Methods for PatternAnalysis. Cambridge: Cambridge University Press, 2004
|
|
Smola A J, Schölkopf B. On a kernel-based methodfor pattern recognition, regression, approximation, and operator inversion. Algorithmica, 1998, 22: 211―231
doi: 10.1007/PL00013831
|
|
Sung K K, Poggio T. Example-based learning forview-based human face detection. IEEE TransPattern Analysis and Machine Intelligence, 1998, 20: 39―51
doi: 10.1109/34.655648
|
|
Vapnik V N. Statistical Learning Theory. New York: Wiley, 1998
|
|
Xu Y, Zhang H. Refinable kernels. J Mach Learn Res, 2007, 8: 2083―2120
|
|
Xu Y, Zhang H. Refinement of reproducingkernels. J Mach Learn Res, 2009, 10: 137―170
|
|
Ying Y, Zhou D X. Learnability of Gaussianswith flexible variances. J Mach Learn Res, 2007, 8: 249―276
|
|
Viewed |
|
|
|
Full text
|
|
|
|
|
Abstract
|
|
|
|
|
Cited |
|
|
|
|
|
Shared |
|
|
|
|
|
Discussed |
|
|
|
|