Multi-class classifier of non-speech audio based on Fisher kernel

doi:10.1007/s11460-009-0073-3

Front. Electr. Electron. Eng., 2010, Vol. 5, Issue 1: 72-76. https://doi.org/10.1007/s11460-009-0073-3

Research articles

Rongyan WANG, Gang LIU, Jun GUO, Yu FANG

Pattern Recognition and Intelligent System Laboratory, Beijing University of Posts and Telecommunications, Beijing 100876, China

Abstract Traditional multi-class classification methods based on the Fisher kernel combine the generative models of all classes, such as Gaussian mixture models (GMMs). However, this combination produces high-dimensional feature vectors and therefore a high computational cost. This paper proposes a new classification method that applies a feature space selection strategy: similar Gaussian mixtures are clustered in order to reduce the feature dimension. Audio classification experiments show that the proposed method is more accurate and requires less computation than traditional methods.
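The dimensionality issue described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which parameters the Fisher score is taken over or how similar mixtures are clustered, so the code assumes diagonal-covariance GMMs, a Fisher score over the component means only, and a hypothetical greedy merge of components whose means lie within a distance threshold (moment-matched). The names `fisher_score_means` and `merge_similar` are illustrative only.

```python
import math


def gaussian_logpdf(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at point x."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def fisher_score_means(frames, weights, means, variances):
    """Gradient of the average log-likelihood w.r.t. every component mean.

    The result is a flat vector of length (number of components) * (feature dim),
    so pooling the GMMs of all classes makes it grow with the number of classes.
    """
    n_comp, dim = len(means), len(means[0])
    score = [0.0] * (n_comp * dim)
    for x in frames:
        logs = [math.log(w) + gaussian_logpdf(x, m, v)
                for w, m, v in zip(weights, means, variances)]
        top = max(logs)  # subtract the maximum for numerical stability
        unnorm = [math.exp(value - top) for value in logs]
        total = sum(unnorm)
        for k in range(n_comp):
            post = unnorm[k] / total  # posterior of component k given x
            for d in range(dim):
                score[k * dim + d] += post * (x[d] - means[k][d]) / variances[k][d]
    return [s / len(frames) for s in score]


def merge_similar(weights, means, variances, threshold):
    """Assumed clustering rule: greedily merge components whose means are closer
    than `threshold` (Euclidean), using a moment-matched mean and variance."""
    merged = []  # each entry is [weight, mean, variance]
    for w, m, v in zip(weights, means, variances):
        for comp in merged:
            if math.dist(comp[1], m) < threshold:
                total = comp[0] + w
                new_mean = [(comp[0] * a + w * b) / total
                            for a, b in zip(comp[1], m)]
                new_var = [
                    (comp[0] * (av + a * a) + w * (bv + b * b)) / total - nm * nm
                    for av, a, bv, b, nm in zip(comp[2], comp[1], v, m, new_mean)
                ]
                comp[0], comp[1], comp[2] = total, new_mean, new_var
                break
        else:
            merged.append([w, list(m), list(v)])
    ws, ms, vs = zip(*merged)
    return list(ws), list(ms), list(vs)


# Two toy class GMMs (weights, means, variances) in 2-D; the values are made up.
class_a = ([0.5, 0.5], [[0.0, 0.0], [5.0, 5.0]], [[1.0, 1.0], [1.0, 1.0]])
class_b = ([0.5, 0.5], [[0.1, 0.0], [-5.0, 5.0]], [[1.0, 1.0], [1.0, 1.0]])

# Pool all components into one model; halve the weights so they sum to one.
weights = [w / 2 for w in class_a[0] + class_b[0]]
means = class_a[1] + class_b[1]
variances = class_a[2] + class_b[2]

frames = [[0.2, -0.1], [4.8, 5.1], [0.0, 0.3]]
full = fisher_score_means(frames, weights, means, variances)

m_weights, m_means, m_vars = merge_similar(weights, means, variances, threshold=0.5)
reduced = fisher_score_means(frames, m_weights, m_means, m_vars)

print(len(full), len(reduced))
```

In this toy setup the two near-identical components (one from each class) collapse into one, so the score vector passed to a downstream classifier such as an SVM becomes shorter. The threshold and the merge rule are placeholders for whatever similarity criterion the paper actually uses.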

Keywords Fisher kernel, support vector machine (SVM), Gaussian mixture model (GMM), mixture clustering

Issue Date: 05 March 2010

Cite this article:

Rongyan WANG, Jun GUO, Gang LIU, et al. Multi-class classifier of non-speech audio based on Fisher kernel[J]. Front. Electr. Electron. Eng., 2010, 5(1): 72-76.

URL:

https://academic.hep.com.cn/fee/EN/10.1007/s11460-009-0073-3
https://academic.hep.com.cn/fee/EN/Y2010/V5/I1/72
