Frontiers of Electrical and Electronic Engineering

ISSN 2095-2732

ISSN 2095-2740(Online)

CN 10-1028/TM

Front Elect Electr Eng Chin    2011, Vol. 6 Issue (4) : 542-546    https://doi.org/10.1007/s11460-011-0181-8
RESEARCH ARTICLE
Speech enhancement based on modified a priori SNR estimation
Yu FANG, Gang LIU, Jun GUO
Pattern Recognition and Intelligent System Laboratory, Beijing University of Posts and Telecommunications, Beijing 100876, China
Abstract

To address the frame delay of the decision-directed (DD) estimator, whose spectral gain matches the previous frame rather than the current one, Plapous et al. [IEEE Transactions on Audio, Speech, and Language Processing, 2006, 14(6): 2098–2108] introduced the two-step noise reduction (TSNR) technique to improve the performance of speech enhancement systems. However, the TSNR approach produces spectral peaks of short duration and broken spectral outliers, which degrade the spectral characteristics of the speech. To solve this problem, a cepstral smoothing step is added to remove the spectral peaks introduced by the TSNR approach. Theoretical analysis shows that the proposed approach effectively smooths the spectral peaks while keeping the spectral outliers, thereby protecting the speech characteristics. Experimental results also show that the proposed approach yields a significant improvement over the DD and TSNR approaches, especially in non-stationary noise environments.
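As a rough illustration of the two estimators the abstract compares, the sketch below computes the DD and TSNR a priori SNR estimates for a single frame, following the commonly cited forms from Ephraim and Malah and from Plapous et al. The function names, the smoothing factor `beta = 0.98`, the use of a Wiener gain, and the choice to feed the step-1 estimate back as the previous-frame clean power are assumptions made for illustration, not details taken from this article. The cepstral smoothing step proposed here is not shown.

```python
def wiener_gain(xi):
    """Wiener gain for an a priori SNR value xi >= 0."""
    return xi / (1.0 + xi)


def dd_tsnr_frame(noisy_power, noise_power, prev_clean_power, beta=0.98):
    """Estimate the a priori SNR of one frame with DD, then refine it with TSNR.

    All arguments are per-frequency-bin lists of power values:
      noisy_power      |X(p,k)|^2 of the current noisy frame
      noise_power      estimated noise power (must be > 0)
      prev_clean_power estimated clean-speech power of the previous frame
    Returns (xi_dd, xi_tsnr, gain_tsnr, clean_power_dd).
    """
    xi_dd, xi_tsnr, gain_tsnr, clean_power_dd = [], [], [], []
    for x2, n2, s2_prev in zip(noisy_power, noise_power, prev_clean_power):
        post = x2 / n2  # a posteriori SNR of the current frame

        # Step 1: decision-directed estimate, a weighted mix of the
        # previous-frame estimate and the current instantaneous SNR.
        xi1 = beta * s2_prev / n2 + (1.0 - beta) * max(post - 1.0, 0.0)
        g1 = wiener_gain(xi1)

        # Step 2: apply the step-1 gain to the current frame and
        # re-estimate the a priori SNR from that result.
        xi2 = (g1 ** 2) * x2 / n2

        xi_dd.append(xi1)
        xi_tsnr.append(xi2)
        gain_tsnr.append(wiener_gain(xi2))
        # Assumption: the step-1 output is what gets fed back as
        # prev_clean_power when processing the next frame.
        clean_power_dd.append((g1 ** 2) * x2)
    return xi_dd, xi_tsnr, gain_tsnr, clean_power_dd


if __name__ == "__main__":
    # Speech onset: no speech in the previous frame, strong speech now.
    xi_dd, xi_tsnr, gain, _ = dd_tsnr_frame([100.0], [1.0], [0.0])
    print(xi_dd[0], xi_tsnr[0], gain[0])
```

At a speech onset (previous clean power of zero, a posteriori SNR of 100), this sketch gives a DD estimate of about 1.98 and a TSNR estimate of about 44.1, which illustrates how the second step reduces the one-frame lag of the DD estimate.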

Keywords: speech enhancement, decision-directed (DD), two-step noise reduction (TSNR), signal-to-noise ratio (SNR) estimation
Corresponding Author(s): FANG Yu, Email: anniefangyu@gmail.com
Issue Date: 05 December 2011
 Cite this article:   
Yu FANG, Gang LIU, Jun GUO. Speech enhancement based on modified a priori SNR estimation[J]. Front Elect Electr Eng Chin, 2011, 6(4): 542-546.
 URL:  
https://academic.hep.com.cn/fee/EN/10.1007/s11460-011-0181-8
https://academic.hep.com.cn/fee/EN/Y2011/V6/I4/542
Fig.1  Block diagram of modified noise reduction system
Noise type    Input SNR/dB    Averaged segmental SNR improvement/dB
                              DD      TSNR    Proposed approach
white         0               5.31    5.74    6.42
              5               3.96    4.50    5.43
              10              2.69    3.49    4.20
babble        0               4.45    4.63    5.23
              5               3.32    3.98    4.83
              10              2.61    3.57    3.92
HF channel    0               4.18    4.47    4.97
              5               3.93    3.94    4.62
              10              2.85    3.63    4.16
Tab.1  Output segmental SNR improvements of the proposed approach, the DD approach, and the TSNR approach under various noise types and input SNR conditions
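For readers unfamiliar with the metric in Tab.1, the sketch below shows one common way to compute a segmental SNR improvement: the frame-averaged SNR of the enhanced signal minus that of the noisy input, both measured against the clean reference. The frame length of 256 samples, the non-overlapping frames, and the clamping of per-frame values to [-10, 35] dB are conventional choices assumed here; this page does not state the exact settings behind the reported figures.

```python
import math


def segmental_snr(clean, processed, frame_len=256, lo_db=-10.0, hi_db=35.0):
    """Mean per-frame SNR in dB of `processed` measured against `clean`.

    Frames are non-overlapping; each frame's SNR is clamped to
    [lo_db, hi_db] (assumed limits) before averaging.
    """
    n_frames = min(len(clean), len(processed)) // frame_len
    if n_frames == 0:
        raise ValueError("signals are shorter than one frame")
    total = 0.0
    for m in range(n_frames):
        seg_clean = clean[m * frame_len:(m + 1) * frame_len]
        seg_proc = processed[m * frame_len:(m + 1) * frame_len]
        signal_energy = sum(s * s for s in seg_clean)
        error_energy = sum((s - y) ** 2 for s, y in zip(seg_clean, seg_proc))
        if error_energy == 0.0:
            snr_db = hi_db      # perfect reconstruction of this frame
        elif signal_energy == 0.0:
            snr_db = lo_db      # silent reference frame
        else:
            snr_db = 10.0 * math.log10(signal_energy / error_energy)
        total += min(max(snr_db, lo_db), hi_db)
    return total / n_frames


def segmental_snr_improvement(clean, noisy, enhanced, **kwargs):
    """Segmental SNR of the enhanced signal minus that of the noisy input."""
    return (segmental_snr(clean, enhanced, **kwargs)
            - segmental_snr(clean, noisy, **kwargs))


if __name__ == "__main__":
    clean = [1.0] * 8
    noisy = [2.0] * 8       # error energy equals signal energy: 0 dB
    enhanced = [1.1] * 8    # error energy is 1% of signal energy: 20 dB
    print(segmental_snr_improvement(clean, noisy, enhanced, frame_len=4))
```

Under this definition, a positive value means the enhanced signal is closer to the clean reference than the noisy input was, averaged over frames.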
Fig.2  Magnitude of speech. (a) Clean speech; (b) noisy speech signal corrupted by HF channel noise at 5 dB; (c) enhanced speech using DD approach; (d) enhanced speech using TSNR approach; (e) enhanced speech using the proposed approach
1 Ephraim Y, Malah D. Speech enhancement using a minimum mean square error short time spectral amplitude estimator. IEEE Transactions on Acoustics, Speech, and Signal Processing, 1984, 32(6): 1109-1121. doi: 10.1109/TASSP.1984.1164453
2 Boll S F. Suppression of acoustic noise in speech using spectral subtraction. IEEE Transactions on Acoustics, Speech, and Signal Processing, 1979, 27(2): 113-120. doi: 10.1109/TASSP.1979.1163209
3 Cohen I. Relaxed statistical model for speech enhancement and a priori SNR estimation. IEEE Transactions on Speech and Audio Processing, 2005, 13(5): 870-881. doi: 10.1109/TSA.2005.851940
4 Cohen I. Speech enhancement using a noncausal a priori SNR estimator. IEEE Signal Processing Letters, 2004, 11(9): 725-728. doi: 10.1109/LSP.2004.833478
5 Plapous C, Marro C, Scalart P. Improved signal-to-noise ratio estimation for speech enhancement. IEEE Transactions on Audio, Speech, and Language Processing, 2006, 14(6): 2098-2108
6 Mauler D, Gerkmann T, Martin R. An analysis of quefrency selective temporal smoothing of the cepstrum in speech enhancement. In: Proceedings of the 11th International Workshop on Acoustic Echo and Noise Control. 2008, 1-4
7 Noll A M. Cepstrum pitch estimation. Journal of the Acoustical Society of America, 1967, 41(2): 293-309. doi: 10.1121/1.1910339 pmid: 6040805
8 Cappe O. Elimination of the musical noise phenomenon with the Ephraim and Malah noise suppressor. IEEE Transactions on Speech and Audio Processing, 1994, 2(2): 345-349. doi: 10.1109/89.279283
9 Garofolo J S, Lamel L F, Fisher W M, Fiscus J G, Pallett D S, Dahlgren N L, Zue V. DARPA TIMIT Acoustic-phonetic continuous speech corpus. NIST Speech Disc1-1.1, 1993
10 Varga A, Steeneken H J M, Tomlinson M, Jones D. The NOISEX-92 study on the effect of additive noise on automatic speech recognition. The NOISEX-92 CD-ROMs, 1992
11 Deller J R Jr, Hansen J H L, Proakis J G. Discrete-Time Processing of Speech Signals. 2nd ed. New York: IEEE Press, 2000