Journal of Beijing University of Posts and Telecommunications

  • EI-indexed core journal

Journal of Beijing University of Posts and Telecommunications, 2023, Vol. 46, Issue (1): 38-43.


Language Recognition Based on Log Gammatone-Scale Filter Bank Energies Spectrograms

ZHANG Haoge, SHAO Yubin, LONG Hua, PENG Yi, ZHOU Dachun   

  1.  Kunming University of Science and Technology
  • Received: 2021-12-29  Revised: 2022-02-21  Online: 2023-02-28  Published: 2023-02-22
  • Contact: SHAO Yubin  E-mail: shaoyubin999@qq.com

Abstract: To address the low recognition rate of language recognition in noisy environments, a language recognition method based on log Gammatone-scale filter bank energies is proposed. First, log Gammatone-scale filter bank energy features are extracted, exploiting the auditory characteristics of the Gammatone filter bank, and the features are converted into images to obtain feature spectrograms. Then, the dark channel prior is applied to enhance and denoise the images. Finally, a residual neural network model is used for training and recognition. Experimental results show that at a signal-to-noise ratio of 0 dB, with white noise, Volvo noise, and pink noise as the noise sources, the recognition rate of the proposed method improves by 32.7%, 10.1%, and 29.1%, respectively, compared with the linear gray-scale spectrogram; the recognition rate at other signal-to-noise ratios also improves.
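The feature extraction step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no parameters, so the frame length, hop size, FFT size, number of filters, lower band edge, and filter order below are all assumptions, as are the function names. The sketch also assumes a common frequency-domain approximation of the Gammatone magnitude response with center frequencies spaced uniformly on the ERB-rate scale (Glasberg and Moore), and a simple min-max mapping to an 8-bit gray-scale image. The dark channel prior enhancement and the residual network are not covered.

```python
import numpy as np

# All names and default parameters here are illustrative assumptions,
# not values taken from the paper.

def hz_to_erb_rate(f):
    # ERB-rate scale (Glasberg & Moore).
    return 21.4 * np.log10(1.0 + 0.00437 * f)

def erb_rate_to_hz(e):
    # Inverse of hz_to_erb_rate.
    return (10.0 ** (e / 21.4) - 1.0) / 0.00437

def gammatone_weights(sr, n_fft, n_filters=64, f_min=50.0, order=4):
    # Center frequencies equally spaced on the ERB-rate scale up to Nyquist.
    fc = erb_rate_to_hz(
        np.linspace(hz_to_erb_rate(f_min), hz_to_erb_rate(sr / 2.0), n_filters)
    )
    erb = 24.7 * (4.37 * fc / 1000.0 + 1.0)   # equivalent rectangular bandwidth
    b = 1.019 * erb                           # Gammatone bandwidth parameter
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    # Approximate power response of an order-n Gammatone filter:
    # |H(f)|^2 ~ [1 + ((f - fc) / b)^2]^(-n)
    return (1.0 + ((freqs[None, :] - fc[:, None]) / b[:, None]) ** 2) ** (-order)

def log_gammatone_energies(signal, sr, frame_len=400, hop=160,
                           n_fft=512, n_filters=64):
    # Split the signal into overlapping Hamming-windowed frames.
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    # Power spectrum per frame, then weighted sum per Gammatone channel.
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2
    energies = power @ gammatone_weights(sr, n_fft, n_filters).T
    return np.log(energies + 1e-10)           # shape: (n_frames, n_filters)

def to_gray_image(features):
    # Min-max scale to 0..255; rows are frequency channels, columns are frames.
    lo, hi = features.min(), features.max()
    scaled = (features - lo) / (hi - lo + 1e-12)
    return np.round(255.0 * scaled).astype(np.uint8).T

if __name__ == "__main__":
    sr = 16000
    t = np.arange(sr) / sr
    tone = np.sin(2 * np.pi * 1000.0 * t)      # 1 s test tone at 1 kHz
    image = to_gray_image(log_gammatone_energies(tone, sr))
    print(image.shape, image.dtype)
```

Applying the weights to the short-time power spectrum keeps the sketch short; a time-domain Gammatone filter bank would be an equally plausible reading of the abstract.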

Key words: language recognition, auditory features, Gammatone filters, residual neural network
