Journal of Beijing University of Posts and Telecommunications

  • EI Core Journal

Journal of Beijing University of Posts and Telecommunications ›› 2023, Vol. 46 ›› Issue (2): 122-128.

Language Identification Based on Gammatone-Scale Power-Normalized Coefficients Spectrograms

1,yubin yubinshao1, 1, 1,Da-Chun ZHOU2   

  1. 1.
    2. Kunming University of Science and Technology
  • Received:2022-03-31 Revised:2022-06-23 Online:2023-04-28 Published:2023-05-14
  • Contact: yubin yubinshao E-mail:shaoyubin999@QQ.com

Abstract: To address the low accuracy of language identification in noisy environments, a language identification method based on Gammatone-scale power-normalized coefficient spectrograms is proposed. Coefficients are extracted as features by combining noise suppression in the power domain with the auditory characteristics of Gammatone filter banks, and are then converted into images as spectrograms. The dark channel prior algorithm and an automatic color scale algorithm are then applied to enhance and denoise the images. Finally, a residual neural network is used for training and identification. Experiments show that, at a signal-to-noise ratio of 0 dB, the identification rate of the proposed method improves on that of linear gray-scale spectrograms by 39.1%, 12.3%, 19.0%, 5.5%, 28.2%, and 28.5% for white noise, Volvo noise, pink noise, HF channel noise, babble noise, and factory floor noise, respectively. The identification rate also improves at other signal-to-noise ratios.
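
The image-enhancement stage named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it computes only the dark channel (the quantity on which the dark channel prior is built, not a full dehazing step) and a simple linear "auto levels" stretch, which is one common reading of an automatic color scale algorithm. The function names `dark_channel` and `auto_levels`, the patch size, and the clipping fraction are assumptions chosen for illustration.

```python
def dark_channel(img, patch=3):
    """Dark channel of an image given as H x W nested lists of (r, g, b) in [0, 1].

    Hypothetical helper: per-pixel minimum over the colour channels,
    followed by a minimum filter over a patch x patch neighbourhood.
    """
    h, w = len(img), len(img[0])
    r = patch // 2
    # Minimum over the three colour channels at every pixel.
    min_rgb = [[min(px) for px in row] for row in img]
    dark = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ys = range(max(0, y - r), min(h, y + r + 1))
            xs = range(max(0, x - r), min(w, x + r + 1))
            # Minimum over the local patch, truncated at the image border.
            dark[y][x] = min(min_rgb[j][i] for j in ys for i in xs)
    return dark


def auto_levels(channel, clip=0.0):
    """Linearly stretch one channel (H x W nested lists) to [0, 1].

    Hypothetical helper: `clip` is the fraction of pixels ignored at each
    end of the sorted intensities before the range is stretched.
    """
    flat = sorted(v for row in channel for v in row)
    n = len(flat)
    k = int(clip * n)
    lo, hi = flat[k], flat[n - 1 - k]
    if hi <= lo:
        # Flat channel: nothing to stretch, return a copy.
        return [row[:] for row in channel]
    return [[min(1.0, max(0.0, (v - lo) / (hi - lo))) for v in row]
            for row in channel]
```

In a pipeline of the kind the abstract describes, the stretch would be applied to each colour channel of the spectrogram image separately, and the dark channel would feed a later estimate of how much noise "haze" to remove; both of those steps are left out of this sketch.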

Key words: language identification, auditory features, power-normalized, residual neural network
