[1] Li Haizhou, Ma Bin, Lee K A.Spoken language recognition:from fundamentals to practice[J].Proceedings of the IEEE, 2013, 101(5):1136-1159. [2] 张卫强, 刘加.基于听感知特征的语种识别[J].清华大学学报(自然科学版), 2009, 49(1):78-81.Zhang Weiqiang, Liu Jia.Language recognition based on auditory perception characteristics[J].Journal of Tsinghua University (Natural Science Edition), 2009, 49(1):78-81. [3] Zissman M A.Comparison of four approaches to automa-tic language identification of telephone speech[J].IEEE Transactions on Speech and Audio Processing, 1996, 4(1):31. [4] Montavon G.Deep learning for spoken language identification[C]//2009 NIPS Workshop on Deep Learning for Speech Recognition and Related Applications.Vancouver:NIPS Foundation, 2009:1-4. [5] Jiang Bing, Song Yan, Wei Si, et al.Performance evaluation of deep bottleneck features for spoken language identification[C]//The 9th International Symposium on Chinese Spoken Language Processing.Singapore:IEEE, 2014:143-147. [6] Lopez-Moreno I, Gonzalez-Dominguez J, Plchot O, et al.Automatic language identification using deep neural networks[C]//2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).Florence:IEEE Press, 2014:5337-5341. [7] Geng Wang, Wang Wenfu, Zhao Yuanyuan, et al.End-to-end language identification using attention-based recurrent neural networks[C]//Interspeech 2016.San Francisco:ISCA, 2016:2944-2948. [8] Jin Ma, Song Yan, Mcloughlin I, et al.LID-senones and their statistics for language identification[J].IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2018, 26(1):171-183. [9] Cai Weicheng, Cai Zexin, Liu Wenbo, et al.Insights into end-to-end learning scheme for language identification[J].IEEE Signal Processing Society Sigport, 2018, 28(2):202-210. [10] Deshwal D, Sangwan P, Kumar D.Feature extraction methods in language identification:a survey[J].Wireless Personal Communications, 2019, 107(4):2071-2103. [11] Li Lin, Li Zheng, Liu Yan, et al.Deep joint learning for language recognition[J].Neural Networks:the Official Journal of the International Neural Network Society, 2021, 141(9):72-86. [12] Bhanja C C, Laskar M A, Laskar R H.Modelling multi-level prosody and spectral features using deep neural network for an automatic tonal and non-tonal pre-classification-based Indian language identification system[J].Language Resources and Evaluation, 2021, 55(3):689-730. [13] He Kaiming, Zhang Xiangyu, Ren Shaoqing, et al.Deep residual learning for image recognition[C]//2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).Las Vegas:IEEE Press, 2016:770-778. [14] 刘威.单通道语音水印与语音增强算法研究[D].南京:东南大学, 2017. [15] Franzoni V, Biondi G, Milani A.Emotional sounds of crowds:spectrogram-based analysis using deep learning[J].Multimedia Tools and Applications, 2020, 79(47/48):36063-36075. [16] 蓝雯飞, 汪敦志, 张盛兰.一种新的降维算法PCA_LLE在图像识别中的应用[J].中南民族大学学报(自然科学版), 2020, 39(1):85-90.Lan Wenfei, Wang Dunzhi, Zhang Shenglan.Application of a new dimensionality reduction algorithm PCA_LLE in image recognition[J].Journal of South-Central University for Nationalities (Natural Science Edition), 2020, 39(1):85-90. [17] Qaraei M, Abbaasi S, Ghiasi-Shirazi K.Randomized non-linear PCA networks-science direct[J].Information Sciences, 2021, 545:241-253. [18] Zhu Dong, Huang Ming, Yang Jingjing, et al.Identification of spoken language from webcast using deep con-volutional recurrent neural networks[C]//2019 International Conference on Information Technology.Sanya:Electrical and Electronic Engineering (ITEEE 2019), 2019:1147-1152. |