Journal of Beijing University of Posts and Telecommunications

  • EI Core Journal

Journal of Beijing University of Posts and Telecommunications ›› 2016, Vol. 39 ›› Issue (4): 98-102. doi: 10.13190/j.jbupt.2016.04.019

• Research Reports •

Multi-Level Detection of Vocal Effort Based on Vowel Template Matching

CHAO Hao, SONG Cheng, LIU Zhi-zhong

  1. College of Computer Science and Technology, Henan Polytechnic University, Jiaozuo, Henan 454000, China
  • Received: 2015-05-28 Online: 2016-08-28 Published: 2016-01-29
  • About the author: CHAO Hao (born 1981), male, Ph.D., lecturer, E-mail: chaohao1981@163.com.
  • Supported by:
    National Natural Science Foundation of China (61300124, 61403128, 61502150); Basic and Frontier Technology Research Program of Henan Province (132300410332)

Abstract: A multi-level detection method is proposed for identifying vocal effort modes in robust speech recognition. First, Gaussian mixture models (GMMs) trained on global spectral features decide whether a speech signal is whispered. For non-whispered signals, the vowel segments are then located by acoustic landmark detection, and the specific vocal effort mode is determined by vowel template matching. Vocal effort detection experiments on the 863-test set show that, apart from a slight drop for the whisper mode, recognition accuracy improves substantially for the other four vocal effort modes. The results demonstrate that combining global features of the speech signal with local vowel features is effective for vocal effort detection.

Key words: speech recognition, vocal effort, vowel, template matching, Gaussian mixture model
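
The two-level decision flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the diagonal-covariance GMM scoring, the Euclidean distance used for template matching, the mode names other than "whisper" (`soft`, `normal`, `loud`, `shouted`), and every function and parameter name below are assumptions made for illustration. Landmark detection is not shown; the vowel segments are assumed to be already extracted and labelled.

```python
import math


def gmm_log_likelihood(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM."""
    total = 0.0
    for x in frames:
        comp_logs = []
        for w, mu, var in zip(weights, means, variances):
            log_p = math.log(w)
            for xi, mi, vi in zip(x, mu, var):
                log_p -= 0.5 * (math.log(2 * math.pi * vi) + (xi - mi) ** 2 / vi)
            comp_logs.append(log_p)
        # log-sum-exp over mixture components for numerical stability
        m = max(comp_logs)
        total += m + math.log(sum(math.exp(c - m) for c in comp_logs))
    return total / len(frames)


def is_whisper(frames, whisper_gmm, other_gmm):
    """Level 1: compare a whisper GMM against a non-whisper GMM (assumed setup)."""
    return gmm_log_likelihood(frames, *whisper_gmm) > gmm_log_likelihood(frames, *other_gmm)


def match_vowel_templates(vowel_segments, templates):
    """Level 2: pick the mode whose vowel templates are closest on average.

    vowel_segments: list of (vowel_label, feature_vector) pairs.
    templates: {mode: {vowel_label: feature_vector}} (hypothetical layout).
    """
    scores = {}
    for mode, vowel_templates in templates.items():
        dists = [math.dist(feat, vowel_templates[v])
                 for v, feat in vowel_segments if v in vowel_templates]
        if dists:
            scores[mode] = sum(dists) / len(dists)
    if not scores:
        raise ValueError("no vowel segment matched any template")
    return min(scores, key=scores.get)


def detect_vocal_effort(frames, vowel_segments, whisper_gmm, other_gmm, templates):
    """Whisper check on global features first, then vowel template matching."""
    if is_whisper(frames, whisper_gmm, other_gmm):
        return "whisper"
    return match_vowel_templates(vowel_segments, templates)
```

Each GMM is passed as a `(weights, means, variances)` tuple. Vowel features are only consulted once the global-feature test has ruled out whisper, which mirrors the ordering of the two levels in the abstract.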

CLC number: