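Journal of Beijing University of Posts and Telecommunications

  • EI core journal

Journal of Beijing University of Posts and Telecommunications ›› 2019, Vol. 42 ›› Issue (5): 62-68. doi: 10.13190/j.jbupt.2018-308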

• Papers •

An Online Cluster Anomaly Job Prediction Method

XIE Li-xia (谢丽霞), WANG Zi-ying (汪子荧)

  1. School of Computer Science and Technology, Civil Aviation University of China, Tianjin 300300, China
  • Received: 2018-12-23 Online: 2019-10-28 Published: 2019-11-25
  • About the author: XIE Li-xia (1974-), female, professor, E-mail: lxxie@126.com.
  • Supported by:
    Civil Aviation Joint Research Fund of the National Natural Science Foundation of China (U1833107); National Science and Technology Major Project (2012ZX03002002); Fundamental Research Funds for the Central Universities (ZYGX2018028)

Abstract: An online cluster anomaly job prediction method (OCAJP) is proposed. First, a method for computing the dynamic features of a job's sub-tasks is designed. Second, an improved gated recurrent unit (IGRU) neural network is designed according to these dynamic features. The IGRU then uses the dynamic features to predict, in real time, whether each sub-task's termination status will be abnormal. Finally, anomalous jobs are retrieved from the status relevance between a job and its sub-tasks, which completes the prediction of anomalous jobs. Experimental results show that OCAJP clearly improves prediction sensitivity, error rate, accuracy, and prediction time compared with other prediction methods, and that it has some applicability in protecting the security of cluster platforms.

Key words: cluster anomaly job, dynamic features, real-time prediction, gated recurrent unit, status relevance
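
The pipeline summarized in the abstract (per-sub-task feature sequence, recurrent prediction of the termination status, then job-level retrieval) can be illustrated with a minimal sketch. Several things here are assumptions, not the authors' method: the cell below is a standard GRU, since this page does not describe the IGRU modifications; the parameter names (`Wz`, `Uz`, `w_out`, and so on), the 0.5 threshold, and the rule that a job is flagged when any of its sub-tasks is predicted abnormal are all hypothetical stand-ins for the paper's dynamic features and status-relevance rule.

```python
import math
from typing import Any, Dict, List, Sequence

Vector = List[float]
Params = Dict[str, Any]  # hypothetical container for weights and biases


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def affine(w, x: Sequence[float], u, h: Sequence[float], b) -> Vector:
    """Compute W x + U h + b, one output row at a time."""
    return [
        sum(wi * xi for wi, xi in zip(w_row, x))
        + sum(ui * hi for ui, hi in zip(u_row, h))
        + bi
        for w_row, u_row, bi in zip(w, u, b)
    ]


def gru_step(x: Sequence[float], h: Sequence[float], p: Params) -> Vector:
    """One step of a standard GRU cell (assumption: not the paper's IGRU)."""
    z = [sigmoid(v) for v in affine(p["Wz"], x, p["Uz"], h, p["bz"])]  # update gate
    r = [sigmoid(v) for v in affine(p["Wr"], x, p["Ur"], h, p["br"])]  # reset gate
    reset_h = [ri * hi for ri, hi in zip(r, h)]
    cand = [math.tanh(v) for v in affine(p["Wh"], x, p["Uh"], reset_h, p["bh"])]
    # Convention used here: h_t = (1 - z) * h_{t-1} + z * candidate.
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def task_abnormal_probability(features: Sequence[Sequence[float]], p: Params) -> float:
    """Run the cell over a sub-task's feature sequence and score the last state."""
    h = [0.0] * len(p["bz"])
    for x in features:
        h = gru_step(x, h, p)
    logit = sum(w * hi for w, hi in zip(p["w_out"], h)) + p["b_out"]
    return sigmoid(logit)


def abnormal_jobs(task_probs: Dict[str, Dict[str, float]], threshold: float = 0.5) -> List[str]:
    """Flag a job if any sub-task is predicted abnormal (assumed aggregation rule)."""
    return sorted(
        job
        for job, tasks in task_probs.items()
        if any(prob >= threshold for prob in tasks.values())
    )
```

In an online setting, `task_abnormal_probability` would be re-evaluated as new feature vectors arrive for a running sub-task, and `abnormal_jobs` would be applied to the latest per-sub-task scores. The weights would have to be learned from labelled cluster traces, which this sketch does not cover.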

CLC Number: