[1] Google cluster data[EB/OL]. (2010-01-10)[2018-09-10]. http://googleresearch.blogspot.com/2010/01/google-cluster-data.html.
[2] Liu Chunhong, Han Jingjing, Shang Yanlei. Predicting job failure in cloud cluster:based on SVM classification[J]. Journal of Beijing University of Posts and Telecommunications, 2016, 39(5):104-109. (in Chinese)
[3] Wang Yijie, Sun Weidong, Zhou Song, et al. Key technologies of distributed storage for cloud computing[J]. Journal of Software, 2012, 23(4):962-986. (in Chinese)
[4] Soualhia M, Khomh F, Tahar S. Predicting scheduling failures in the cloud:a case study with Google clusters and Hadoop on Amazon EMR[C]//2015 IEEE 17th International Conference on High Performance Computing and Communications. Piscataway:IEEE Press, 2015:58-65.
[5] Chen X, Lu C D, Pattabiraman K. Failure analysis of jobs in compute clouds:a Google cluster case study[C]//2014 IEEE 25th International Symposium on Software Reliability Engineering. Piscataway:IEEE Press, 2014:167-177.
[6] Jakobik A, Grzonka D, Palmieri F. Non-deterministic security driven meta scheduler for distributed cloud organizations[J]. Simulation Modelling Practice and Theory, 2017, 76(8):67-81.
[7] Sonoda M, Kikuchi S, Watanabe Y, et al. Online failure prediction in cloud datacenters by real-time message pattern learning[C]//Proceedings of the 2012 IEEE 4th International Conference on Cloud Computing Technology and Science Proceedings. Piscataway:IEEE Press, 2012:504-511.
[8] Tang Hongyan, Li Ying, Jia Tong, et al. Time series based killer task online recognition approach[J]. Computer Science, 2017, 44(4):43-46. (in Chinese)
[9] Tang H, Li Y, Jia T, et al. Hunting killer tasks for cloud system through behavior pattern learning[C]//IEEE/IFIP International Conference on Dependable Systems & Networks Workshop. Piscataway:IEEE Press, 2016:1-12.
[10] Liu C, Han J, Shang Y, et al. Predicting of job failure in compute cloud based on online extreme learning machine:a comparative study[J]. IEEE Access, 2017, 5(99):9359-9368.
[11] Garraghan P, Townend P, Xu J. An empirical failure-analysis of a large-scale cloud computing environment[C]//2014 IEEE 15th International Symposium on High-Assurance Systems Engineering. Piscataway:IEEE Press, 2014:113-120.
[12] Rosa A, Chen L Y, Binder W. Predicting and mitigating jobs failures in big data clusters[C]//2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing. Piscataway:IEEE Press, 2015:221-230.
[13] Yamnual K, Phunchongharn P, Achalakul T. Failure detection through monitoring of the scientific distributed system[C]//2017 International Conference on Applied System Innovation. Piscataway:IEEE Press, 2017:568-571.
[14] Cho K, Van Merrienboer B, Gulcehre C, et al. Learning phrase representations using RNN encoder-decoder for statistical machine translation[J]. Computer Science, 2014, 55(9):1406-1420.
[15] Lipton Z C, Berkowitz J, Elkan C. A critical review of recurrent neural networks for sequence learning[J]. Computer Science, 2015, 56(10):1506-1543.