[1] Silverstein C, Marais H, Henzinger M, et al. Analysis of a very large web search engine query log[J]. ACM SIGIR Forum, 1999, 33(1):6-12.
[2] Yu Huijia, Liu Yiqun, Zhang Min, et al. Research in search engine user behavior based on log analysis[J]. Journal of Chinese Information Processing, 2007, 21(1):109-114.
[3] Prieto V M, Álvarez M, Cacheda F. SAAD, a content based web spam analyzer and detector[J]. Journal of Systems and Software, 2013, 86(11):2906-2918.
[4] Castillo C, Donato D, Gionis A, et al. Know your neighbors:web spam detection using the web topology[C]//Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval-SIGIR'07. New York:ACM Press, 2007:423-430.
[5] Yu Mei, Zhang Jie, Wang Jianrong, et al. The research of spam web page detection method based on web page differentiation and concrete cluster centers[M]//Wireless Algorithms, Systems, and Applications. Cham:Springer International Publishing, 2018:820-826.
[6] Whang J, Jung Y, Dhillon I, et al. Fast asynchronous anti-trust rank for web spam detection[C]//WSDM Workshop on Misinformation and Misbehavior Mining on the Web (MIS2). Los Angeles:[s.n.], 2018:1-4.
[7] Oskuie M D, Razavi S N. A survey of web spam detection techniques[J]. International Journal of Computer Applications Technology and Research, 2014, 3(3):180-185.
[8] Goh K L, Singh A K. Comprehensive literature review on machine learning structures for web spam classification[J]. Procedia Computer Science, 2015, 70:434-441.
[9] Lingala T, Saritha G. Towards evaluating web spam threats and countermeasures[J]. Intl J Innov Adv Comput Sci, 2018, 7(3):71-80.
[10] Wan Jing, Liu Mufan, Yi Junkai, et al. Detecting spam webpages through topic and semantics analysis[C]//2015 Global Summit on Computer & Information Technology (GSCIT). New York:IEEE Press, 2015:1-7.
[11] Mamun M S I, Rathore M A, Lashkari A H, et al. Detecting malicious URLs using lexical analysis[M]//Network and System Security. Cham:Springer International Publishing, 2016:467-482.
[12] Singh T, Kumari M, Mahajan S. Feature oriented fuzzy logic based web spam detection[J]. Journal of Information and Optimization Sciences, 2017, 38(6):999-1015.
[13] Fdez-Glez J, Ruano-Ordas D, Méndez J R, et al. A dynamic model for integrating simple web spam classification techniques[J]. Expert Systems with Applications, 2015, 42(21):7969-7978.
[14] Silva R M, Almeida T A, Yamakami A. Towards web spam filtering using a classifier based on the minimum description length principle[C]//2016 15th IEEE International Conference on Machine Learning and Applications (ICMLA). New York:IEEE Press, 2016:470-475.
[15] Li Yuancheng, Nie Xiangqian, Huang Rong. Web spam classification method based on deep belief networks[J]. Expert Systems with Applications, 2018, 96:261-270.
[16] Barandela R, Valdovinos R M, Sánchez J S, et al. The imbalanced training sample problem:under or over sampling?[M]//Lecture Notes in Computer Science. Berlin, Heidelberg:Springer Berlin Heidelberg, 2004:806-814.
[17] Chawla N V, Bowyer K W, Hall L O, et al. SMOTE:synthetic minority over-sampling technique[J]. Journal of Artificial Intelligence Research, 2002, 16:321-357.
[18] Guo Haixiang, Li Yijing, Shang J, et al. Learning from class-imbalanced data:review of methods and applications[J]. Expert Systems with Applications, 2017, 73:220-239.
[19] Fawcett T. An introduction to ROC analysis[J]. Pattern Recognition Letters, 2006, 27(8):861-874.
[20] Maulik U, Bandyopadhyay S. Genetic algorithm-based clustering technique[J]. Pattern Recognition, 2000, 33(9):1455-1465.
[21] Chen Tianqi, Guestrin C. XGBoost:a scalable tree boosting system[C]//Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining-KDD'16. New York:ACM Press, 2016:785-794.
[22] Castillo C, Donato D, Becchetti L, et al. A reference collection for web spam[J]. ACM SIGIR Forum, 2006, 40(2):11-24.
[23] Singh S, Singh A K. Web-spam features selection using CFS-PSO[J]. Procedia Computer Science, 2018, 125:568-575.
[24] Scarselli F, Tsoi A C, Hagenbuchner M, et al. Solving graph data issues using a layered architecture approach with applications to web spam detection[J]. Neural Networks, 2013, 48:78-90.