JOURNAL OF BEIJING UNIVERSITY OF POSTS AND TELECOM ›› 2018, Vol. 41 ›› Issue (1): 1-12.doi: 10.13190/j.jbupt.2017-150
• Review •
The Key Techniques and Future Vision of Feature Selection in Machine Learning
CUI Hong-yan1,2,3, XU Shuai1,2,3, ZHANG Li-feng1,2,3, Roy E. Welsch4, Berthold K. P. Horn5
1. State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China;
2. Key Laboratory of Network System Architecture and Convergence, Beijing University of Posts and Telecommunications, Beijing 100876, China;
3. Beijing Laboratory of Advanced Information Networks, Beijing 100876, China;
4. Sloan School of Management, Massachusetts Institute of Technology, Cambridge, MA 02139, USA;
5. Computer Science and Artificial Intelligence Laboratory (CSAIL), Massachusetts Institute of Technology, Cambridge, MA 02139, USA
Received: 2017-07-20; Online: 2018-02-28; Published: 2018-01-04
Cite this article
CUI Hong-yan, XU Shuai, ZHANG Li-feng, Roy E. Welsch, Berthold K. P. Horn. The Key Techniques and Future Vision of Feature Selection in Machine Learning[J]. JOURNAL OF BEIJING UNIVERSITY OF POSTS AND TELECOM, 2018, 41(1): 1-12.
URL: https://journal.bupt.edu.cn/EN/10.13190/j.jbupt.2017-150