Journal of Beijing University of Posts and Telecommunications

  • EI core journal

Journal of Beijing University of Posts and Telecommunications ›› 2017, Vol. 40 ›› Issue (s1): 98-102. doi: 10.13190/j.jbupt.2017.s.022

• Papers

面向大规模嵌入式设备固件的自动化分析方法

王猛涛1,2, 刘中金3, 常青1,2, 陈昱1,2, 石志强1,2, 孙利民1,2   

  1. 中国科学院 信息工程研究所, 北京 100093;
    2. 中国科学院大学 网络空间安全学院, 北京 100049;
    3. 国家计算机网络应急技术处理协调中心, 北京 100029
  • 收稿日期:2016-05-29 出版日期:2017-09-28 发布日期:2017-09-28
  • About the authors: WANG Meng-tao (born 1989), male, master's student, E-mail: wangmengtao@iie.ac.cn; SHI Zhi-qiang (born 1970), male, Ph.D., professor-level senior engineer.
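  • Supported by:
    National Natural Science Foundation of China (U1636120); National Key Research and Development Program of China (2016YFB0800202); General Program of the National Defense Science and Technology Innovation Fund, Chinese Academy of Sciences (CXJJ-16M118); Key Research Project of the Ministry of Industry and Information Technology (JCKY2016602B001); Key Project of the Beijing Municipal Science and Technology Commission (Z161100002616032)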

An Automated Analysis Method for Large-Scale Embedded Device Firmware

WANG Meng-tao1,2, LIU Zhong-jin3, CHANG Qing1,2, CHEN Yu1,2, SHI Zhi-qiang1,2, SUN Li-min1,2   

  1. Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093, China;
    2. School of Cyber Security, University of Chinese Academy of Sciences, Beijing 100049, China;
    3. National Computer Network Emergency Response Technical Team/Coordination Center of China, Beijing 100029, China
  • Received:2016-05-29 Online:2017-09-28 Published:2017-09-28

摘要: 设计了一种面向大规模嵌入式设备固件的自动化分析方法,该方法能够对固件进行自动化分析,提取其文件系统、操作系统、中央处理器指令架构等关键信息.针对固件解码成功的自动化判定难题,提出了一种基于分类回归树的固件解码状态检测算法,并选取收集的6 160个固件和固件自动化解码后得到的1 823个可反汇编二进制文件作为样本进行实验.实验结果表明,该算法相对其他分类器具有更好的分类效果,其分类准确率、召回率均在96%以上.

关键词: 嵌入式设备固件, 分类回归树, 状态检测

Abstract: An automated analysis method for large-scale embedded device firmware is presented. It extracts key information from each firmware image, such as the file system, operating system, and CPU instruction set architecture. Because it is difficult to determine automatically whether a firmware image has been decoded successfully, a firmware decoding status detection algorithm based on classification and regression trees (CART) is proposed. The experiments used 6 160 collected firmware images and 1 823 disassemblable binary files obtained from automated firmware decoding as samples. The results show that the proposed algorithm classifies better than the other classifiers compared, with precision and recall both above 96%.
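To make the idea of CART-based decoding status detection concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not say which features the paper uses, so the two features here (byte entropy and printable-byte fraction), the toy samples, and all function names are assumptions chosen only for illustration. The tree is grown by picking, at each node, the threshold split with the lowest weighted Gini impurity, which is the standard CART criterion for classification.

```python
import math
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


def features(data: bytes):
    """Hypothetical feature vector: [entropy, fraction of printable bytes]."""
    printable = sum(1 for b in data if 32 <= b < 127 or b in (9, 10, 13))
    return [byte_entropy(data), printable / len(data)]


def gini(labels):
    """Gini impurity of a list of class labels."""
    if not labels:
        return 0.0
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def best_split(rows, labels):
    """Return (feature index, threshold, weighted Gini) of the best split, or None."""
    best = None
    for f in range(len(rows[0])):
        values = sorted({r[f] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # candidate threshold between adjacent values
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[2]:
                best = (f, t, score)
    return best


def build_tree(rows, labels, depth=0, max_depth=3):
    """Grow a CART-style tree; a leaf is the majority label of its samples."""
    can_split = len(set(labels)) > 1 and depth < max_depth
    split = best_split(rows, labels) if can_split else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    f, t, _ = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
            build_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth))


def predict(tree, row):
    """Walk from the root to a leaf and return its label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree


def precision_recall(y_true, y_pred, positive=1):
    """Precision and recall for the class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    return (tp / (tp + fp) if tp + fp else 0.0,
            tp / (tp + fn) if tp + fn else 0.0)


# Toy data (made up): label 1 = looks decoded, label 0 = still packed or encrypted.
decoded = [b"root:x:0:0:root:/root:/bin/sh\n" * 20,
           b"#!/bin/sh\necho start\n" * 30,
           b"\x7fELF" + b"\x00" * 200 + b"/lib/ld-uClibc.so.0"]
undecoded = [bytes(range(256)) * 4,
             bytes((i * 7 + 3) % 256 for i in range(1024)),
             bytes((i * 13 + 5) % 256 for i in range(2048))]
X = [features(d) for d in decoded + undecoded]
y = [1] * len(decoded) + [0] * len(undecoded)
tree = build_tree(X, y)
preds = [predict(tree, row) for row in X]
print(precision_recall(y, preds))
```

On real firmware the feature set, tree depth, and pruning would need to be chosen and validated on held-out samples; the figures reported in the abstract come from the authors' own dataset and features, not from a toy like this.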

Key words: embedded device firmware, classification and regression tree, status detection
