Journal of Beijing University of Posts and Telecommunications

  • EI core journal

Journal of Beijing University of Posts and Telecommunications ›› 2024, Vol. 47 ›› Issue (2): 24-29.

• Papers •

A Random Beam Search Text Attack Black Box Algorithm

Wang Xiaomeng1, Zhang Hua2, Ding Jinkou3, Wang Jiahui1

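    1. Beijing University of Posts and Telecommunications
    2. State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications
    3. School of Science, Beijing University of Posts and Telecommunications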
  • Received: 2023-05-18 Revised: 2023-07-10 Published: 2024-04-28 Online: 2024-01-24
  • Corresponding author: Ding Jinkou E-mail: djk@bupt.edu.cn
  • Supported by:
    National Natural Science Foundation of China


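Summary: To address the tendency of existing adversarial example generation algorithms to become trapped in local optima, an algorithm named R-attack is proposed that raises the attack success rate through beam search and random elements. Beam search is used to look for the optimal solution in the synonym space, which increases the diversity of adversarial examples and in turn improves attack efficiency. At the same time, random elements are introduced into the iterative search to avoid premature convergence to a local optimum, which effectively improves the attack success rate. Adversarial attack experiments were carried out on 2 models across 3 datasets, and the results show that R-attack effectively improves the attack success rate of adversarial examples. For example, for a bidirectional long short-term memory network model trained on the Yahoo! Answers dataset, the attack success rate of R-attack is 2.4% higher than the baseline.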

Keywords: adversarial attack algorithm, natural language processing, black-box attack
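
The search strategy summarized above can be illustrated with a minimal Python sketch. It is not the published R-attack implementation: the function name `random_beam_attack`, the parameters `beam_width`, `random_slots`, and `max_steps`, the `predict_proba` callback, and the `synonyms` dictionary are all hypothetical, and the scoring rule (lowest probability of the true label) and the way random candidates are mixed into the beam are assumptions made for illustration.

```python
import random
from typing import Callable, Dict, List, Optional, Sequence, Tuple


def random_beam_attack(
    tokens: Sequence[str],
    true_label: int,
    predict_proba: Callable[[List[str]], Sequence[float]],
    synonyms: Dict[str, List[str]],
    beam_width: int = 3,
    random_slots: int = 1,
    max_steps: int = 5,
    seed: int = 0,
) -> Optional[List[str]]:
    """Return an adversarial token list, or None if the search gives up.

    Illustrative only: the scoring rule and the random-slot mechanism are
    assumptions, not details taken from the paper.
    """
    if beam_width < 1:
        raise ValueError("beam_width must be at least 1")
    rng = random.Random(seed)
    # Always keep at least one greedy slot in the beam.
    n_random = max(0, min(random_slots, beam_width - 1))
    n_greedy = beam_width - n_random
    beam: List[List[str]] = [list(tokens)]
    seen = {tuple(tokens)}

    for _ in range(max_steps):
        scored: List[Tuple[float, List[str]]] = []
        for cand in beam:
            # Substitutes are always synonyms of the ORIGINAL word at position i.
            for i, original_word in enumerate(tokens):
                for sub in synonyms.get(original_word, []):
                    new = cand[:i] + [sub] + cand[i + 1:]
                    if tuple(new) in seen:
                        continue
                    seen.add(tuple(new))
                    probs = predict_proba(new)  # one black-box query
                    predicted = max(range(len(probs)), key=lambda k: probs[k])
                    if predicted != true_label:
                        return new  # the prediction flipped
                    scored.append((probs[true_label], new))
        if not scored:
            return None  # no unexplored substitutions remain

        # Greedy slots: candidates with the lowest true-label probability.
        scored.sort(key=lambda pair: pair[0])
        beam = [cand for _, cand in scored[:n_greedy]]
        # Random slots: candidates drawn from the rest, to escape local optima.
        rest = [cand for _, cand in scored[n_greedy:]]
        beam += rng.sample(rest, min(n_random, len(rest)))
    return None
```

In this sketch, setting `random_slots=0` reduces the procedure to plain beam search, which makes it easy to compare the two variants on the same model and synonym dictionary.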

Abstract: Research on adversarial attacks in the black-box setting aims to generate adversarial examples with a higher attack success rate, which is important for studying the vulnerability of deep learning models for natural language processing and for improving their robustness. Because existing adversarial text generation algorithms tend to fall into local optima, this paper proposes a method that improves the attack success rate by combining random elements with beam search. The method uses beam search to increase the diversity of adversarial examples and adds random elements to the iterative search for adversarial examples, so that the synonym space is used more fully in searching for the optimal solution, the search is less easily trapped in local optima, and the attack success rate improves. Experiments show that the proposed algorithm, R-attack, effectively improves the attack success rate of adversarial examples.

Key words: Adversarial attack, Natural language processing, Black box attacks
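
The abstract reports its results as attack success rate but does not define the metric here. The sketch below assumes one commonly used definition: among inputs the model classifies correctly before the attack, the fraction whose prediction is wrong after the attack. The function name `attack_success_rate` and its arguments are hypothetical, and the authors may use a different definition.

```python
from typing import Sequence


def attack_success_rate(
    labels: Sequence[int],
    clean_preds: Sequence[int],
    adv_preds: Sequence[int],
) -> float:
    """Fraction of correctly classified inputs whose prediction the attack flips.

    Assumed definition, not taken from the paper.
    """
    if not (len(labels) == len(clean_preds) == len(adv_preds)):
        raise ValueError("all inputs must have the same length")
    # Only inputs the model got right before the attack count as attack targets.
    attacked = [(y, a) for y, c, a in zip(labels, clean_preds, adv_preds) if c == y]
    if not attacked:
        return 0.0
    flipped = sum(1 for y, a in attacked if a != y)
    return flipped / len(attacked)
```

Under this definition, inputs that the model already misclassifies are excluded from the denominator, so the metric reflects only changes caused by the attack.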

CLC Number: