Textual Adversarial Attacking with Limited Queries

Recent studies have shown that natural language processing (NLP) models are vulnerable to adversarial examples: inputs maliciously crafted by adding small perturbations, imperceptible to humans, that lead the target model to false predictions. Compared with character- and sentence-level textual adversarial attacks, word-level attacks can generate higher-quality adversarial examples, especially in a black-box setting. However, existing attack methods usually require a huge number of queries to successfully deceive the target model, which is costly in a real adversarial scenario and makes such attacks difficult to mount in practice. We therefore propose a novel attack method whose main idea is to fully exploit the adversarial examples generated by a local model: part of the attack is transferred to the local model and completed ahead of time, thereby reducing the cost of attacking the target model. Extensive experiments on three public benchmarks show that our attack method not only improves the success rate but also reduces the cost, outperforming the baselines by a significant margin.
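
The abstract does not spell out the attack procedure, but its core idea, spending free evaluations of a local surrogate model before paying for queries to the black-box target, can be illustrated with a minimal Python sketch of a greedy word-substitution attack. Everything below is a hypothetical stand-in rather than the authors' implementation: local_prob(tokens, label) returns the surrogate's probability for the true label (free to evaluate, since the model is local), target_label(tokens) returns the victim's predicted label (one paid query per call), and synonyms(word) yields candidate substitutes, e.g. from a WordNet or embedding-based list.

    def attack(tokens, label, local_prob, target_label, synonyms, budget=50):
        adv = list(tokens)
        queries = 0

        # Stage 1 (free): greedy descent on the local surrogate. At each
        # position, keep the substitute that most reduces the surrogate's
        # confidence in the true label; stop early once it is fooled.
        for i, word in enumerate(tokens):
            best, best_p = word, local_prob(adv, label)
            for cand in synonyms(word):
                p = local_prob(adv[:i] + [cand] + adv[i + 1:], label)
                if p < best_p:
                    best, best_p = cand, p
            adv[i] = best
            if best_p < 0.5:
                break

        # Stage 2 (paid): a single query to test whether the locally
        # crafted example transfers to the black-box target.
        queries += 1
        if target_label(adv) != label:
            return adv, queries

        # Stage 3 (paid): if it did not transfer, continue substituting
        # the untouched positions against the target model itself,
        # within the remaining query budget.
        for i, word in enumerate(tokens):
            if adv[i] != word:
                continue  # already perturbed in stage 1
            for cand in synonyms(word):
                if queries >= budget:
                    return None, queries
                trial = adv[:i] + [cand] + adv[i + 1:]
                queries += 1
                if target_label(trial) != label:
                    return trial, queries
        return None, queries

Under this division of labor, a transferable example costs a single verification query, and the paid search in stage 3 starts from an input the surrogate already finds hard to classify, which is how a pre-computed local attack can cut query costs without sacrificing success rate.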

Bibliographic Details
Main Authors: Yu Zhang, Junan Yang, Xiaoshuai Li, Hui Liu, Kun Shao
Format: Article
Language: English
Published: MDPI AG, 2021
Published in: Electronics, Vol. 10, Iss. 21, Art. 2671 (2021)
DOI: 10.3390/electronics10212671
ISSN: 2079-9292
Subjects: machine learning; adversarial attack; natural language processing (NLP); black box
Online Access: https://doaj.org/article/89ac0f34923e4dbbb9b19901a365a476
Full Text: https://www.mdpi.com/2079-9292/10/21/2671