Multi-Targeted Adversarial Example in Evasion Attack on Deep Neural Network
Main authors:
Format: article
Language: EN
Published: IEEE, 2018
Subjects:
Online access: https://doaj.org/article/d847cd17c9f642d58113ec58df1a3762
Summary: Deep neural networks (DNNs) are widely used for image recognition, speech recognition, pattern analysis, and intrusion detection. Recently, the adversarial example attack, in which the input data are modified only slightly in a way that is imperceptible to humans yet causes the machine to misinterpret the data, has emerged as a serious threat to DNNs. The adversarial example attack has received considerable attention owing to its potential threat to machine learning. It is divided into two categories: the targeted adversarial example and the untargeted adversarial example. An untargeted adversarial example causes a machine to misclassify an object into any incorrect class. In contrast, a targeted adversarial example causes the machine to misclassify the image as a class chosen by the attacker; the latter is therefore a more elaborate and powerful attack than the former. Existing targeted adversarial examples are single-targeted attacks, in which only one target class can be induced. However, in some cases a multi-targeted adversarial example can be useful to an attacker who wants multiple models to recognize a single original image as different classes. For example, an attacker can use a single road sign generated by a multi-targeted adversarial example scheme to make model A recognize it as a stop sign and model B recognize it as a left turn, whereas a human would recognize it as a right turn. Therefore, in this paper, we propose a multi-targeted adversarial example that attacks multiple models, each with its own target class, using a single modified image. To produce such examples, we apply a transformation that maximizes the probability of a different target class in each model. We used the MNIST dataset and the TensorFlow library for our experiments. The experimental results show that the proposed scheme for generating multi-targeted adversarial examples achieves a 100% attack success rate.
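As a rough illustration of the idea described in the abstract, the sketch below uses TensorFlow to perturb a single MNIST image so that two classifiers each assign it a different attacker-chosen target class. The models `model_a` and `model_b`, the loss formulation (per-model target cross-entropy plus an L2 distortion penalty), and all hyperparameters are illustrative assumptions, not the paper's exact method.

```python
import tensorflow as tf

def multi_targeted_example(x, model_a, model_b, target_a, target_b,
                           steps=200, lr=0.05, dist_weight=0.1):
    """Hedged sketch: find one perturbed image that model_a classifies as
    target_a and model_b classifies as target_b, while keeping the
    perturbation small. x is a batch of one image in [0, 1], e.g. shape
    (1, 28, 28, 1) for MNIST; the models are assumed to output softmax
    probabilities over 10 classes."""
    x = tf.convert_to_tensor(x, dtype=tf.float32)
    delta = tf.Variable(tf.zeros_like(x))            # perturbation to optimize
    opt = tf.keras.optimizers.Adam(learning_rate=lr)
    ce = tf.keras.losses.SparseCategoricalCrossentropy()

    for _ in range(steps):
        with tf.GradientTape() as tape:
            x_adv = tf.clip_by_value(x + delta, 0.0, 1.0)   # keep valid pixel range
            # Push each model toward its own target class
            loss_a = ce([target_a], model_a(x_adv))
            loss_b = ce([target_b], model_b(x_adv))
            # Penalize large perturbations so the image stays close to the original
            distortion = tf.reduce_sum(tf.square(delta))
            loss = loss_a + loss_b + dist_weight * distortion
        grads = tape.gradient(loss, [delta])
        opt.apply_gradients(zip(grads, [delta]))

    return tf.clip_by_value(x + delta, 0.0, 1.0)
```

In this sketch, minimizing the two cross-entropy terms jointly maximizes the probability of a different target class under each model, which corresponds to the transformation the abstract describes; the distortion weight trades off attack strength against visual similarity to the original image.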