Frontiers of Computer Science

Front. Comput. Sci.    2022, Vol. 16 Issue (5) : 165325    https://doi.org/10.1007/s11704-021-0198-y
LETTER
Exploiting natural language services: a polarity based black-box attack
Fatma GUMUS1,2, M. Fatih AMASYALI1
1. Department of Computer Engineering, Yildiz Technical University, Istanbul 34220, Turkey
2. Department of Computer Engineering, Air Force Academy, National Defence University, Istanbul 34149, Turkey
Corresponding Author(s): Fatma GUMUS   
Just Accepted Date: 26 March 2021   Issue Date: 24 December 2021
 Cite this article:   
Fatma GUMUS, M. Fatih AMASYALI. Exploiting natural language services: a polarity based black-box attack[J]. Front. Comput. Sci., 2022, 16(5): 165325.
 URL:  
https://academic.hep.com.cn/fcs/EN/10.1007/s11704-021-0198-y
https://academic.hep.com.cn/fcs/EN/Y2022/V16/I5/165325
Fig.1  Example of one iteration of our perturbation routine for the sentiment task. The candidates Adv1 and Adv2 do not even lower the probability of the target class; therefore, only Adv3 could be chosen. If multiple options were available, the sample with the maximum P(?AdvN) would be selected. If there were no appropriate replacements, the iteration would end with no change made. Since θ has not yet been reached, the perturbation continues for at least one more iteration. Adv4 would only be generated if the word “worst” were within the range, in which case the goal would be reached within a single iteration
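The sketch below is a minimal illustration of the iteration described in Fig. 1, assuming a black-box model that can only be queried for class probabilities. The names model.query, candidate_replacements, and theta, as well as the rule of keeping the candidate that lowers the target-class probability the most, are assumptions made for illustration rather than the authors' implementation.

# Minimal Python sketch of one perturbation iteration, under the assumptions above.
def perturb_iteration(tokens, target_class, model, candidate_replacements, theta):
    # Target-class probability for the current, unmodified text.
    base_prob = model.query(" ".join(tokens))[target_class]
    best_tokens, best_prob = None, base_prob

    for i, word in enumerate(tokens):
        for repl in candidate_replacements.get(word, []):
            candidate = tokens[:i] + [repl] + tokens[i + 1:]
            prob = model.query(" ".join(candidate))[target_class]
            # Candidates that do not even lower the target-class probability are discarded.
            if prob < best_prob:
                best_tokens, best_prob = candidate, prob

    if best_tokens is None:
        # No appropriate replacement: the iteration ends with no change made.
        return tokens, base_prob, False
    # The attack succeeds once the target-class probability drops below the threshold theta.
    return best_tokens, best_prob, best_prob < theta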
Fig.2  Top: polarity attacks on sentiment models. pa: polarity attack with PT of the AMAZON test set, pi: polarity attack with PT of the IMDB test set, py: polarity attack with PT of the YELP test set. Bottom: polarity attacks on categorization models. pt: targeted polarity attack, pu: untargeted polarity attack