Pusher: an augmented fuzzer based on the connection between input and comparison operand

doi:10.1007/s11704-021-0075-8

Front. Comput. Sci.

2022, Vol. 16

Issue (4) : 164206 https://doi.org/10.1007/s11704-021-0075-8

RESEARCH ARTICLE

Pusher: an augmented fuzzer based on the connection between input and comparison operand

Bin ZHANG, Jiaxi YE(

), Ruilin LI, Chao FENG, Yunfei SU, Chaojing TANG

College of Electronic Science, National University of Defense Technology, Changsha 410072, China

Download: PDF(18697 KB) HTML
Export: BibTeX | EndNote | Reference Manager | ProCite | RefWorks

Abstract

Coverage based fuzzing is a widespread vulnerability detection technique, and it has exposed many bugs in many real-world programs. However, its attention is to eliminate the testing on the repeated paths, yet it still employs random mutation to generate inputs, which is blind to penetrate complex comparisons in the program. As a result, the testing coverage is limited. Despite some solution proposals are presented, this problem is still partially solved. This paper argues that random mutation is mainly limited by two challenges, the sizable search spaceand the lack of a useful feedback to direct the search. Then we present an augmented fuzzing technique by addressing these two challenges. First of all, we point out a black relationship between input contents and comparison operands, which is dubbed connection. Second, we present a novel method to collect the comparison operands during execution, which is leveraged to infer the connections. Based on the connections, the fuzzer can learn about which input byte affects on which comparison instruction to establish a smaller search space. Third, the connection provides a useful feedback to direct the search. We resort to a modern meta-heuristic algorithm to satisfy this searching requirement. We developed a prototype $Pusher$ and evaluated its performance on several benchmarks and four real-world programs. The experimental results demonstrate that $Pusher$ works better than some other state-of-the-art fuzzers on bug detection, and can achieve a higher testing coverage. Moreover, we take a detailed statistic about the execution overhead in Pusher, and the results indicate that the execution overhead introduced by our approach is within an acceptable scope.

Keywords software safety software testing information security vulnerability searching

Corresponding Author(s): Jiaxi YE

Just Accepted Date: 20 January 2021 Issue Date: 01 December 2021

Cite this article:

Bin ZHANG,Jiaxi YE,Ruilin LI, et al. Pusher: an augmented fuzzer based on the connection between input and comparison operand[J]. Front. Comput. Sci., 2022, 16(4): 164206.

URL:

https://academic.hep.com.cn/fcs/EN/10.1007/s11704-021-0075-8
https://academic.hep.com.cn/fcs/EN/Y2022/V16/I4/164206

Fig.1 An example containing two representative path constraints at lines 8 and 19

Fig.2 The workflow of our approach

Form	Example	Reconstructed	Target value
$cmp$ imm, imm	cmp 5, 7	N/A	N/A
$cmp$ reg/mem, imm	cmp eax, 7	N/A	7
$cmp$ reg/mem, reg/mem	cmp eax, ebx	cmp eax-ebx, 0	0

Tab.1 The reconstruction of the

cmp

instructions

Fig.3

Fig.4 An example to show the process of the connection inference

Fig.5

Fig.6

Tab.2 The evaluation experiments in other fuzzers

Fig.7 Each bug owns a unique ID

Program	Listed bugs	FUZZER	SES	AFL	AFL-lafintel	Angora	$Pusher$
base64	44	7	9	9	28	44+4	44+4
md5sum	57	2	0	0	0	57	57+4
uniq	28	7	0	0	24	28+1	28+1
who	2136	0	18	1	2	1443+98	1847+205

Tab.3 The number of detected bugs on LAVA-M database

Program	Size	AFL		AFLFast	AFL-lafintel		FairFuzz		$Pusher$
CGC_File_System	88 KB	7.2%	0	7.2%	59.5%	0	7.2%	0	75.1%	10
CGC_Hangman_Game	5.2 MB	12%	0	12%	12%	0	12%	0	94%	1
CGC_Image_Parser	145 KB	4.6%	0	4.6%	4.6%	0	4.6%	0	60.6%	12
CGC_Video_Format_Parser_and_Viewer	82 KB	12%	0	12%	53.1%	0	12%	0	53.1%	0
CNMP	56 KB	27.5%	0	27.5%	42.5%	26	47.5%	3	80%	30
FASTLANE	52 KB	25.9%	0	17.5%	17.5%	0	27.7%	0	51.2%	19
Gridder	164 KB	84.2%	10	70.3%	82.9%	23	84.2%	28	84.2%	56
Griswold	95 KB	3.8%	0	3.8%	45%	13	3.8%	0	55.3%	27
Barcoder	172 KB	59%	0	59%	57.2%	0	59%	0	68%	25

Tab.4 The experimental results on CGC binaries

Fig.8 The validate_bmp_headers function implementation in the Barcoder program

Program	AFL	AFL-lafintel	FairFuzz	Angora	$Pusher$
jhead	0	0	0	16	114
nm	77	69	27	3	83
objdump	5	22	29	34	51
tcpdump	0	0	0	0	3
total	82	91	56	53	251

Tab.5 The number of detected bugs on real-world programs

Program	AFL			AFLFast			AFL-lafintel			Fairfuzz			Angora			$Pusher$
Program	BC	LC	FC	BC	LC	FC	BC	LC	FC	BC	LC	FC	BC	LC	FC	BC	LC	FC
jhead	213	313	18	213	313	18	213	313	18	213	313	18	479	687	26	568	714	26
jhead				0%	0%	0%	0%	0%	0%	0%	0%	0%	+125%	+119%	+44%	+167%	+128%	+44%
nm	4397	7097	391	4342	6993	390	4046	6856	405	4468	7230	392	4002	6679	404	4710	7851	433
nm				?1%	?1%	0%	?8%	?3%	+4%	+2%	+2%	0%	?9%	?6%	+3%	+7%	+11%	+11%
objdump	2602	4375	273	2378	3949	258	2694	4990	309	2595	4426	278	3337	6329	358	3737	7034	372
objdump				?9%	?10%	?5%	+4%	+14%	+13%	0%	+1%	+2%	+28%	+45%	+31%	+44%	+61%	+36%
tcpdump	15474	23452	706	13501	20645	682	12791	20475	698	16264	24530	753	2482	4834	269	18754	27861	846
tcpdump				?13%	?12%	?3%	?17%	?13%	?1%	+5%	+5%	+7%	?84%	?81%	?62%	+21%	+19%	+20

Tab.6 The final testing coverage after 72 hours of testing

Fig.9 The increasing process of branch coverage number in the 72 hours. (a) jhead; (b) nm; (c) objdump; (d) tcpdump

Fig.10 The code snippet in the jhead program

	Overhead	AFL	$Pusher$
Instrumentation	coverage collection	√	√
Instrumentation	comparison operands collection		√
Analysis	testing schedule	√	√
	connection inference		√
	GA based searching		√

Tab.7 The overhead in AFL and

Pusher

Fig.11 The final execution numbers. (a) jhead; (b) nm; (c) objdump; (d) tcpdump

Fig.12 The execution distribution in

Pusher

between coverage-based testing and search-based testing

Fig.13 The code coverage of tcpdump for vanilla AFL (left part) and

Pusher

(right part)

Tab.8 The execution distribution in

Pusher

between connection inference and GA searching

1	B P Miller , L Fredriksen , B So . An empirical study of the reliability of UNIX utilities. Communications of the ACM, 1990, 33( 12): 32– 44
2	H Liang , X Pei , X Jia , W Shen , J Zhang . Fuzzing: state of the art. IEEE Transactions on Reliability, 2018, 67( 3): 1199– 1218
3	K Serebryany. Continuous fuzzing with libfuzzer and addresssanitizer. In: Proceedings of IEEE Cybersecurity Development. 2016, 157– 157
4	S Gan, C Zhang, X Qin, X Tu, K Li, Z Pei, Z Chen. CollAFL: path sensitive fuzzing. In: Proceedings of IEEE Symposium on Security and Privacy. 2018, 679−696
5	L Demoura, N Bjørner. Z3: An efficient SMT solver. In: Proceedings of Tools and Algorithms for the Construction and Analysis of Systems. 2008, 337– 340
6	N Stephens, J Grosen, C Salls, A Dutcher, R Wang, J Corbetta, Y Shoshitaishvili, C Kruegel, G Vigna. Driller: augmenting fuzzing through selective symbolic execution. In: Proceedings of Network and Distributed System Security Symposium. 2016
7	L Zhao, Y Duan, H Yin, J Xuan. Send hardest problems my way: probabilistic path prioritization for hybrid fuzzing. In: Proceedings 2019 Network and Distributed System Security Symposium. 2019
8	B S Pak. Hybrid fuzz testing: discovering software bugs via fuzzing and symbolic execution. PhD thesis, Carnegie Mellon University Pittsburgh, PA, 2012
9	R Baldoni , E Coppa , D C Doelia , C Demetrescu , I Finocchi . A survey of symbolic execution techniques. Journal of ACM Computer Survey, 2018, 51( 3): 1– 39
10	H Peng, Y Shoshitaishvili, M Payer. T-Fuzz: fuzzing by program transformation. In: Proceedings of IEEE Symposium on Security and Privacy. 2018, 697– 710
11	Newsome J, Song D X. Dynamic taint analysis for automatic detection, analysis, and signature generation of exploits on commodity software. In: Proceedings of the Network and Distributed System Security Symposium. 2005
12	S Rawat, V Jain, A Kumar, L Cojocar, C Giuffrida, H Bos. VUzzer: application-aware evolutionary fuzzing. In: Proceedings of Network and Distributed System Security Symposium. 2017
13	Dowser: A guided fuzzer to find buffer overflow vulnerabilities. In: Proceedings of the USENIX Security Symposium
14	P Chen, H Chen. Angora: efficient fuzzing by principled search. In: Proceedings of IEEE Symposium on Security and Privacy. 2018, 711– 725
15	Y Li, B Chen, M Chandramohan, S W Lin, Y Liu, A Tiu. Steelix: program-state based binary fuzzing. In: Proceedings of the Joint Meeting on Foundations of Software Engineering. 2017, 627– 637
16	J Ye , B Zhang , R Li , C Feng , C Tang . Program state sensitive parallel fuzzing for real world software. IEEE Access, 2019, 7 : 42557– 42564
17	M Böhme, V T Pham, A Roychoudhury. Coveragebased greybox fuzzing As Markov chain. In: Proceedings of ACM SIGSAC Conference on Computer and Communications Security. 2016, 1032−1043
18	C Lemieux, K Sen. FairFuzz: a targeted mutation strategy for increasing greybox fuzz testing coverage. In: Proceedings of ACM/IEEE International Conference on Automated Software Engineering. 2018, 475– 485
19	M Dave, R Agrawal. Search based techniques and mutation analysis in automatic test case generation: a survey. In: Proceedings of IEEE International Advance Computing Conference. 2015, 795– 799
20	M Harman, Y Jia, Y Zhang. Achievements, open problems and challenges for search based software testing. In: Proceedings of IEEE International Conference on Software Testing, Verification and Validation. 2015, 1– 12
21	G Fraser, A Arcuri. EvoSuite: automatic test suite generation for object-oriented software. In: Proceedings of ACM SIGSOFT Symposium and the 13th European Conference on Foundations of Software Engineering. 2011, 416– 419
22	J E Rowe. Genetic algorithm theory. In: Proceedings of Conference Companion on Genetic and Evolutionary Computation. 2007, 3585
23	Dolan-Gavitt B, Hulin P, Kirda E, Leek T, Mambretti A, Robertson W, Ulrich F, Whelan R. LAVA: large-scale automated vulnerability addition. In: Proceedings of IEEE Symposium on Security and Privacy. 2016, 110–121
24	C Lattner, V Adve. LLVM: a compilation framework for lifelong program analysis & transformation. In: Proceedings of IEEE International Symposium on Code Generation and Optimization. 2004, 75– 86
25	J Liang, Y Jiang, Y Chen, M Wang, C Zhou, J Sun. PAFL: extend fuzzing optimizations of single mode to industrial parallel mode. In: Proceedings of ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 2018, 809– 814
26	K Serebryany, D Bruening, A Potapenko, D Vyukov. AddressSanitizer: a fast address sanity checker. In: Proceedings of USENIX Annual Technical Conference. 2012, 309– 318
27	M Security. fuzzdata: fuzzing resources for feeding various fuzzers with input. Mozilla Security, December 2017
28	C Aschermann, S Schumilo, T Blazytko, R Gawlik, T Holz. REDQUEEN: fuzzing with input-to-state correspondence. In: Proceedings of Annual Network and Distributed System Security Symposium. 2019
29	K Böttinger, C Eckert. Deepfuzz: triggering vulnerabilities deeply hidden in binaries. In: Proceedings of International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment. 2016, 25– 34
30	M Wang, J Liang, Y Chen, Y Jiang, X Jiao, H Liu, X Zhao, J Sun. Safl: increasing and accelerating testing coverage with symbolic execution and guided fuzzing. In: Proceedings of International Conference on Software Engineering: Companion Proceeedings. 2018
31	M Cho, S Kim, T Kwon. Intriguer: field-level constraint solving for hybrid fuzzing. In: Proceedings of ACM SIGSAC Conference on Computer and Communications Security. 2019, 515– 530
32	W Gong, G Zhang, X Zhou. Learn to accelerate identifying new test cases in fuzzing. In: Proceeding of Security, Privacy, and Anonymity in Computation, Communication, and Storage. 2017, 298– 307
33	Y Wang , Z Wu , Q Wei , Q Wang . Neufuzz: efficient fuzzing with deep neural network. IEEE Access, 2019, 7 : 36340– 36352
34	D She, K Pei, D Epstein, J Yang, B Ray, S Jana. NEUZZ: efficient fuzzing with neural program smoothing. In: Proceedings of IEEE Symposium on Security and Privacy. 2019, 803– 817
35	T Wang, T Wei, G Gu, W Zou. Taintscope: a checksumaware directed fuzzing tool for automatic software vulnerability detection. In: Proceedings of IEEE Symposium on Security and Privacy. 2010, 497– 512
36	X Liu , Q Wei , Q Wang , Z Zhao , Z Yin . Cafa: a checksum-aware fuzzing assistant tool for coverage improvement. Journal of Security and Communication Networks, 2018, 2018 : 1– 13

[1]

Download

[1]	Zeli WANG, Hai JIN, Weiqi DAI, Kim-Kwang Raymond CHOO, Deqing ZOU. Ethereum smart contract security research: survey and future research opportunities[J]. Front. Comput. Sci., 2021, 15(2): 152802-.
[2]	Xiaofang QI,Jun HE,Peng WANG,Huayang ZHOU. Variable strength combinatorial testing of concurrent programs[J]. Front. Comput. Sci., 2016, 10(4): 631-643.
[3]	Priyanka CHAWLA,Inderveer CHANA,Ajay RANA. A novel strategy for automatic test data generation using soft computing technique[J]. Front. Comput. Sci., 2015, 9(3): 346-363.
[4]	Yan ZHANG,Dunwei GONG. Generating test data for both paths coverage and faults detection using genetic algorithms: multi-path case[J]. Front. Comput. Sci., 2014, 8(5): 726-740.
[5]	Chang-ai SUN,Zuoyi WANG,Guan WANG. A property-based testing framework for encryption programs[J]. Front. Comput. Sci., 2014, 8(3): 478-489.
[6]	Dunwei GONG, Yan ZHANG. Generating test data for both path coverage and fault detection using genetic algorithms[J]. Front Comput Sci, 2013, 7(6): 822-837.
[7]	Jialu HU , Lin GAO , Guimin QIN , . Evaluation of subgraph searching algorithms for detecting network motifs in biological networks[J]. Front. Comput. Sci., 2009, 3(3): 412-416.

Viewed

Full text

Abstract

Cited

Shared

Discussed