Please wait a minute...
Frontiers of Computer Science

ISSN 2095-2228

ISSN 2095-2236(Online)

CN 10-1014/TP

Postal Subscription Code 80-970

2018 Impact Factor: 1.129

Front. Comput. Sci.    2022, Vol. 16 Issue (4) : 164206    https://doi.org/10.1007/s11704-021-0075-8
RESEARCH ARTICLE
Pusher: an augmented fuzzer based on the connection between input and comparison operand
Bin ZHANG, Jiaxi YE(), Ruilin LI, Chao FENG, Yunfei SU, Chaojing TANG
College of Electronic Science, National University of Defense Technology, Changsha 410072, China
 Download: PDF(18697 KB)   HTML
 Export: BibTeX | EndNote | Reference Manager | ProCite | RefWorks
Abstract

Coverage based fuzzing is a widespread vulnerability detection technique, and it has exposed many bugs in many real-world programs. However, its attention is to eliminate the testing on the repeated paths, yet it still employs random mutation to generate inputs, which is blind to penetrate complex comparisons in the program. As a result, the testing coverage is limited. Despite some solution proposals are presented, this problem is still partially solved. This paper argues that random mutation is mainly limited by two challenges, the sizable search spaceand the lack of a useful feedback to direct the search. Then we present an augmented fuzzing technique by addressing these two challenges. First of all, we point out a black relationship between input contents and comparison operands, which is dubbed connection. Second, we present a novel method to collect the comparison operands during execution, which is leveraged to infer the connections. Based on the connections, the fuzzer can learn about which input byte affects on which comparison instruction to establish a smaller search space. Third, the connection provides a useful feedback to direct the search. We resort to a modern meta-heuristic algorithm to satisfy this searching requirement. We developed a prototype Pusher and evaluated its performance on several benchmarks and four real-world programs. The experimental results demonstrate that Pusher works better than some other state-of-the-art fuzzers on bug detection, and can achieve a higher testing coverage. Moreover, we take a detailed statistic about the execution overhead in Pusher, and the results indicate that the execution overhead introduced by our approach is within an acceptable scope.

Keywords software safety      software testing      information security      vulnerability      searching     
Corresponding Author(s): Jiaxi YE   
Just Accepted Date: 20 January 2021   Issue Date: 01 December 2021
 Cite this article:   
Bin ZHANG,Jiaxi YE,Ruilin LI, et al. Pusher: an augmented fuzzer based on the connection between input and comparison operand[J]. Front. Comput. Sci., 2022, 16(4): 164206.
 URL:  
https://academic.hep.com.cn/fcs/EN/10.1007/s11704-021-0075-8
https://academic.hep.com.cn/fcs/EN/Y2022/V16/I4/164206
Fig.1  An example containing two representative path constraints at lines 8 and 19
Fig.2  The workflow of our approach
Form Example Reconstructed Target value
cmp imm, imm cmp 5, 7 N/A N/A
cmp reg/mem, imm cmp eax, 7 N/A 7
cmp reg/mem, reg/mem cmp eax, ebx cmp eax-ebx, 0 0
Tab.1  The reconstruction of the cmp instructions
Fig.3  
Fig.4  An example to show the process of the connection inference
Fig.5  
Fig.6  
Fuzzer Time Benchmark Compared Fuzzer
Steelix 2017 RWP AFL-dyninst
CGC AFL-dyninst
LAVA-M FUZZER, SES, VUzzer, AFL-lafintel
VUzzer 2017 RWP AFL
CGC AFLPIN
LAVA-M FUZZER, SES
Angora 2018 RWP AFL
CGC ?
LAVA-M FUZZER, SES, AFL, VUzzer, Steelix
T-fuzz 2018 RWP AFL
CGC AFL, Driller
LAVA-M FUZZER, SES, VUzzer, Steelix
Profuzzer 2019 RWP AFL, AFLFast, Driller, Vuzzer, QSYM
CGC ?
LAVA-M AFL, AFLFast, Driller, VUzzer, Angora, QSYM
Tab.2  The evaluation experiments in other fuzzers
Fig.7  Each bug owns a unique ID
Program Listed bugs FUZZER SES AFL AFL-lafintel Angora Pusher
base64 44 7 9 9 28 44+4 44+4
md5sum 57 2 0 0 0 57 57+4
uniq 28 7 0 0 24 28+1 28+1
who 2136 0 18 1 2 1443+98 1847+205
Tab.3  The number of detected bugs on LAVA-M database
Program Size AFL AFLFast AFL-lafintel FairFuzz Pusher
CGC_File_System 88 KB 7.2% 0 7.2% 0 59.5% 0 7.2% 0 75.1% 10
CGC_Hangman_Game 5.2 MB 12% 0 12% 0 12% 0 12% 0 94% 1
CGC_Image_Parser 145 KB 4.6% 0 4.6% 0 4.6% 0 4.6% 0 60.6% 12
CGC_Video_Format_Parser_and_Viewer 82 KB 12% 0 12% 0 53.1% 0 12% 0 53.1% 0
CNMP 56 KB 27.5% 0 27.5% 0 42.5% 26 47.5% 3 80% 30
FASTLANE 52 KB 25.9% 0 17.5% 0 17.5% 0 27.7% 0 51.2% 19
Gridder 164 KB 84.2% 10 70.3% 0 82.9% 23 84.2% 28 84.2% 56
Griswold 95 KB 3.8% 0 3.8% 0 45% 13 3.8% 0 55.3% 27
Barcoder 172 KB 59% 0 59% 0 57.2% 0 59% 0 68% 25
Tab.4  The experimental results on CGC binaries
Fig.8  The validate_bmp_headers function implementation in the Barcoder program
Program AFL AFL-lafintel FairFuzz Angora Pusher
jhead 0 0 0 16 114
nm 77 69 27 3 83
objdump 5 22 29 34 51
tcpdump 0 0 0 0 3
total 82 91 56 53 251
Tab.5  The number of detected bugs on real-world programs
Program AFL AFLFast AFL-lafintel Fairfuzz Angora Pusher
BC LC FC BC LC FC BC LC FC BC LC FC BC LC FC BC LC FC
jhead 213 313 18 213 313 18 213 313 18 213 313 18 479 687 26 568 714 26
0% 0% 0% 0% 0% 0% 0% 0% 0% +125% +119% +44% +167% +128% +44%
nm 4397 7097 391 4342 6993 390 4046 6856 405 4468 7230 392 4002 6679 404 4710 7851 433
?1% ?1% 0% ?8% ?3% +4% +2% +2% 0% ?9% ?6% +3% +7% +11% +11%
objdump 2602 4375 273 2378 3949 258 2694 4990 309 2595 4426 278 3337 6329 358 3737 7034 372
?9% ?10% ?5% +4% +14% +13% 0% +1% +2% +28% +45% +31% +44% +61% +36%
tcpdump 15474 23452 706 13501 20645 682 12791 20475 698 16264 24530 753 2482 4834 269 18754 27861 846
?13% ?12% ?3% ?17% ?13% ?1% +5% +5% +7% ?84% ?81% ?62% +21% +19% +20
Tab.6  The final testing coverage after 72 hours of testing
Fig.9  The increasing process of branch coverage number in the 72 hours. (a) jhead; (b) nm; (c) objdump; (d) tcpdump
Fig.10  The code snippet in the jhead program
Overhead AFL Pusher
Instrumentation coverage collection
comparison operands collection
Analysis testing schedule
connection inference
GA based searching
Tab.7  The overhead in AFL and Pusher
Fig.11  The final execution numbers. (a) jhead; (b) nm; (c) objdump; (d) tcpdump
Fig.12  The execution distribution in Pusher between coverage-based testing and search-based testing
Fig.13  The code coverage of tcpdump for vanilla AFL (left part) and Pusher (right part)
Program connection inference/# GA based searching/#
jhead 4748k 41k
nm 8352k 1720k
objdump 8842k 969k
tcpdump 81659k 1016k
Tab.8  The execution distribution in Pusher between connection inference and GA searching
1 B P Miller , L Fredriksen , B So . An empirical study of the reliability of UNIX utilities. Communications of the ACM, 1990, 33( 12): 32– 44
2 H Liang , X Pei , X Jia , W Shen , J Zhang . Fuzzing: state of the art. IEEE Transactions on Reliability, 2018, 67( 3): 1199– 1218
3 K Serebryany. Continuous fuzzing with libfuzzer and addresssanitizer. In: Proceedings of IEEE Cybersecurity Development. 2016, 157– 157
4 S Gan, C Zhang, X Qin, X Tu, K Li, Z Pei, Z Chen. CollAFL: path sensitive fuzzing. In: Proceedings of IEEE Symposium on Security and Privacy. 2018, 679−696
5 L Demoura, N Bjørner. Z3: An efficient SMT solver. In: Proceedings of Tools and Algorithms for the Construction and Analysis of Systems. 2008, 337– 340
6 N Stephens, J Grosen, C Salls, A Dutcher, R Wang, J Corbetta, Y Shoshitaishvili, C Kruegel, G Vigna. Driller: augmenting fuzzing through selective symbolic execution. In: Proceedings of Network and Distributed System Security Symposium. 2016
7 L Zhao, Y Duan, H Yin, J Xuan. Send hardest problems my way: probabilistic path prioritization for hybrid fuzzing. In: Proceedings 2019 Network and Distributed System Security Symposium. 2019
8 B S Pak. Hybrid fuzz testing: discovering software bugs via fuzzing and symbolic execution. PhD thesis, Carnegie Mellon University Pittsburgh, PA, 2012
9 R Baldoni , E Coppa , D C Doelia , C Demetrescu , I Finocchi . A survey of symbolic execution techniques. Journal of ACM Computer Survey, 2018, 51( 3): 1– 39
10 H Peng, Y Shoshitaishvili, M Payer. T-Fuzz: fuzzing by program transformation. In: Proceedings of IEEE Symposium on Security and Privacy. 2018, 697– 710
11 Newsome J, Song D X. Dynamic taint analysis for automatic detection, analysis, and signature generation of exploits on commodity software. In: Proceedings of the Network and Distributed System Security Symposium. 2005
12 S Rawat, V Jain, A Kumar, L Cojocar, C Giuffrida, H Bos. VUzzer: application-aware evolutionary fuzzing. In: Proceedings of Network and Distributed System Security Symposium. 2017
13 Dowser: A guided fuzzer to find buffer overflow vulnerabilities. In: Proceedings of the USENIX Security Symposium
14 P Chen, H Chen. Angora: efficient fuzzing by principled search. In: Proceedings of IEEE Symposium on Security and Privacy. 2018, 711– 725
15 Y Li, B Chen, M Chandramohan, S W Lin, Y Liu, A Tiu. Steelix: program-state based binary fuzzing. In: Proceedings of the Joint Meeting on Foundations of Software Engineering. 2017, 627– 637
16 J Ye , B Zhang , R Li , C Feng , C Tang . Program state sensitive parallel fuzzing for real world software. IEEE Access, 2019, 7 : 42557– 42564
17 M Böhme, V T Pham, A Roychoudhury. Coveragebased greybox fuzzing As Markov chain. In: Proceedings of ACM SIGSAC Conference on Computer and Communications Security. 2016, 1032−1043
18 C Lemieux, K Sen. FairFuzz: a targeted mutation strategy for increasing greybox fuzz testing coverage. In: Proceedings of ACM/IEEE International Conference on Automated Software Engineering. 2018, 475– 485
19 M Dave, R Agrawal. Search based techniques and mutation analysis in automatic test case generation: a survey. In: Proceedings of IEEE International Advance Computing Conference. 2015, 795– 799
20 M Harman, Y Jia, Y Zhang. Achievements, open problems and challenges for search based software testing. In: Proceedings of IEEE International Conference on Software Testing, Verification and Validation. 2015, 1– 12
21 G Fraser, A Arcuri. EvoSuite: automatic test suite generation for object-oriented software. In: Proceedings of ACM SIGSOFT Symposium and the 13th European Conference on Foundations of Software Engineering. 2011, 416– 419
22 J E Rowe. Genetic algorithm theory. In: Proceedings of Conference Companion on Genetic and Evolutionary Computation. 2007, 3585
23 Dolan-Gavitt B, Hulin P, Kirda E, Leek T, Mambretti A, Robertson W, Ulrich F, Whelan R. LAVA: large-scale automated vulnerability addition. In: Proceedings of IEEE Symposium on Security and Privacy. 2016, 110–121
24 C Lattner, V Adve. LLVM: a compilation framework for lifelong program analysis & transformation. In: Proceedings of IEEE International Symposium on Code Generation and Optimization. 2004, 75– 86
25 J Liang, Y Jiang, Y Chen, M Wang, C Zhou, J Sun. PAFL: extend fuzzing optimizations of single mode to industrial parallel mode. In: Proceedings of ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 2018, 809– 814
26 K Serebryany, D Bruening, A Potapenko, D Vyukov. AddressSanitizer: a fast address sanity checker. In: Proceedings of USENIX Annual Technical Conference. 2012, 309– 318
27 M Security. fuzzdata: fuzzing resources for feeding various fuzzers with input. Mozilla Security, December 2017
28 C Aschermann, S Schumilo, T Blazytko, R Gawlik, T Holz. REDQUEEN: fuzzing with input-to-state correspondence. In: Proceedings of Annual Network and Distributed System Security Symposium. 2019
29 K Böttinger, C Eckert. Deepfuzz: triggering vulnerabilities deeply hidden in binaries. In: Proceedings of International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment. 2016, 25– 34
30 M Wang, J Liang, Y Chen, Y Jiang, X Jiao, H Liu, X Zhao, J Sun. Safl: increasing and accelerating testing coverage with symbolic execution and guided fuzzing. In: Proceedings of International Conference on Software Engineering: Companion Proceeedings. 2018
31 M Cho, S Kim, T Kwon. Intriguer: field-level constraint solving for hybrid fuzzing. In: Proceedings of ACM SIGSAC Conference on Computer and Communications Security. 2019, 515– 530
32 W Gong, G Zhang, X Zhou. Learn to accelerate identifying new test cases in fuzzing. In: Proceeding of Security, Privacy, and Anonymity in Computation, Communication, and Storage. 2017, 298– 307
33 Y Wang , Z Wu , Q Wei , Q Wang . Neufuzz: efficient fuzzing with deep neural network. IEEE Access, 2019, 7 : 36340– 36352
34 D She, K Pei, D Epstein, J Yang, B Ray, S Jana. NEUZZ: efficient fuzzing with neural program smoothing. In: Proceedings of IEEE Symposium on Security and Privacy. 2019, 803– 817
35 T Wang, T Wei, G Gu, W Zou. Taintscope: a checksumaware directed fuzzing tool for automatic software vulnerability detection. In: Proceedings of IEEE Symposium on Security and Privacy. 2010, 497– 512
36 X Liu , Q Wei , Q Wang , Z Zhao , Z Yin . Cafa: a checksum-aware fuzzing assistant tool for coverage improvement. Journal of Security and Communication Networks, 2018, 2018 : 1– 13
[1] Zeli WANG, Hai JIN, Weiqi DAI, Kim-Kwang Raymond CHOO, Deqing ZOU. Ethereum smart contract security research: survey and future research opportunities[J]. Front. Comput. Sci., 2021, 15(2): 152802-.
[2] Xiaofang QI,Jun HE,Peng WANG,Huayang ZHOU. Variable strength combinatorial testing of concurrent programs[J]. Front. Comput. Sci., 2016, 10(4): 631-643.
[3] Priyanka CHAWLA,Inderveer CHANA,Ajay RANA. A novel strategy for automatic test data generation using soft computing technique[J]. Front. Comput. Sci., 2015, 9(3): 346-363.
[4] Yan ZHANG,Dunwei GONG. Generating test data for both paths coverage and faults detection using genetic algorithms: multi-path case[J]. Front. Comput. Sci., 2014, 8(5): 726-740.
[5] Chang-ai SUN,Zuoyi WANG,Guan WANG. A property-based testing framework for encryption programs[J]. Front. Comput. Sci., 2014, 8(3): 478-489.
[6] Dunwei GONG, Yan ZHANG. Generating test data for both path coverage and fault detection using genetic algorithms[J]. Front Comput Sci, 2013, 7(6): 822-837.
[7] Jialu HU , Lin GAO , Guimin QIN , . Evaluation of subgraph searching algorithms for detecting network motifs in biological networks[J]. Front. Comput. Sci., 2009, 3(3): 412-416.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed