Compiler testing: a systematic literature analysis
Yixuan TANG1, Zhilei REN1, Weiqiang KONG1, He JIANG1,2,3
1. School of Software, Dalian University of Technology, Dalian 116024, China
2. Key Laboratory for Ubiquitous Network and Service Software of Liaoning Province, Dalian 116000, China
3. School of Computer Science & Technology, Beijing Institute of Technology, Beijing 100081, China
Compilers are widely used infrastructure for accelerating software development and are expected to be trustworthy. In the literature, various testing techniques have been proposed to assure the quality of compilers. However, it remains difficult to characterize and understand compiler testing comprehensively. To overcome this obstacle, we propose a literature analysis framework to gain insights into the compiler testing area. First, we perform an extensive search to construct a dataset of compiler testing papers. Then, we conduct a bibliometric analysis of this dataset to identify the productive authors, the influential papers, and the frequently tested compilers. Finally, we use association rules and collaboration networks to mine the authorships and the communities of interest among researchers and keywords. We report several findings. The USA is the leading country, home to the most influential researchers and institutions. The most active keyword is “random testing”. We also find that most researchers in the compiler testing area have broad interests but work within small-scale collaborations.