Frontiers of Computer Science

ISSN 2095-2228

ISSN 2095-2236(Online)

CN 10-1014/TP

Postal Subscription Code 80-970

2018 Impact Factor: 1.129

Front. Comput. Sci. 2024, Vol. 18, Issue 2: 182202. https://doi.org/10.1007/s11704-023-2521-2
Software
ContextAug: model-domain failing test augmentation with contextual information
Zhuo ZHANG1, Jianxin XUE2, Deheng YANG3, Xiaoguang MAO3
1. School of Information Technology and Engineering, Guangzhou College of Commerce, Guangzhou 511363, China
2. School of Computer and Information Engineering, Institute for Artificial Intelligence, Shanghai Polytechnic University, Shanghai 201209, China
3. College of Computer, National University of Defense Technology, Changsha 410073, China
Abstract

In software development, the ability to localize faults is crucial to efficient debugging. Detecting and repairing errant behavior early in the development cycle considerably reduces cost and development time. Researchers have proposed various methods for locating faulty code. However, failing test cases usually account for only a small portion of a test suite, which inevitably leads to class imbalance and hampers the effectiveness of fault localization.

In this work, we therefore propose a new fault localization approach named ContextAug. After collecting dynamic execution information by running the test cases, ContextAug traces these executions to build an information model. It then constructs a failure context with propagation dependencies and intersects it with new model-domain failing test samples synthesized by the minimum variability of the minority feature space. In contrast to traditional test generation, which works directly from the input domain, ContextAug takes a new perspective and synthesizes failing test samples from the model domain, which makes test suites much easier to augment. In an empirical study on large real-world programs with 13 state-of-the-art fault localization approaches, ContextAug significantly improved fault localization effectiveness, by up to 54.53%.
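The abstract does not spell out the synthesis procedure, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes each test is a 0/1 coverage vector, synthesizes a new failing vector by SMOTE-style interpolation between a real failing vector and its nearest failing neighbour (a stand-in for "minimum variability"), and then keeps only the entries that belong to a given failure context. The names `hamming`, `synthesize_failing_samples`, and `context` are hypothetical.

```python
import random


def hamming(u, v):
    """Number of positions where two coverage vectors differ."""
    return sum(a != b for a, b in zip(u, v))


def synthesize_failing_samples(failing, context, n_new, seed=0):
    """Create n_new synthetic failing vectors (illustrative assumption only).

    failing : list of 0/1 coverage vectors of the real failing tests
    context : set of statement indices assumed to form the failure context
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(failing)
        # Nearest other failing vector, i.e. the smallest variation from base.
        others = [v for v in failing if v is not base]
        neighbor = min(others, key=lambda v: hamming(base, v)) if others else base
        gap = rng.random()
        # SMOTE-style interpolation between base and its neighbour.
        mixed = [b + gap * (n - b) for b, n in zip(base, neighbor)]
        # Intersect with the failure context: zero out everything outside it.
        synthetic.append([x if i in context else 0.0 for i, x in enumerate(mixed)])
    return synthetic


failing_vectors = [[1, 1, 0, 1], [1, 0, 0, 1], [1, 1, 1, 0]]
new_samples = synthesize_failing_samples(failing_vectors, context={0, 1}, n_new=5)
```

Because the samples are produced in the model domain (coverage vectors) rather than as concrete program inputs, no new test has to be executed; that is the property the abstract highlights.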

Keywords: context; fault localization; test cases
Corresponding Author(s): Jianxin XUE
Just Accepted Date: 30 December 2022; Issue Date: 23 March 2023
Cite this article:
Zhuo ZHANG, Jianxin XUE, Deheng YANG, et al. ContextAug: model-domain failing test augmentation with contextual information[J]. Front. Comput. Sci., 2024, 18(2): 182202.
URL:
https://academic.hep.com.cn/fcs/EN/10.1007/s11704-023-2521-2
https://academic.hep.com.cn/fcs/EN/Y2024/V18/I2/182202
Fig.1  The general process of typical program fault localization techniques
Fig.2  The definition of the information model
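| Name | Formula | Name | Formula |
|---|---|---|---|
| ER1' Naish1 | $\begin{cases} -1 & \text{if } a_{nf} > 0 \\ a_{np} & \text{if } a_{nf} \le 0 \end{cases}$ | GP02 | $2\left(a_{ef} + \sqrt{a_{np}}\right) + \sqrt{a_{ep}}$ |
| Optimal_P | $a_{ef} - \dfrac{a_{ep}}{a_{ep} + a_{np} + 1}$ | GP03 | $\sqrt{\lvert a_{ef}^{2} - \sqrt{a_{ep}} \rvert}$ |
| GP13 | $a_{ef}\left(1 + \dfrac{1}{2a_{ep} + a_{ef}}\right)$ | GP19 | $a_{ef}\sqrt{\lvert a_{ep} - a_{ef} + a_{nf} - a_{np} \rvert}$ |
| ER5 Wong1 | $a_{ef}$ | Dstar | $\dfrac{a_{ef}^{*}}{a_{nf} + a_{ep}}$ |
| Russel_Rao | $\dfrac{a_{ef}}{a_{ef} + a_{nf} + a_{ep} + a_{np}}$ | | |
| Binary | $\begin{cases} 0 & \text{if } a_{nf} > 0 \\ 1 & \text{if } a_{nf} \le 0 \end{cases}$ | Ochiai | $\dfrac{a_{ef}}{\sqrt{(a_{ef} + a_{nf})(a_{ef} + a_{ep})}}$ |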
Tab.1  Typical Formulas of SBFL
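As a small illustration of how such formulas are evaluated, the sketch below counts the four spectrum values for each statement from a coverage matrix and applies Ochiai and Dstar. It assumes the usual SBFL meaning of the symbols (a_ef and a_ep: failing and passing tests that execute the statement; a_nf and a_np: failing and passing tests that do not). The exponent `star=2` and the handling of zero denominators are assumptions of this sketch, and the function names are hypothetical.

```python
import math


def spectrum(coverage, failed):
    """Return one (a_ef, a_ep, a_nf, a_np) tuple per statement.

    coverage : list of 0/1 rows, one row per test, one column per statement
    failed   : list of booleans, True if the corresponding test failed
    """
    counts = []
    for s in range(len(coverage[0])):
        a_ef = a_ep = a_nf = a_np = 0
        for row, is_fail in zip(coverage, failed):
            if row[s] and is_fail:
                a_ef += 1
            elif row[s]:
                a_ep += 1
            elif is_fail:
                a_nf += 1
            else:
                a_np += 1
        counts.append((a_ef, a_ep, a_nf, a_np))
    return counts


def ochiai(a_ef, a_ep, a_nf, a_np):
    denom = math.sqrt((a_ef + a_nf) * (a_ef + a_ep))
    return a_ef / denom if denom else 0.0  # assumed convention for 0/0


def dstar(a_ef, a_ep, a_nf, a_np, star=2):
    denom = a_nf + a_ep
    if denom == 0:
        # Assumed convention: maximal suspiciousness if any failing test hits it.
        return float("inf") if a_ef > 0 else 0.0
    return a_ef ** star / denom


# Two passing tests and one failing test over three statements.
coverage = [[1, 0, 1], [1, 0, 0], [1, 1, 0]]
failed = [False, False, True]
counts = spectrum(coverage, failed)
ochiai_scores = [ochiai(*c) for c in counts]
most_suspicious = max(range(len(counts)), key=lambda s: ochiai_scores[s])
```

With a single failing test among three, the example also shows the class imbalance the abstract refers to: every a_ef and a_nf value is at most 1.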
  
  
  
Fig.3  An example illustrating model-domain failing test augmentation of ContextAug
Fig.4  Virtual tests
| Program | Description | Versions | KLOC | Tests | Coverage |
|---|---|---|---|---|---|
| chart | JFreeChart | 26 | 96 | 2205 | 59% |
| math | Apache Commons Math | 106 | 85 | 3602 | 78% |
| lang | Apache commons-lang | 65 | 22 | 2245 | 26% |
| closure | Closure Compiler | 133 | 90 | 7927 | 77% |
| mockito | Framework for unit tests | 38 | 6 | 1075 | 71% |
| time | Joda-Time | 27 | 53 | 4130 | 75% |
| python | General-purpose language | 8 | 407 | 355 | 16% |
| gzip | Data compression | 5 | 491 | 12 | 39% |
| libtiff | Image processing | 12 | 77 | 78 | 59% |
| space | ADL interpreter | 38 | 6.1 | 13585 | 100% |
| nanoxml_v1 | XML parser | 7 | 5.4 | 206 | 76% |
| nanoxml_v2 | XML parser | 7 | 5.7 | 206 | 72% |
| nanoxml_v3 | XML parser | 10 | 8.4 | 206 | 78% |
| nanoxml_v5 | XML parser | 7 | 8.8 | 206 | 78% |
| spoon | Java code analysis and transformation | 31 | 76 | 1114 | 63% |
| jackson-databind | General data binding | 13 | 99 | 1711 | 71% |
| debezium | Platform for change data capture | 4 | 53 | 508 | 69% |
Tab.2  Summary of subject programs
| Comparison | Top-1/% | Top-5/% | Top-10/% | Top-20/% | MFR | MAR |
|---|---|---|---|---|---|---|
| MLP-FL | 0 | 1.9 | 2.7 | 11.3 | 232 | 361 |
| MLP-FL(ContextAug) | 1.9 | 5.4 | 11.3 | 20.6 | 193 | 336 |
| CNN-FL | 1.4 | 2.7 | 6.9 | 16.7 | 184 | 279 |
| CNN-FL(ContextAug) | 2.3 | 10.2 | 15.7 | 24.8 | 131 | 258 |
| BiLSTM-FL | 0 | 1.4 | 2.7 | 10.2 | 246 | 431 |
| BiLSTM-FL(ContextAug) | 1.4 | 3.8 | 9.5 | 16.4 | 221 | 379 |
| DeepFL | 1.4 | 3.8 | 9.1 | 20.3 | 197 | 296 |
| DeepFL(ContextAug) | 1.9 | 10.2 | 16.7 | 25.8 | 141 | 259 |
| FLUCCS | 0 | 5.8 | 8.6 | 16.4 | 212 | 335 |
| FLUCCS(ContextAug) | 1.4 | 9.8 | 15.7 | 24.3 | 157 | 291 |
| DeepRL4FL | 3.8 | 11.3 | 24.3 | 40.8 | 137 | 242 |
| DeepRL4FL(ContextAug) | 3.8 | 12.6 | 25.8 | 46.7 | 121 | 225 |
| ER5 | 0 | 5.8 | 11.6 | 19.3 | 252 | 447 |
| ER5(ContextAug) | 1.4 | 7.4 | 13.1 | 27.2 | 224 | 393 |
| GP02 | 1.4 | 14.2 | 19.8 | 35.9 | 282 | 561 |
| GP02(ContextAug) | 2.3 | 16.7 | 22.5 | 40.8 | 231 | 486 |
| GP03 | 2.7 | 10.2 | 10.7 | 16.4 | 252 | 398 |
| GP03(ContextAug) | 2.7 | 15.1 | 17.5 | 20.6 | 215 | 326 |
| Dstar | 2.7 | 17.1 | 24.8 | 38.6 | 258 | 372 |
| Dstar(ContextAug) | 3.2 | 21.7 | 31.9 | 45.7 | 231 | 325 |
| ER1' | 2.7 | 18.2 | 24.3 | 36.5 | 257 | 396 |
| ER1'(ContextAug) | 3.2 | 21.7 | 28.8 | 41.7 | 235 | 322 |
| GP19 | 2.7 | 8.6 | 14.8 | 22.5 | 268 | 413 |
| GP19(ContextAug) | 3.2 | 11.6 | 21.7 | 27.9 | 234 | 385 |
| Ochiai | 1.4 | 18.9 | 23.5 | 33.4 | 226 | 381 |
| Ochiai(ContextAug) | 2.3 | 22.5 | 28.2 | 39.6 | 193 | 262 |
Tab.3  Top-N, MAR and MFR comparison of ContextAug
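The page does not define the metrics in Tab.3, so the sketch below uses definitions that are common in the fault localization literature and should be read as assumptions: Top-N is the percentage of faulty versions whose best-ranked faulty statement appears within the first N positions, MFR is the mean rank of the first faulty statement found, and MAR is the mean of the per-version average rank over all faulty statements. The function names and the sample ranks are hypothetical.

```python
from statistics import mean


def top_n(ranks_per_version, n):
    """Percentage of versions whose best-ranked fault is within the top n."""
    hits = sum(min(ranks) <= n for ranks in ranks_per_version)
    return 100.0 * hits / len(ranks_per_version)


def mfr(ranks_per_version):
    """Mean First Rank: average rank of the first located faulty statement."""
    return mean(min(ranks) for ranks in ranks_per_version)


def mar(ranks_per_version):
    """Mean Average Rank: average over versions of the mean fault rank."""
    return mean(mean(ranks) for ranks in ranks_per_version)


# Ranks of the faulty statements in three made-up faulty versions.
example_ranks = [[1, 7], [4], [12, 20]]
example_top5 = top_n(example_ranks, 5)
example_mfr = mfr(example_ranks)
example_mar = mar(example_ranks)
```

Under these definitions, higher Top-N and lower MFR and MAR indicate better localization, which matches the direction of the ContextAug rows in Tab.3.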
Fig.5  RImp comparison of ContextAug over fault localization approaches and subjects. (a) RImp on fault localization approaches; (b) RImp on subject programs
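RImp is not defined on this page. A definition often used in related fault localization studies, taken here as an assumption, is the total number of statements examined to find the faults with the augmented approach, expressed as a percentage of the total examined without it, so that lower values mean a larger reduction. A minimal sketch with a hypothetical function name and made-up numbers:

```python
def rimp(examined_with, examined_without):
    """Relative improvement in percent (assumed definition, lower is better).

    examined_with    : statements examined per faulty version with augmentation
    examined_without : statements examined per faulty version without it
    """
    total_without = sum(examined_without)
    if total_without == 0:
        raise ValueError("baseline examined no statements")
    return 100.0 * sum(examined_with) / total_without


# Made-up example: augmentation halves the inspection effort.
example_rimp = rimp([10, 20, 30], [20, 40, 60])
```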
| Comparison on fault localization approaches | BETTER | SIMILAR | WORSE |
|---|---|---|---|
| MLP-FL(ContextAug) vs MLP-FL | 17(100%) | 0(0%) | 0(0%) |
| CNN-FL(ContextAug) vs CNN-FL | 17(100%) | 0(0%) | 0(0%) |
| BiLSTM-FL(ContextAug) vs BiLSTM-FL | 17(100%) | 0(0%) | 0(0%) |
| DeepFL(ContextAug) vs DeepFL | 14(82.4%) | 3(17.6%) | 0(0%) |
| FLUCCS(ContextAug) vs FLUCCS | 13(76.5%) | 4(23.5%) | 0(0%) |
| DeepRL4FL(ContextAug) vs DeepRL4FL | 12(70.6%) | 5(29.4%) | 0(0%) |
| Ochiai(ContextAug) vs Ochiai | 9(52.9%) | 8(47.1%) | 0(0%) |
| ER5(ContextAug) vs ER5 | 10(58.8%) | 7(41.2%) | 0(0%) |
| GP02(ContextAug) vs GP02 | 9(52.9%) | 8(47.1%) | 0(0%) |
| GP03(ContextAug) vs GP03 | 11(64.7%) | 6(35.3%) | 0(0%) |
| Dstar(ContextAug) vs Dstar | 9(52.9%) | 8(47.1%) | 0(0%) |
| ER1'(ContextAug) vs ER1' | 8(47.1%) | 9(52.9%) | 0(0%) |
| GP19(ContextAug) vs GP19 | 11(64.7%) | 6(35.3%) | 0(0%) |

| Comparison on subject programs | BETTER | SIMILAR | WORSE |
|---|---|---|---|
| gzip(ContextAug) vs gzip | 7(53.8%) | 6(46.2%) | 0(0%) |
| libtiff(ContextAug) vs libtiff | 7(53.8%) | 6(46.2%) | 0(0%) |
| lang(ContextAug) vs lang | 5(38.5%) | 8(61.5%) | 0(0%) |
| closure(ContextAug) vs closure | 6(46.2%) | 7(53.8%) | 0(0%) |
| python(ContextAug) vs python | 4(30.8%) | 9(69.2%) | 0(0%) |
| space(ContextAug) vs space | 9(69.2%) | 4(30.8%) | 0(0%) |
| chart(ContextAug) vs chart | 8(61.5%) | 5(38.5%) | 0(0%) |
| math(ContextAug) vs math | 11(84.6%) | 2(15.4%) | 0(0%) |
| mockito(ContextAug) vs mockito | 11(84.6%) | 2(15.4%) | 0(0%) |
| time(ContextAug) vs time | 8(61.5%) | 5(38.5%) | 0(0%) |
| nanoxml_v1(ContextAug) vs nanoxml_v1 | 9(69.2%) | 4(30.8%) | 0(0%) |
| nanoxml_v2(ContextAug) vs nanoxml_v2 | 10(76.9%) | 3(23.1%) | 0(0%) |
| nanoxml_v3(ContextAug) vs nanoxml_v3 | 8(61.5%) | 5(38.5%) | 0(0%) |
| nanoxml_v5(ContextAug) vs nanoxml_v5 | 10(76.9%) | 3(23.1%) | 0(0%) |
| spoon(ContextAug) vs spoon | 8(61.5%) | 5(38.5%) | 0(0%) |
| jackson-databind(ContextAug) vs jackson-databind | 7(53.8%) | 6(46.2%) | 0(0%) |
| debezium(ContextAug) vs debezium | 9(69.2%) | 4(30.8%) | 0(0%) |
Tab.4  Wilcoxon-Signed-Rank test of ContextAug
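The BETTER, SIMILAR and WORSE verdicts come from Wilcoxon signed-rank tests, but the page does not state how the tests were configured. The following is therefore only a minimal sketch under stated assumptions: the inputs are paired ranks of the faulty statement (lower is better), the p-value uses a two-sided normal approximation without tie or continuity correction, and the significance level is 0.05. The names `wilcoxon_signed_rank` and `verdict` are hypothetical.

```python
import math


def wilcoxon_signed_rank(x, y):
    """Return (w_plus, w_minus, two-sided p) via the normal approximation."""
    diffs = [a - b for a, b in zip(x, y) if a != b]  # zero differences dropped
    n = len(diffs)
    if n == 0:
        return 0.0, 0.0, 1.0
    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    expected = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (min(w_plus, w_minus) - expected) / sd
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return w_plus, w_minus, p


def verdict(with_aug, baseline, alpha=0.05):
    """Classify a paired comparison of ranks, where lower ranks are better."""
    w_plus, w_minus, p = wilcoxon_signed_rank(with_aug, baseline)
    if p >= alpha:
        return "SIMILAR"
    return "BETTER" if w_minus > w_plus else "WORSE"


# Made-up ranks for ten faulty versions.
baseline_ranks = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
augmented_ranks = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
example_verdict = verdict(augmented_ranks, baseline_ranks)
```

The normal approximation is rough for small samples; a statistics library with an exact small-sample distribution would be the safer choice in practice.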
| Comparison on fault localization approaches | BETTER | SIMILAR | WORSE |
|---|---|---|---|
| MLP-FL(ContextAug) vs MLP-FL(undersampling) | 17(100%) | 0(0%) | 0(0%) |
| CNN-FL(ContextAug) vs CNN-FL(undersampling) | 17(100%) | 0(0%) | 0(0%) |
| BiLSTM-FL(ContextAug) vs BiLSTM-FL(undersampling) | 17(100%) | 0(0%) | 0(0%) |
| DeepFL(ContextAug) vs DeepFL(undersampling) | 17(100%) | 0(0%) | 0(0%) |
| FLUCCS(ContextAug) vs FLUCCS(undersampling) | 17(100%) | 0(0%) | 0(0%) |
| DeepRL4FL(ContextAug) vs DeepRL4FL(undersampling) | 14(82.4%) | 3(17.6%) | 0(0%) |
| Ochiai(ContextAug) vs Ochiai(undersampling) | 13(76.5%) | 4(23.5%) | 0(0%) |
| ER5(ContextAug) vs ER5(undersampling) | 11(64.7%) | 6(35.3%) | 0(0%) |
| GP02(ContextAug) vs GP02(undersampling) | 14(82.4%) | 3(17.6%) | 0(0%) |
| GP03(ContextAug) vs GP03(undersampling) | 13(76.5%) | 4(23.5%) | 0(0%) |
| Dstar(ContextAug) vs Dstar(undersampling) | 10(58.8%) | 7(41.2%) | 0(0%) |
| ER1'(ContextAug) vs ER1'(undersampling) | 9(52.9%) | 8(47.1%) | 0(0%) |
| GP19(ContextAug) vs GP19(undersampling) | 12(70.6%) | 5(29.4%) | 0(0%) |

| Comparison on subject programs | BETTER | SIMILAR | WORSE |
|---|---|---|---|
| gzip(ContextAug) vs gzip(undersampling) | 10(76.9%) | 3(23.1%) | 0(0%) |
| libtiff(ContextAug) vs libtiff(undersampling) | 7(53.8%) | 6(46.2%) | 0(0%) |
| lang(ContextAug) vs lang(undersampling) | 10(76.9%) | 3(23.1%) | 0(0%) |
| closure(ContextAug) vs closure(undersampling) | 9(69.2%) | 4(30.8%) | 0(0%) |
| python(ContextAug) vs python(undersampling) | 9(69.2%) | 4(30.8%) | 0(0%) |
| space(ContextAug) vs space(undersampling) | 9(69.2%) | 4(30.8%) | 0(0%) |
| chart(ContextAug) vs chart(undersampling) | 13(100%) | 0(0%) | 0(0%) |
| math(ContextAug) vs math(undersampling) | 9(69.2%) | 4(30.8%) | 0(0%) |
| mockito(ContextAug) vs mockito(undersampling) | 12(92.3%) | 1(7.7%) | 0(0%) |
| time(ContextAug) vs time(undersampling) | 10(76.9%) | 3(23.1%) | 0(0%) |
| nanoxml_v1(ContextAug) vs nanoxml_v1(undersampling) | 9(69.2%) | 4(30.8%) | 0(0%) |
| nanoxml_v2(ContextAug) vs nanoxml_v2(undersampling) | 9(69.2%) | 3(23.1%) | 1(7.7%) |
| nanoxml_v3(ContextAug) vs nanoxml_v3(undersampling) | 8(61.5%) | 5(38.5%) | 0(0%) |
| nanoxml_v5(ContextAug) vs nanoxml_v5(undersampling) | 9(69.2%) | 4(30.8%) | 0(0%) |
| spoon(ContextAug) vs spoon(undersampling) | 9(69.2%) | 4(30.8%) | 0(0%) |
| jackson-databind(ContextAug) vs jackson-databind(undersampling) | 10(76.9%) | 3(23.1%) | 0(0%) |
| debezium(ContextAug) vs debezium(undersampling) | 10(76.9%) | 3(23.1%) | 0(0%) |
Tab.5  Wilcoxon-Signed-Rank test of ContextAug over undersampling
| Comparison on fault localization approaches | BETTER | SIMILAR | WORSE |
|---|---|---|---|
| MLP-FL(ContextAug) vs MLP-FL(smote) | 9(52.9%) | 7(41.2%) | 1(5.9%) |
| CNN-FL(ContextAug) vs CNN-FL(smote) | 6(35.3%) | 9(52.9%) | 2(11.8%) |
| BiLSTM-FL(ContextAug) vs BiLSTM-FL(smote) | 10(58.8%) | 6(35.3%) | 1(5.9%) |
| DeepFL(ContextAug) vs DeepFL(smote) | 6(35.3%) | 9(52.9%) | 2(11.8%) |
| FLUCCS(ContextAug) vs FLUCCS(smote) | 7(41.2%) | 9(52.9%) | 1(5.9%) |
| DeepRL4FL(ContextAug) vs DeepRL4FL(smote) | 7(41.2%) | 9(52.9%) | 1(5.9%) |
| Ochiai(ContextAug) vs Ochiai(smote) | 5(29.4%) | 11(64.7%) | 1(5.9%) |
| ER5(ContextAug) vs ER5(smote) | 8(47.1%) | 8(47.1%) | 1(5.9%) |
| GP02(ContextAug) vs GP02(smote) | 7(41.2%) | 10(58.8%) | 0(0%) |
| GP03(ContextAug) vs GP03(smote) | 6(35.3%) | 9(52.9%) | 2(11.8%) |
| Dstar(ContextAug) vs Dstar(smote) | 6(35.3%) | 10(58.8%) | 1(5.9%) |
| ER1'(ContextAug) vs ER1'(smote) | 6(35.3%) | 11(64.7%) | 0(0%) |
| GP19(ContextAug) vs GP19(smote) | 7(41.2%) | 8(47.1%) | 2(11.8%) |

| Comparison on subject programs | BETTER | SIMILAR | WORSE |
|---|---|---|---|
| gzip(ContextAug) vs gzip(smote) | 4(30.8%) | 8(61.5%) | 1(7.7%) |
| libtiff(ContextAug) vs libtiff(smote) | 4(30.8%) | 8(61.5%) | 1(7.7%) |
| lang(ContextAug) vs lang(smote) | 4(30.8%) | 8(61.5%) | 1(7.7%) |
| closure(ContextAug) vs closure(smote) | 5(38.5%) | 6(46.2%) | 2(15.4%) |
| python(ContextAug) vs python(smote) | 3(23.1%) | 9(69.2%) | 1(7.7%) |
| space(ContextAug) vs space(smote) | 3(23.1%) | 9(69.2%) | 1(7.7%) |
| chart(ContextAug) vs chart(smote) | 4(30.8%) | 7(53.8%) | 2(15.4%) |
| math(ContextAug) vs math(smote) | 5(38.5%) | 7(53.8%) | 1(7.7%) |
| mockito(ContextAug) vs mockito(smote) | 4(30.8%) | 9(69.2%) | 0(0%) |
| time(ContextAug) vs time(smote) | 5(38.5%) | 8(61.5%) | 0(0%) |
| nanoxml_v1(ContextAug) vs nanoxml_v1(smote) | 4(30.8%) | 8(61.5%) | 1(7.7%) |
| nanoxml_v2(ContextAug) vs nanoxml_v2(smote) | 6(46.2%) | 6(46.2%) | 1(7.7%) |
| nanoxml_v3(ContextAug) vs nanoxml_v3(smote) | 5(38.5%) | 7(53.8%) | 1(7.7%) |
| nanoxml_v5(ContextAug) vs nanoxml_v5(smote) | 5(38.5%) | 7(53.8%) | 1(7.7%) |
| spoon(ContextAug) vs spoon(smote) | 5(38.5%) | 8(61.5%) | 0(0%) |
| jackson-databind(ContextAug) vs jackson-databind(smote) | 5(38.5%) | 6(46.2%) | 2(15.4%) |
| debezium(ContextAug) vs debezium(smote) | 6(46.2%) | 7(53.8%) | 0(0%) |
Tab.6  Wilcoxon-Signed-Rank test of ContextAug over smote
| Comparison on fault localization approaches | BETTER | SIMILAR | WORSE |
|---|---|---|---|
| MLP-FL(ContextAug) vs MLP-FL(ContextAug(rmss)) | 17(100%) | 0(0%) | 0(0%) |
| CNN-FL(ContextAug) vs CNN-FL(ContextAug(rmss)) | 17(100%) | 0(0%) | 0(0%) |
| BiLSTM-FL(ContextAug) vs BiLSTM-FL(ContextAug(rmss)) | 17(100%) | 0(0%) | 0(0%) |
| DeepFL(ContextAug) vs DeepFL(ContextAug(rmss)) | 17(100%) | 0(0%) | 0(0%) |
| FLUCCS(ContextAug) vs FLUCCS(ContextAug(rmss)) | 17(100%) | 0(0%) | 0(0%) |
| DeepRL4FL(ContextAug) vs DeepRL4FL(ContextAug(rmss)) | 17(100%) | 0(0%) | 0(0%) |
| Ochiai(ContextAug) vs Ochiai(ContextAug(rmss)) | 17(100%) | 0(0%) | 0(0%) |
| ER5(ContextAug) vs ER5(ContextAug(rmss)) | 17(100%) | 0(0%) | 0(0%) |
| GP02(ContextAug) vs GP02(ContextAug(rmss)) | 17(100%) | 0(0%) | 0(0%) |
| GP03(ContextAug) vs GP03(ContextAug(rmss)) | 17(100%) | 0(0%) | 0(0%) |
| Dstar(ContextAug) vs Dstar(ContextAug(rmss)) | 17(100%) | 0(0%) | 0(0%) |
| ER1'(ContextAug) vs ER1'(ContextAug(rmss)) | 17(100%) | 0(0%) | 0(0%) |
| GP19(ContextAug) vs GP19(ContextAug(rmss)) | 17(100%) | 0(0%) | 0(0%) |

| Comparison on subject programs | BETTER | SIMILAR | WORSE |
|---|---|---|---|
| gzip(ContextAug) vs gzip(ContextAug(rmss)) | 13(100%) | 0(0%) | 0(0%) |
| libtiff(ContextAug) vs libtiff(ContextAug(rmss)) | 13(100%) | 0(0%) | 0(0%) |
| lang(ContextAug) vs lang(ContextAug(rmss)) | 13(100%) | 0(0%) | 0(0%) |
| closure(ContextAug) vs closure(ContextAug(rmss)) | 13(100%) | 0(0%) | 0(0%) |
| python(ContextAug) vs python(ContextAug(rmss)) | 13(100%) | 0(0%) | 0(0%) |
| space(ContextAug) vs space(ContextAug(rmss)) | 13(100%) | 0(0%) | 0(0%) |
| chart(ContextAug) vs chart(ContextAug(rmss)) | 13(100%) | 0(0%) | 0(0%) |
| math(ContextAug) vs math(ContextAug(rmss)) | 13(100%) | 0(0%) | 0(0%) |
| mockito(ContextAug) vs mockito(ContextAug(rmss)) | 13(100%) | 0(0%) | 0(0%) |
| time(ContextAug) vs time(ContextAug(rmss)) | 13(100%) | 0(0%) | 0(0%) |
| nanoxml_v1(ContextAug) vs nanoxml_v1(ContextAug(rmss)) | 13(100%) | 0(0%) | 0(0%) |
| nanoxml_v2(ContextAug) vs nanoxml_v2(ContextAug(rmss)) | 13(100%) | 0(0%) | 0(0%) |
| nanoxml_v3(ContextAug) vs nanoxml_v3(ContextAug(rmss)) | 13(100%) | 0(0%) | 0(0%) |
| nanoxml_v5(ContextAug) vs nanoxml_v5(ContextAug(rmss)) | 13(100%) | 0(0%) | 0(0%) |
| spoon(ContextAug) vs spoon(ContextAug(rmss)) | 13(100%) | 0(0%) | 0(0%) |
| jackson-databind(ContextAug) vs jackson-databind(ContextAug(rmss)) | 13(100%) | 0(0%) | 0(0%) |
| debezium(ContextAug) vs debezium(ContextAug(rmss)) | 13(100%) | 0(0%) | 0(0%) |
Tab.7  Discussion of ContextAug without covering the faulty statement
  
  
  
  