ICSE 2019
Sat 25 - Fri 31 May 2019 Montreal, QC, Canada
Fri 31 May 2019 12:00 - 12:10 at Laurier - Defect Prediction Chair(s): Burak Turhan

Software defect data sets are typically characterized by an unbalanced class distribution in which defective modules are fewer than non-defective modules. The prediction performance of defect prediction models is detrimentally affected by the skewed distribution of the faulty minority modules, since most algorithms assume that the two classes in the data set are balanced. Resampling approaches address this concern by modifying the class distribution to balance the minority and majority classes. However, very little is known about the best distribution for attaining high performance, especially in a more practical scenario. Results remain inconclusive regarding the suitable ratio of defective to clean instances (Pfp), the statistical and practical impacts of resampling approaches on prediction performance, and which resampling approach is more stable across several performance measures. To assess the impact of resampling approaches, we investigated the bias and effect of commonly used resampling approaches on prediction accuracy in software defect prediction. We analyzed six resampling approaches on 40 releases of 20 open-source projects across five performance measures and five imbalance rates. The experimental results indicate that there were statistical differences between the prediction results with and without resampling methods when evaluated with the geometric-mean, recall (pd), probability of false alarm (pf) and balance performance measures. However, resampling methods could not improve the AUC values across all prediction models, implying that resampling methods can help in defect classification but not in defect prioritization. The stable Pfp rate depended on the performance measure used: lower Pfp rates are required for lower pf values, whilst higher Pfp rates are required for higher pd values. Random Under-Sampling and Borderline-SMOTE proved to be the more stable resampling methods across several performance measures among the studied resampling methods. The performance of resampling methods is dependent on the imbalance ratio, the evaluation measure and, to some extent, the prediction model. Newer oversampling methods should aim at generating relevant and informative data samples and not just at increasing the number of minority samples.
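
To make the measures named in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the definitions commonly used in the defect prediction literature (pd as recall, pf as the false alarm rate, geometric-mean as sqrt(pd * (1 - pf)), and balance as one minus the normalized distance to the ideal point pf = 0, pd = 1), and it treats Pfp as the fraction of defective instances after resampling. The function names `defect_measures` and `random_under_sample` are hypothetical.

```python
import math
import random


def defect_measures(y_true, y_pred):
    """Compute pd, pf, g-mean and balance from binary labels (1 = defective).

    The formulas are the commonly used definitions, assumed here rather
    than taken from the abstract itself.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

    pd = tp / (tp + fn) if (tp + fn) else 0.0  # recall on defective modules
    pf = fp / (fp + tn) if (fp + tn) else 0.0  # false alarms on clean modules
    g_mean = math.sqrt(pd * (1.0 - pf))
    # Distance from the ideal point (pf = 0, pd = 1), scaled into [0, 1].
    balance = 1.0 - math.sqrt(pf ** 2 + (1.0 - pd) ** 2) / math.sqrt(2.0)
    return {"pd": pd, "pf": pf, "g_mean": g_mean, "balance": balance}


def random_under_sample(rows, labels, target_pfp, seed=0):
    """Randomly drop clean rows until defective rows form target_pfp of the data.

    target_pfp is assumed to be a fraction in (0, 1]. If the data set is
    already at or above that fraction, all rows are kept.
    """
    if not 0.0 < target_pfp <= 1.0:
        raise ValueError("target_pfp must be in (0, 1]")
    rng = random.Random(seed)
    defective = [i for i, y in enumerate(labels) if y == 1]
    clean = [i for i, y in enumerate(labels) if y == 0]
    # Solve defective / (defective + keep_clean) = target_pfp for keep_clean.
    wanted = round(len(defective) * (1.0 - target_pfp) / target_pfp)
    keep_clean = min(len(clean), wanted)
    kept = sorted(defective + rng.sample(clean, keep_clean))
    return [rows[i] for i in kept], [labels[i] for i in kept]


if __name__ == "__main__":
    labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
    rows = list(range(len(labels)))
    _, new_labels = random_under_sample(rows, labels, target_pfp=0.5)
    print(sum(new_labels) / len(new_labels))
    print(defect_measures([1, 1, 0, 0, 0, 0], [1, 0, 1, 0, 0, 0]))
```

In this sketch a higher target Pfp removes more clean modules from the training data. That is the kind of setting the abstract links to higher pd values, whereas it associates lower Pfp rates with lower pf values.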

Fri 31 May

Displayed time zone: Eastern Time (US & Canada)

11:00 - 12:30
11:00
10m
Talk
Perceptions, Expectations, and Challenges in Defect Prediction
Journal-First Papers
Zhiyuan Wan Zhejiang University, Xin Xia Monash University, Ahmed E. Hassan Queen's University, David Lo Singapore Management University, Jianwei Yin, Xiaohu Yang
11:10
20m
Talk
Mining Software Defects: Should We Consider Affected Releases? (Artifacts Available, Artifacts Evaluated Reusable)
Technical Track
Suraj Yatish The University of Adelaide, Jirayus Jiarpakdee Monash University, Patanamon Thongtanunam The University of Melbourne, Kla Tantithamthavorn Monash University, Australia
11:30
20m
Talk
Class Imbalance Evolution and Verification Latency in Just-in-Time Software Defect Prediction
Technical Track
George Cabral University of Birmingham, Leandro Minku, Emad Shihab Concordia University, Suhaib Mujahid Concordia University
11:50
10m
Talk
The Impact of Class Rebalancing Techniques on the Performance and Interpretation of Defect Prediction Models
Journal-First Papers
Kla Tantithamthavorn Monash University, Australia, Ahmed E. Hassan Queen's University, Kenichi Matsumoto Nara Institute of Science and Technology
12:00
10m
Talk
On the Relative Value of Data Resampling Approaches for Software Defect Prediction
Journal-First Papers
Kwabena E. Bennin Blekinge Institute of Technology, SERL Sweden, Jacky Keung, Akito Monden
12:10
10m
Talk
Energy-Based Anomaly Detection: A New Perspective for Predicting Software Failures (NIER Distinguished Paper Award)
New Ideas and Emerging Results
Cristina Monni Università della Svizzera Italiana, Mauro Pezze Università della Svizzera italiana (USI) (Switzerland) and Università degli Studi di Milano Bicocca (Italy)
12:20
10m
Talk
Discussion Period
Papers