Studying reopened bugs in open source software systems

Authors

Tagra, Ankur

Abstract

Reopened bugs can degrade the overall reputation of a software system, since such bugs erode end users' trust in the quality of the software. Understanding the characteristics of reopened bugs, and the factors that make a reopened bug (especially a post-release reopened bug) more likely to be fixed rapidly throughout the release lifecycle, could therefore provide insights that help software developers avoid or minimize such bugs.

In this thesis, we study the characteristics of reopened bugs and the factors that lead to a post-release reopened bug being fixed rapidly or slowly. To understand the characteristics of reopened bugs, prior studies built statistical or machine learning models to analyze the factors that affect the likelihood of a bug being reopened. However, we observe several aspects of prior studies that require further investigation: 1) the previously studied datasets are too small (consisting of only 3 projects); 2) 1 of the 3 studied projects has a data leak issue; and 3) the previously used experimental steps are outdated. After accounting for these aspects, we observe that only 34% of the studied projects achieve an acceptable performance (AUC $\geqslant$ 0.7) for predicting whether a bug will be reopened. Moreover, we observe that post-release reopened bugs take only 189.1 hours of rework time (the time taken to resolve a reopened bug), compared to 388.4 hours for pre-release reopened bugs. To study the likelihood of a post-release reopened bug being fixed rapidly, we build prediction pipelines and observe that the models achieve an acceptable AUC of 0.78 for determining whether a post-release reopened bug will be resolved rapidly or slowly. Our model predicts whether a post-release reopened bug will be resolved rapidly (i.e., in less than 3 minutes) or slowly (i.e., in more than 4,538 hours), treating the fastest 20% of resolved bugs as rapidly resolved and the slowest 20% as slowly resolved.
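
The rapid/slow labeling described above can be illustrated with a minimal sketch. It is not the thesis's actual pipeline: the function name `label_by_rework_time`, the `tail` parameter, and the sample rework times are hypothetical, and the sketch only assumes that bugs are ranked by rework time, with the fastest 20% labeled rapid, the slowest 20% labeled slow, and the middle 60% left unlabeled.

```python
def label_by_rework_time(rework_hours, tail=0.20):
    """Label each bug 'rapid', 'slow', or None (middle bugs are left out).

    rework_hours: rework time per reopened bug, in hours (hypothetical input).
    tail: fraction of bugs placed in each extreme group (0.20 = 20%).
    """
    if not 0 < tail <= 0.5:
        raise ValueError("tail must be in (0, 0.5]")

    n = len(rework_hours)
    k = int(n * tail)  # number of bugs in each extreme group

    # Indices of bugs ordered from fastest to slowest resolution.
    order = sorted(range(n), key=lambda i: rework_hours[i])

    labels = [None] * n
    for i in order[:k]:        # fastest k bugs
        labels[i] = "rapid"
    for i in order[n - k:]:    # slowest k bugs
        labels[i] = "slow"
    return labels


# Hypothetical rework times in hours, for illustration only.
hours = [5, 300, 0.04, 120, 9000, 48, 700, 2, 5000, 60]
labels = label_by_rework_time(hours)
```

Only the bugs that receive a label would then be passed to a binary classifier, whose ranking quality can be summarized with AUC as in the results reported above.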

Based on our findings, we encourage future research to leverage the rich data available during and after the reopening of a bug to understand the eventual resolution of reopened bugs. We also encourage researchers to analyze pre-release and post-release reopened bugs separately, since studying reopened bugs as a whole leads to biased implications.

Keywords

bug reports, reopened bugs, data quality, open source, model interpretation, pre- and post-release bugs
