Contributed by: Shravya Nagineni
I am Shravya Nagineni, a detail-oriented software consultant with eight years of experience helping developers connect with users to produce solutions suited to client needs. Comprehensive knowledge of the software development life cycle and expertise in generating insights from data are crucial in my role. I pursued the PGP DSBA at Great Learning to sharpen my analytical skills and find opportunities to apply them in my current role. The course has not disappointed me, and I have been able to apply my data science learnings at work.
I am a QA Lead working with a global leader in sensor, software, and autonomous technologies. Identifying and handling bugs in the systems, doing root cause analysis, and performing quality reviews is business-as-usual work for me. A bug in the source code leads to an unintended state within the executing software, and this corrupted state eventually results in undesired external behavior. This is logged as a bug report in a change tracking system, often many months after the bug was first injected into the software.
Most of the time, I used to identify these bugs manually and assign them to categories based on my expertise and experience. This time, though, I decided to leverage my data science expertise. I developed a classification model that predicts whether source code is clean or buggy.
Initially, I performed EDA and used visualizations to display the proportion of bugs and their relationship with different attributes, such as recent change vs. old, minor vs. major, and client vs. internal team. This also helped with root cause analysis and generated insights into the possible reasons a bug was introduced. I then developed a decision tree classifier and pruned it to classify changes into three categories: Clean, Low Buggy, and High Buggy. In my case, it was important that the recall of a specific category (High Buggy) be high, so I modified the threshold probability of that category to capture High Buggy changes with maximum recall.
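The pruned decision tree and the recall-oriented threshold adjustment can be sketched as below. This is a minimal illustration with scikit-learn on synthetic data; the feature set, label encoding, pruning strength, and threshold value are all assumptions, not the actual pipeline.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import recall_score

# Synthetic stand-in for change-level features (e.g. change size, recency).
# Labels: 0 = Clean, 1 = Low Buggy, 2 = High Buggy (hypothetical encoding).
X, y = make_classification(n_samples=1000, n_classes=3, n_informative=6,
                           weights=[0.6, 0.25, 0.15], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=42)

# Prune the tree with cost-complexity pruning (ccp_alpha) to limit overfitting.
clf = DecisionTreeClassifier(ccp_alpha=0.005, random_state=42)
clf.fit(X_train, y_train)

# Lower the probability threshold for the High Buggy class so more changes
# are flagged, trading some precision for higher recall on that category.
HIGH_BUGGY, THRESHOLD = 2, 0.3  # illustrative threshold, tuned in practice
proba = clf.predict_proba(X_test)
pred = np.where(proba[:, HIGH_BUGGY] >= THRESHOLD,
                HIGH_BUGGY, proba.argmax(axis=1))

recall_high = recall_score(y_test, pred, labels=[HIGH_BUGGY], average=None)[0]
print(f"High Buggy recall: {recall_high:.2f}")
```

Because every sample whose highest-probability class is already High Buggy clears the lowered threshold, this adjustment can only keep or increase High Buggy recall relative to the default argmax prediction.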
The tool accurately predicted whether a change was buggy or clean immediately after it was made, allowing developers to fix introduced bugs right away. This reduced both the time required to find software bugs and the time bugs stay resident in software before removal. The model achieved 80% accuracy, with a recall of 90% for the High Buggy category, which was very good for our use case.
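For reference, metrics like those quoted above can be computed with scikit-learn's standard scoring functions. The labels below are a small made-up example, not the project's actual predictions:

```python
from sklearn.metrics import accuracy_score, recall_score

# Toy ground truth and predictions for the three categories.
y_true = ["Clean", "Clean", "High Buggy", "Low Buggy", "High Buggy",
          "Clean", "Low Buggy", "High Buggy", "Clean", "High Buggy"]
y_pred = ["Clean", "Low Buggy", "High Buggy", "Low Buggy", "High Buggy",
          "Clean", "Clean", "High Buggy", "Clean", "High Buggy"]

acc = accuracy_score(y_true, y_pred)                     # overall accuracy
recall_high = recall_score(y_true, y_pred,
                           labels=["High Buggy"],        # per-class recall
                           average=None)[0]
print(f"accuracy={acc:.0%}, High Buggy recall={recall_high:.0%}")
```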
Impact: The tool helped achieve a 50% reduction in bugs and an 80% reduction in the average turnaround time (TAT) for identifying bugs by enabling detailed impact analysis. It has been submitted for Star Project of the Quarter in my current organization.
If you wish to learn such concepts and implement them at work, enroll in Great Learning’s Data Science and Business Analytics Course and upskill today.