A Statistical Framework for the Prediction of Fault-Proneness

Author(s): Yan Ma (West Virginia University, USA), Lan Guo (West Virginia University, USA) and Bojan Cukic (West Virginia University, USA)
Copyright: 2007
Pages: 27
Source title: Advances in Machine Learning Applications in Software Engineering
Source Author(s)/Editor(s): Du Zhang (California State University, USA) and Jeffery J.P. Tsai (University of Illinois at Chicago, USA)
DOI: 10.4018/978-1-59140-941-1.ch010

Keywords: Computational Intelligence / Computer Science & IT / Information Science Reference / Systems and Software Engineering

Abstract

Accurate prediction of fault-prone modules in the software development process enables effective discovery and identification of defects. Such prediction models are especially valuable for large-scale systems, where verification experts need to focus their attention and resources on problem areas in the system under development. This chapter presents a methodology for predicting fault-prone modules using a modified random forests algorithm. Random forests improve classification accuracy by growing an ensemble of trees and letting them vote on the classification decision. We applied the methodology to five NASA public domain defect datasets. These datasets vary in size, but all typically contain a small number of defect samples. If maximizing overall accuracy is the goal, then learning from such imbalanced data usually results in a classifier biased toward the majority (non-defective) class. To obtain better prediction of fault-proneness, two strategies are investigated: a proper sampling technique for constructing the tree classifiers, and threshold adjustment for determining the “winning” class. Both are found to be effective in accurately predicting fault-prone modules. In addition, the chapter presents a thorough and statistically sound comparison of these methods against many other classifiers frequently used in the literature.
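The two strategies named in the abstract can be illustrated with a minimal sketch. It is not the chapter's implementation: the abstract does not say which sampling technique is used, so the balanced bootstrap below (equal draws from each class) is an assumption, and the function names `balanced_bootstrap` and `vote_with_threshold` are hypothetical. Tree training itself is omitted; the sketch only shows how a per-tree training sample might be drawn and how the ensemble's votes might be turned into a decision.

```python
import random


def balanced_bootstrap(labels, rng):
    """Draw bootstrap indices with the same number of samples per class.

    Assumption: each class contributes as many draws (with replacement)
    as the smallest class has members. The chapter may use another scheme.
    """
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    per_class = min(len(indices) for indices in by_class.values())
    sample = []
    for indices in by_class.values():
        sample.extend(rng.choice(indices) for _ in range(per_class))
    return sample


def vote_with_threshold(tree_votes, threshold=0.5):
    """Predict fault-prone (1) when the share of trees voting 1 reaches
    the threshold; otherwise predict not fault-prone (0).

    A threshold of 0.5 is plain majority voting. Lowering it makes the
    ensemble more willing to flag a module as fault-prone.
    """
    share = sum(tree_votes) / len(tree_votes)
    return 1 if share >= threshold else 0


# Hypothetical data: 90 fault-free modules (0) and 10 fault-prone ones (1).
labels = [0] * 90 + [1] * 10
sample = balanced_bootstrap(labels, random.Random(0))

# Hypothetical votes from ten trees for one module: two trees say "fault-prone".
votes = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
majority_decision = vote_with_threshold(votes)        # majority voting
adjusted_decision = vote_with_threshold(votes, 0.15)  # lowered threshold
```

With majority voting the module above is classified as not fault-prone, while the lowered threshold flags it. That is the trade-off threshold adjustment exposes: more fault-prone modules are caught at the cost of more false alarms.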
