Bayesian Belief Networks for Data Cleaning
Author(s): Enrico Fagiuoli (Università degli Studi di Milano-Bicocca, Italy), Sara Omerino (ETNOTEAM S.p.A., Italy) and Fabio Stella (Università degli Studi di Milano-Bicocca, Italy)
Copyright: 2008
Pages: 16
Source title: Mathematical Methods for Knowledge Discovery and Data Mining
Source Author(s)/Editor(s): Giovanni Felici (Consiglio Nazionale delle Ricerche, Italy) and Carlo Vercellis (Politecnico di Milano, Italy)
DOI: 10.4018/978-1-59904-528-3.ch012
Abstract
The importance of data cleaning and data quality is becoming increasingly clear, as evidenced by the surge in software, tools, consulting companies and seminars addressing data quality issues. In this contribution the authors describe how Bayesian computational techniques can be exploited for data cleaning, with the aim of reducing the time needed to clean and understand the data. The proposed approach relies on the Bayesian belief network, a general statistical model that allows the efficient description and treatment of joint probability distributions. This work describes the conceptual framework that maps the Bayesian belief network to some of the most difficult tasks in data cleaning, namely imputing missing values, completing truncated datasets and detecting outliers. The proposed framework is supported by a set of numerical experiments performed with the Bayesian belief network programming suite HUGIN.
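To make the idea of using a joint distribution for cleaning concrete, the sketch below imputes a missing field and scores a suspicious one on a toy three-variable network. Everything in it is an assumption made for illustration: the variables (rain, sprinkler, wet), the probability values and the function names are hypothetical, and inference is done by brute-force enumeration in plain Python rather than with HUGIN. It is not the authors' framework, only a minimal instance of the general technique.

```python
from itertools import product

# Hypothetical toy network (not from the chapter): rain -> wet <- sprinkler.
DOMAINS = {"rain": [0, 1], "sprinkler": [0, 1], "wet": [0, 1]}
PARENTS = {"rain": (), "sprinkler": (), "wet": ("rain", "sprinkler")}
# CPT[var][parent values] = P(var = 1 | parents); the numbers are invented.
CPT = {
    "rain": {(): 0.2},
    "sprinkler": {(): 0.3},
    "wet": {(0, 0): 0.05, (0, 1): 0.8, (1, 0): 0.9, (1, 1): 0.99},
}


def joint(assign):
    """Joint probability of a full assignment: the product of the local CPT entries."""
    p = 1.0
    for var, parents in PARENTS.items():
        p_one = CPT[var][tuple(assign[q] for q in parents)]
        p *= p_one if assign[var] == 1 else 1.0 - p_one
    return p


def posterior(record, target):
    """P(target | observed fields of record), by enumerating the other missing fields."""
    observed = {k: v for k, v in record.items() if v is not None and k != target}
    hidden = [v for v in DOMAINS if v not in observed and v != target]
    scores = {}
    for t in DOMAINS[target]:
        total = 0.0
        for values in product(*(DOMAINS[h] for h in hidden)):
            assign = dict(observed)
            assign.update(zip(hidden, values))
            assign[target] = t
            total += joint(assign)
        scores[t] = total
    z = sum(scores.values())
    return {t: s / z for t, s in scores.items()}


def impute(record):
    """Fill each missing field (None) with its most probable value."""
    filled = dict(record)
    for var, value in record.items():
        if value is None:
            dist = posterior(record, var)
            filled[var] = max(dist, key=dist.get)
    return filled


def support(record, var):
    """Probability of the recorded value of var given the other fields; low values suggest an outlier."""
    return posterior(record, var)[record[var]]


incomplete = {"rain": None, "sprinkler": 0, "wet": 1}
suspicious = {"rain": 0, "sprinkler": 0, "wet": 1}
print(impute(incomplete))
print(support(suspicious, "wet"))
```

Enumeration grows exponentially with the number of unobserved variables, so it only suits very small networks; dedicated tools use more efficient exact or approximate inference. A cut-off on `support` for flagging outliers would have to be chosen for the data at hand.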