IRMA-International.org: Creator of Knowledge
Information Resources Management Association
Advancing the Concepts & Practices of Information Resources Management in Modern Organizations

An Efficient Algorithm for Data Cleaning

Author(s): Payal Pahwa (Guru Gobind Singh IndraPrastha University, India), Rajiv Arora (Guru Gobind Singh IndraPrastha University, India) and Garima Thakur (Guru Gobind Singh IndraPrastha University, India)
Copyright: 2013
Pages: 16
Source title: Intelligence Methods and Systems Advancements for Knowledge-Based Business
Source Author(s)/Editor(s): John Wang (Montclair State University, USA)
DOI: 10.4018/978-1-4666-1873-2.ch017

Abstract

The quality of real-world data fed into a data warehouse is a major concern today. Because the data comes from a variety of sources, it must be checked for errors and anomalies before it is loaded into the data warehouse. The source data may contain exact duplicate records or approximate duplicate records. The presence of incorrect or inconsistent data can significantly distort the results of analyses, often negating the potential benefits of information-driven approaches. This paper addresses issues related to the detection and correction of such duplicate records. It also analyzes data quality and the various factors that degrade it. Existing work is briefly analyzed and its major limitations are pointed out. A new framework is then proposed that improves on the existing technique.
