
A Comparison of Revision Schemes for Cleaning Labeling Noise

Author(s): Chuck P. Lam (Lama Solutions LLC., USA) and David G. Stork (Ricoh Innovations, Inc., USA)
Copyright: 2008
Pages: 13
Source title: Mathematical Methods for Knowledge Discovery and Data Mining
Source Author(s)/Editor(s): Giovanni Felici (Consiglio Nazionale delle Ricerche, Italy) and Carlo Vercellis (Politecnico di Milano, Italy)
DOI: 10.4018/978-1-59904-528-3.ch013

Abstract

Data quality is an important factor in building effective classifiers. One way to improve data quality is by cleaning labeling noise. Label cleaning can be divided into two stages. The first stage identifies samples with suspicious labels. The second stage processes the suspicious samples using some revision scheme. This chapter examines three such revision schemes: (1) removal of the suspicious samples, (2) automatic replacement of the suspicious labels with what the machine believes to be correct, and (3) escalation of the suspicious samples to a human supervisor for relabeling. Experimental and theoretical analyses show that only escalation is effective when the original labeling noise is very large or very small. Furthermore, for a wide range of situations, removal is better than automatic replacement.
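
The two-stage process and the three revision schemes described in the abstract can be illustrated with a minimal Python sketch. It is not the chapter's implementation. The function names (`flag_suspicious`, `revise`), the scheme strings, and the `oracle` callback standing in for the human supervisor are all hypothetical. The stage-one rule used here, flagging a sample when its given label disagrees with a classifier's prediction, is an assumption, since the abstract does not say how suspicious labels are identified.

```python
from typing import Callable, Hashable, List, Optional, Sequence, Tuple

Label = Hashable


def flag_suspicious(labels: Sequence[Label], predicted: Sequence[Label]) -> List[int]:
    """Stage 1 (assumed rule): a label is suspicious when it disagrees
    with the machine's predicted label for the same sample."""
    return [i for i, (y, p) in enumerate(zip(labels, predicted)) if y != p]


def revise(
    samples: Sequence[object],
    labels: Sequence[Label],
    predicted: Sequence[Label],
    scheme: str,
    oracle: Optional[Callable[[object], Label]] = None,
) -> Tuple[List[object], List[Label]]:
    """Stage 2: apply one revision scheme to the suspicious samples.

    scheme is one of "remove", "replace", or "escalate" (hypothetical names).
    oracle stands in for the human supervisor and is only used by "escalate".
    """
    suspicious = set(flag_suspicious(labels, predicted))

    if scheme == "remove":
        # Scheme (1): drop every suspicious sample from the training set.
        kept = [i for i in range(len(labels)) if i not in suspicious]
        return [samples[i] for i in kept], [labels[i] for i in kept]

    if scheme == "replace":
        # Scheme (2): overwrite each suspicious label with the machine's prediction.
        new_labels = [predicted[i] if i in suspicious else y
                      for i, y in enumerate(labels)]
        return list(samples), new_labels

    if scheme == "escalate":
        # Scheme (3): ask the human supervisor to relabel each suspicious sample.
        if oracle is None:
            raise ValueError("escalate requires an oracle callback")
        new_labels = [oracle(samples[i]) if i in suspicious else y
                      for i, y in enumerate(labels)]
        return list(samples), new_labels

    raise ValueError(f"unknown scheme: {scheme!r}")
```

In this sketch, removal shrinks the training set, while replacement and escalation keep its size and differ only in who supplies the new label: the classifier or the supervisor.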
