A Comparison of Revision Schemes for Cleaning Labeling Noise
Abstract
Data quality is an important factor in building effective classifiers. One way to improve data quality is by cleaning labeling noise. Label cleaning can be divided into two stages. The first stage identifies samples with suspicious labels. The second stage processes the suspicious samples using a revision scheme. This chapter examines three such revision schemes: (1) removal of the suspicious samples, (2) automatic replacement of the suspicious labels with what the machine believes to be correct, and (3) escalation of the suspicious samples to a human supervisor for relabeling. Experimental and theoretical analyses show that only escalation is effective when the original labeling noise is very large or very small. Furthermore, for a wide range of situations, removal is better than automatic replacement.
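The two-stage process described above can be sketched as follows. This is a minimal illustration, not the chapter's implementation: the abstract does not say how suspicious samples are identified, so the sketch assumes a sample is suspicious when its given label disagrees with a classifier's prediction. The names `find_suspicious`, `revise`, `predicted`, and `oracle` are hypothetical, and `oracle` stands in for the human supervisor.

```python
from typing import Callable, List, Optional, Sequence, Set, Tuple

Features = Tuple[float, ...]
Sample = Tuple[Features, int]  # (features, given label)


def find_suspicious(samples: Sequence[Sample], predicted: Sequence[int]) -> Set[int]:
    """Stage 1 (assumed rule): flag indices where the given label
    disagrees with the machine's predicted label."""
    return {i for i, (_, y) in enumerate(samples) if y != predicted[i]}


def revise(
    samples: Sequence[Sample],
    suspicious: Set[int],
    scheme: str,
    predicted: Optional[Sequence[int]] = None,
    oracle: Optional[Callable[[Features], int]] = None,
) -> List[Sample]:
    """Stage 2: apply one of the three revision schemes to flagged samples."""
    cleaned: List[Sample] = []
    for i, (x, y) in enumerate(samples):
        if i not in suspicious:
            cleaned.append((x, y))  # trusted sample, keep unchanged
        elif scheme == "remove":
            continue  # (1) drop the suspicious sample
        elif scheme == "replace":
            if predicted is None:
                raise ValueError("'replace' needs predicted labels")
            cleaned.append((x, predicted[i]))  # (2) use the machine's label
        elif scheme == "escalate":
            if oracle is None:
                raise ValueError("'escalate' needs an oracle")
            cleaned.append((x, oracle(x)))  # (3) ask the human supervisor
        else:
            raise ValueError(f"unknown scheme: {scheme}")
    return cleaned


# Toy data: the third sample carries a label the machine disagrees with.
samples: List[Sample] = [((0.0,), 0), ((1.0,), 1), ((0.1,), 1)]
predicted = [0, 1, 0]
suspicious = find_suspicious(samples, predicted)
```

Calling `revise(samples, suspicious, "remove")` shrinks the training set, `"replace"` keeps its size but trusts the machine's prediction, and `"escalate"` keeps its size at the cost of one human query per flagged sample. That trade-off between data loss, machine error, and labeling effort is what the three schemes differ on.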