Information Resources Management Association
Advancing the Concepts & Practices of Information Resources Management in Modern Organizations
Misplacing the Code: An Examination of Data Quality Issues in Bayesian Text Classification for Automated Coding of Medical Diagnoses
Author(s):
Eitel J.M. Lauria (Marist College, USA) and Alan D. March (Universidad del Salvador, Argentina)
Copyright:
2007
Pages:
3
Source title:
Managing Worldwide Operations and Communications with Information Technology
Source Editor(s):
Mehdi Khosrow-Pour, D.B.A.
(Information Resources Management Association, USA)
DOI:
10.4018/978-1-59904-929-8.ch296
ISBN13:
9781599049298
EISBN13:
9781466665378
Keywords:
Information Science Reference / IT Research & Theory / IT Research and Theory / Library & Information Science
Abstract
In this article, we discuss the effect of dirty data on text mining for the automated coding of medical diagnoses. Using two Bayesian machine learning algorithms (naive Bayes and shrinkage), we build ICD9-CM classification models trained on free-text diagnoses. We investigate the effect of training the classifiers on both clean and (simulated) dirty data. The research focuses on the impact that erroneous labeling of training data sets has on the classifiers’ predictive accuracy.
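The experimental setup described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it covers only a plain multinomial naive Bayes classifier with add-one smoothing (the shrinkage model is omitted), and the diagnosis texts, the code strings used as class labels, the 30% noise rate, and all function names (train_naive_bayes, predict, corrupt_labels, accuracy) are assumptions invented for illustration. The sketch trains one model on clean labels and one on labels that were randomly swapped to simulate miscoding, then scores both on the same held-out examples.

```python
import math
import random
from collections import Counter, defaultdict


def train_naive_bayes(docs, labels):
    """Collect the counts needed by a multinomial naive Bayes model."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the class with the highest posterior log-probability."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)
        # Add-one (Laplace) smoothing over the training vocabulary.
        denom = sum(word_counts[label].values()) + len(vocab)
        for token in text.lower().split():
            if token in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][token] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def corrupt_labels(labels, noise_rate, rng):
    """Simulate dirty data: with probability noise_rate, swap in a wrong label."""
    classes = sorted(set(labels))
    noisy = []
    for label in labels:
        if rng.random() < noise_rate:
            noisy.append(rng.choice([c for c in classes if c != label]))
        else:
            noisy.append(label)
    return noisy


def accuracy(model, docs, labels):
    """Fraction of documents whose predicted label matches the true label."""
    hits = sum(predict(model, d) == y for d, y in zip(docs, labels))
    return hits / len(labels)


# Toy data, invented for illustration; the code strings are only class labels.
train_docs = [
    "type 2 diabetes mellitus",
    "diabetes mellitus without complication",
    "essential hypertension",
    "arterial hypertension unspecified",
    "pneumonia",
    "community acquired pneumonia",
]
train_labels = ["250.00", "250.00", "401.9", "401.9", "486", "486"]
test_docs = [
    "diabetes mellitus type 2",
    "essential arterial hypertension",
    "acquired pneumonia",
]
test_labels = ["250.00", "401.9", "486"]

rng = random.Random(0)  # fixed seed so the simulated noise is repeatable
dirty_labels = corrupt_labels(train_labels, noise_rate=0.3, rng=rng)

clean_model = train_naive_bayes(train_docs, train_labels)
dirty_model = train_naive_bayes(train_docs, dirty_labels)

print("accuracy with clean labels:", accuracy(clean_model, test_docs, test_labels))
print("accuracy with dirty labels:", accuracy(dirty_model, test_docs, test_labels))
```

On a data set this small the gap between the two scores depends heavily on which labels happen to be swapped, so the sketch shows the procedure only and says nothing about the size of the effect reported in the paper.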
Copyright ©2023, Information Resources Management Association. 701 East Chocolate Avenue, Hershey, PA 17033.