Distributed Data Mining and its Applications to Intelligent Textual Information Processing

Author(s): Shibin Qiu (University of New Mexico, USA) and Mei Qiu (Emcore Corporation, USA)
Copyright: 2004
Pages: 5
Source title: Innovations Through Information Technology
Source Editor(s): Mehdi Khosrow-Pour, D.B.A. (Information Resources Management Association, USA)
DOI: 10.4018/978-1-59140-261-9.ch093
ISBN13: 9781616921255
EISBN13: 9781466665347


Textual information processing is of fundamental importance because of the massive volume of documents, especially online text, that must be processed every day. In this paper, we study data mining techniques for intelligent textual information processing in distributed environments, including text classification, information extraction (IE), and topic detection and tracking (TDT). These techniques can improve the quality and efficiency of information resource management and utilization. Their statistical models and computational algorithms pose challenges for research in data mining and in distributed and parallel computing. When successfully applied, they benefit applications in IT, digital libraries, and information retrieval. Specifically, we study distributed implementations of the following algorithms: a naïve Bayes classifier combined with expectation-maximization (EM) for text classification, a hidden Markov model for information extraction, and deterministic annealing with EM for topic detection and tracking. We also study the performance of the proposed algorithms and evaluate the improvements experimentally.
