Information Resources Management Association

Text Mining-Machine Learning on Documents

Author(s): Dunja Mladenic (Jozef Stefan Institute, Slovenia)
Copyright: 2005
Pages: 4
Source title: Encyclopedia of Data Warehousing and Mining
Source Author(s)/Editor(s): John Wang (Montclair State University, USA)
DOI: 10.4018/978-1-59140-557-3.ch208


Abstract

The intensive use and growth of the World Wide Web, together with the steadily increasing amount of text available in electronic form, have created a growing need for computer-supported ways of handling text data. One of the most popular problems addressed with text mining methods is document categorization, which aims to classify documents into predefined categories based on their content. Other important problems addressed in text mining include content-based document search, automatic document summarization, automatic document clustering and construction of document hierarchies, document authorship detection, plagiarism identification, topic identification and tracking, information extraction, hypertext analysis, and user profiling. If we accept that text mining is a fairly broad area concerned with the computer-supported analysis of text, then the list of problems it can address is long and open-ended. Here we adopt this open view but concentrate on the parts related to automatic data analysis and data mining.
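
To make the idea of document categorization concrete, the following is a minimal sketch in Python. The abstract does not name a particular algorithm, so the choice of a multinomial Naive Bayes classifier with add-one smoothing is an assumption made purely for illustration, as are the function names (`tokenize`, `train`, `classify`), the two categories, and the tiny training set.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase and split on whitespace. A real system would also
    # handle punctuation, stop words, and stemming.
    return text.lower().split()


def train(labeled_docs):
    """Count documents and words per category from (text, category) pairs."""
    doc_counts = Counter()              # category -> number of documents
    word_counts = defaultdict(Counter)  # category -> word -> frequency
    vocabulary = set()
    for text, category in labeled_docs:
        doc_counts[category] += 1
        for word in tokenize(text):
            word_counts[category][word] += 1
            vocabulary.add(word)
    return doc_counts, word_counts, vocabulary


def classify(text, model):
    """Return the category with the highest log-probability for the text."""
    doc_counts, word_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    best_category, best_score = None, -math.inf
    for category in doc_counts:
        # Log prior: share of training documents in this category.
        score = math.log(doc_counts[category] / total_docs)
        total_words = sum(word_counts[category].values())
        for word in tokenize(text):
            if word not in vocabulary:
                continue  # ignore words never seen during training
            count = word_counts[category][word]
            # Add-one (Laplace) smoothing avoids log(0) for unseen pairs.
            score += math.log((count + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_category, best_score = category, score
    return best_category


# Hypothetical training data, invented for this sketch.
TRAINING = [
    ("the team won the football match", "sports"),
    ("the striker scored a late goal", "sports"),
    ("the parliament passed the budget vote", "politics"),
    ("the minister announced a new election", "politics"),
]

model = train(TRAINING)
print(classify("a late goal in the match", model))
print(classify("the minister lost the vote", model))
```

With this toy data the two calls assign the first text to `sports` and the second to `politics`. A realistic categorizer would need far more training documents and better text preprocessing.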
