With the abundance of textual information on the web, text and web mining are increasingly important. Although search technologies have matured and finding relevant documents or web pages is no longer difficult, information overload has never ceased to be a roadblock for users. More advanced text applications are needed to bring out novel and useful information or knowledge hidden in the sea of documents. The purpose of this handbook is to present the most recent advances and a survey of applications in text and web mining, which should be of interest to researchers and end users alike. With that in mind, we invited submissions to the Handbook of Research in Text and Web Mining. Based on the content, we organized this handbook into five sections, which represent the major topic areas in text and web mining.
Section titles and their highlights:
Section 1, Document Preprocessing, concerns the steps for obtaining key textual elements and their weights before mining occurs. This section covers the operations that prepare text for the next step, including lexical analysis, elimination of function words, stemming, identification of key terms and phrases, and document representation.
Section 2, Classification and Clustering, discusses two popular mining methods and their applications in text and web mining. In this section, we present state-of-the-art classification and clustering techniques applied to several interesting problem domains, such as syllabus, protein, and image classification.
Section 3, Database, Ontology and the Web, presents topics relating to three types of objects and their use in the mining process, either as data or as supplemental information to improve mining performance. This section presents a variety of research issues and problems associated with databases, ontologies, and the web from a text and web mining perspective.
Section 4, Information Retrieval and Extraction, illustrates how mining techniques can be used to enhance the performance of information retrieval and extraction. This section shows how text and web mining techniques contribute to resolving difficult problems in information retrieval and extraction.
Section 5, Applications and Survey, concludes the book with surveys of the latest research and end-user applications.
All chapters open with an overview and conclude with references and author biographies. Readers of this handbook will benefit from a basic understanding of natural language processing and college-level statistics, since we consider both subjects the foundation of text and web mining. Research in this area is developing rapidly. This handbook is therefore by no means the ultimate research report on what text and web mining can achieve. It is hoped that this handbook will serve as a catalyst for innovative ideas and thus make exciting research in this and complementary research areas fruitful in the near future.