On the Usage of Structural Information in Constrained Semi-Supervised Clustering of XML Documents
Author(s): Eduardo Bezerra (CEFET/RJ, Federal Center of Technological Education CSF, Brazil), Geraldo Xexéo (Programa de Sistemas, COPPE, UFRJ, Institute of Mathematics, UFRJ, Brazil) and Marta Mattoso (Programa de Sistemas, COPPE/UFRJ, Brazil)
Copyright: 2008
Pages: 20
Source title: Successes and New Directions in Data Mining
Source Author(s)/Editor(s): Pascal Poncelet (Ecole des Mines d'Ales, France), Florent Masseglia (Project AxIS-INRIA, France) and Maguelonne Teisseire (Universite Montpellier, France)
DOI: 10.4018/978-1-59904-645-7.ch004
Abstract
In this chapter, we consider the problem of constrained clustering of documents. We focus on documents that present some form of structural information, in which prior knowledge is provided. Such structured data can guide the algorithm to a better clustering model. We consider the existence of a particular form of information to be clustered: textual documents that present a logical structure represented in XML format. Based on this consideration, we present algorithms that take advantage of XML metadata (structural information), thus improving the quality of the generated clustering models. This chapter also addresses the problem of inconsistent constraints and defines algorithms that eliminate inconsistencies, also based on the existence of structural information associated with the XML document collection.