Chinese POS Disambiguation and Unknown Word Guessing with Lexicalized HMMs

Author(s): Guohong Fu (The University of Hong Kong, Hong Kong) and Kang-Kwong Luke (The University of Hong Kong, Hong Kong)
Copyright: 2009
Pages: 13
Source title: Human Computer Interaction: Concepts, Methodologies, Tools, and Applications
Source Author(s)/Editor(s): Chee Siang Ang (City University of London, UK) and Panayiotis Zaphiris (City University of London, UK)
DOI: 10.4018/978-1-87828-991-9.ch101

Abstract

This article presents a lexicalized HMM-based approach to Chinese part-of-speech (POS) disambiguation and unknown word guessing (UWG). To exploit word-internal morphological features for Chinese POS tagging, four types of pattern tags are defined to indicate how lexicon words are used in a segmented sentence. These pattern tags are then combined with POS tags to form hybrid tags, so that Chinese POS disambiguation and UWG can be unified as a single task: assigning a proper hybrid tag to each known word in the input. A uniformly lexicalized HMM-based tagger is also developed to perform this task; it incorporates both internal word-formation patterns and surrounding contextual information within the HMM framework. Experiments on the Peking University Corpus indicate that the proposed approach improves tagging precision efficiently.
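
The idea of combining pattern tags with POS tags can be illustrated with a minimal sketch. The abstract does not name the four pattern tags or give the hybrid-tag format, so the names `ISW`, `BOW`, `MOW`, `EOW` (a lexicon word used alone, or at the beginning, middle, or end of a larger word) and the `POS-PATTERN` string format are assumptions made for illustration only, as are the function names. The sketch shows how a word split into known lexicon pieces might receive hybrid tags, and how a hybrid-tagged sequence could be merged back into words with POS labels; it is not the authors' tagger and contains no HMM decoding.

```python
from typing import List, Tuple

# Hypothetical pattern-tag names; the abstract only says there are four types.
ISW, BOW, MOW, EOW = "ISW", "BOW", "MOW", "EOW"


def hybrid_tags(pieces: List[str], pos: str) -> List[Tuple[str, str]]:
    """Pair each known lexicon piece of ONE word with a 'POS-PATTERN' hybrid tag."""
    if not pieces:
        return []
    if len(pieces) == 1:
        # The word is itself a lexicon word used independently.
        return [(pieces[0], f"{pos}-{ISW}")]
    tagged = [(pieces[0], f"{pos}-{BOW}")]
    tagged += [(p, f"{pos}-{MOW}") for p in pieces[1:-1]]
    tagged.append((pieces[-1], f"{pos}-{EOW}"))
    return tagged


def merge_hybrid(tagged: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Recover (word, POS) pairs from a well-formed hybrid-tagged sequence."""
    words: List[Tuple[str, str]] = []
    buffer = ""
    for piece, tag in tagged:
        pos, pattern = tag.rsplit("-", 1)
        buffer += piece
        if pattern in (ISW, EOW):
            # A word ends here; its POS is carried by the hybrid tag.
            words.append((buffer, pos))
            buffer = ""
    return words
```

Under these assumptions, a known word yields a single `ISW` tag, while an unknown word made of several lexicon pieces yields a `BOW ... EOW` run, so a single sequence tagger over hybrid tags would cover both disambiguation and guessing.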
