Selection of an Optimal Set of Features for Bengali Character Recognition

Author(s): Hasan Sarwar (United International University, Bangladesh), Mizanur Rahman (Institute of Science and Technology (IST), Bangladesh), Nasreen Akter (St. Francis Xavier University, Canada), Saima Hossain (LEADS Corporation Limited, Bangladesh), Sabrina Ahmed (Local Government Engineering Department (LGED), Bangladesh) and Chowdhury Mofizur Rahman (United International University, Bangladesh)
Copyright: 2013
Pages: 21
Source title: Technical Challenges and Design Issues in Bangla Language Processing
Source Author(s)/Editor(s): M. A. Karim (Old Dominion University, USA), M. Kaykobad (Bangladesh University of Engineering & Technology, Bangladesh) and M. Murshed (Monash University, Australia)
DOI: 10.4018/978-1-4666-3970-6.ch005

Keywords: Artificial Intelligence / Computational Linguistics / Computer Science & IT / Information Science Reference

Abstract

Feature extraction is an essential step in Optical Character Recognition (OCR). Accurate and distinguishable features play a significant role in improving the performance of a classifier. The complexity of feature identification algorithms differs across the alphabet sets of different languages. Beyond the generic task of finding features in any alphabet set, these algorithms must handle the characteristics particular to a given alphabet set. The dominant features of one alphabet set might differ completely from those of another. Because inaccurate features may lead to poor recognition, special attention should be given to identifying the optimal set of features for a character set. Bengali characters raise some specific issues beyond those shared with other character sets. For example, the script has about 300 basic, modified, and compound character shapes, the characters in a word are topologically connected, and Bengali is an inflectional language. A literature survey shows that previous authors have used a variety of features and classification algorithms. The authors of this chapter have extensively reviewed these feature sets and propose variability analysis to identify an optimal feature set. They focus on the peculiarities of the Bengali alphabet: the different uses of characters as vowel and consonant signs, and compound, complex, and touching characters. The authors also took care to choose easily computable features that take little time to generate. However, more attention still needs to be given to choosing an efficient classifier.
