Word Sense Based Hindi-Tamil Statistical Machine Translation

Author(s): Vimal Kumar K. (Jaypee Institute of Information Technology, India) and Divakar Yadav (Jaypee Institute of Information Technology, India)
Copyright: 2020
Pages: 12
Source title: Natural Language Processing: Concepts, Methodologies, Tools, and Applications
Source Author(s)/Editor(s): Information Resources Management Association (USA)
DOI: 10.4018/978-1-7998-0951-7.ch021

Abstract

Corpus-based natural language processing has achieved great success in recent years. It is used not only for languages such as English, French, Spanish, and Hindi but also widely for languages such as Tamil and Telugu. This paper aims to increase the accuracy of machine translation from Hindi to Tamil by considering each word's sense as well as its part of speech. The system translates word by word from Hindi to Tamil, making use of additional information: the preceding words, the current word's part of speech, and the word's sense. Such a translation system requires the frequency of words occurring in the corpus, the part-of-speech tags of the input words, and the probability of the word preceding each tagged word. WordNet is used to identify the synonyms of each word in the source language. Among these synonyms, the one most relevant to the source word is selected for translation into the target language. Introducing this additional information (part-of-speech tags, preceding-word information, and semantic analysis) has greatly improved the accuracy of the system.
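
The abstract does not give the scoring formula, so the following is only a minimal sketch of how word-by-word candidate selection from corpus frequencies, a part-of-speech tag, and the preceding word might be combined. Every name, the add-one smoothing, and the product of the two probabilities are assumptions made for illustration, and the counts are invented toy values, not figures from the chapter. The romanized Hindi word "sona" (which can mean "gold" or "to sleep") stands in for an ambiguous source word, and SENSE_CANDIDATES is a hypothetical stand-in for a WordNet lookup.

```python
# Hypothetical stand-in for a WordNet lookup:
# source (Hindi) word -> candidate target (Tamil) words, one per sense.
SENSE_CANDIDATES = {"sona": ["thangam", "thoongu"]}  # gold / sleep

# Toy corpus counts (invented): (source word, POS tag, target word) -> frequency.
LEXICAL_COUNTS = {
    ("sona", "NOUN", "thangam"): 8,
    ("sona", "NOUN", "thoongu"): 1,
    ("sona", "VERB", "thangam"): 1,
    ("sona", "VERB", "thoongu"): 9,
}

# Toy corpus counts (invented): (preceding target word, target word) -> frequency.
# "<s>" marks the start of a sentence.
BIGRAM_COUNTS = {
    ("<s>", "thangam"): 3,
    ("<s>", "thoongu"): 2,
}


def lexical_prob(source, pos, target):
    """Relative frequency of `target` among the candidates for (source, pos)."""
    candidates = SENSE_CANDIDATES[source]
    total = sum(LEXICAL_COUNTS.get((source, pos, c), 0) for c in candidates)
    if total == 0:
        return 0.0
    return LEXICAL_COUNTS.get((source, pos, target), 0) / total


def bigram_prob(prev, target, candidates):
    """Add-one smoothed probability of `target` following `prev`."""
    total = sum(BIGRAM_COUNTS.get((prev, c), 0) for c in candidates)
    return (BIGRAM_COUNTS.get((prev, target), 0) + 1) / (total + len(candidates))


def translate_word(source, pos, prev="<s>"):
    """Pick the candidate with the highest combined score.

    Words with no known candidates are passed through unchanged.
    """
    candidates = SENSE_CANDIDATES.get(source, [])
    if not candidates:
        return source
    return max(
        candidates,
        key=lambda t: lexical_prob(source, pos, t) * bigram_prob(prev, t, candidates),
    )


if __name__ == "__main__":
    print(translate_word("sona", "NOUN"))
    print(translate_word("sona", "VERB"))
```

With these toy counts, the part-of-speech tag is what separates the two senses: the noun reading selects the "gold" candidate and the verb reading selects the "sleep" candidate. A real system would estimate the counts from a tagged parallel corpus and might weight the two factors differently.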
