Distributional Semantic Model Based on Convolutional Neural Network for Arabic Textual Similarity

Author(s): Adnen Mahmoud (Higher Institute of Computer Science and Communication Techniques, Monastir, Tunisia) and Mounir Zrigui (Faculty of Science Monastir, Monastir, Tunisia)
Copyright: 2020
Volume: 14
Issue: 1
Pages: 16
Source title: International Journal of Cognitive Informatics and Natural Intelligence (IJCINI)
Editor(s)-in-Chief: Kangshun Li (South China Agricultural University, China)
DOI: 10.4018/IJCINI.2020010103

Abstract

The problem addressed is to develop a model that can reliably identify whether a previously unseen pair of documents is a paraphrase or not. Paraphrase detection in Arabic documents is challenging because of the variability of features and the lack of publicly available corpora. To address these problems, the authors propose a semantic approach. At the feature extraction level, the authors use the global vectors (GloVe) representation, which combines global co-occurrence counts with a contextual skip-gram model. At the paraphrase identification level, the authors apply a convolutional neural network model to learn more contextual and semantic information between documents. For the experiments, the authors use Open Source Arabic Corpora as the source corpus and collect different datasets to create a vocabulary model. To construct the paraphrased corpus, the authors replace each word in the source corpus with its most similar word of the same grammatical class, using the word2vec algorithm and part-of-speech annotation. Experiments show that the model achieves promising results in terms of precision and recall compared to existing approaches in the literature.
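
The paraphrased-corpus construction step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The embedding table, the part-of-speech tags, the English example words, and the function names (`cosine`, `most_similar_same_pos`, `paraphrase`) are all hypothetical stand-ins chosen for readability. In the described setting the vectors would come from a trained word2vec model and the tags from an Arabic part-of-speech tagger.

```python
import math

# Toy embedding table: illustrative values only, NOT taken from the paper.
# In practice these vectors would come from a trained word2vec model.
EMBEDDINGS = {
    "big":   [0.90, 0.10, 0.00],
    "large": [0.85, 0.15, 0.05],
    "grow":  [0.90, 0.10, 0.01],  # nearly identical to "big", but a verb
    "house": [0.10, 0.90, 0.20],
    "home":  [0.12, 0.88, 0.25],
}

# Toy part-of-speech tags (assumed; a real tagger would supply these).
POS = {
    "big": "ADJ", "large": "ADJ", "grow": "VERB",
    "house": "NOUN", "home": "NOUN",
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_similar_same_pos(word):
    """Return the nearest neighbour of `word` that shares its POS tag.

    Words without an embedding, a tag, or a same-class candidate
    are returned unchanged.
    """
    if word not in EMBEDDINGS or word not in POS:
        return word
    candidates = [
        w for w in EMBEDDINGS
        if w != word and POS.get(w) == POS[word]
    ]
    if not candidates:
        return word
    return max(candidates, key=lambda w: cosine(EMBEDDINGS[word], EMBEDDINGS[w]))


def paraphrase(tokens):
    """Replace every token by its most similar same-class word."""
    return [most_similar_same_pos(t) for t in tokens]


if __name__ == "__main__":
    print(paraphrase(["the", "big", "house"]))
```

With these toy values, `paraphrase(["the", "big", "house"])` returns `["the", "large", "home"]`. The verb "grow" is the closest vector to "big", but the grammatical-class filter rejects it. This is the reason for combining similarity search with part-of-speech annotation.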
