Information Resources Management Association

An Extensive Text Mining Study for the Turkish Language: Author Recognition, Sentiment Analysis, and Text Classification

Author(s): Durmuş Özkan Şahin (Ondokuz Mayıs University, Turkey) and Erdal Kılıç (Ondokuz Mayıs University, Turkey)
Copyright: 2021
Pages: 35
Source title: Natural Language Processing for Global and Local Business
Source Author(s)/Editor(s): Fatih Pinarbasi (Istanbul Medipol University, Turkey) and M. Nurdan Taskiran (Istanbul Medipol University, Turkey)
DOI: 10.4018/978-1-7998-4240-8.ch012

Abstract

In this study, the authors present both theoretical and experimental information about text mining, a subfield of natural language processing. Three text mining problems are addressed for Turkish: news classification, sentiment analysis, and author recognition. The authors aim to reduce the running time and increase the performance of machine learning algorithms. Four machine learning algorithms and two feature selection metrics are used to solve these text classification problems. The classification algorithms are random forest (RF), logistic regression (LR), naive Bayes (NB), and sequential minimal optimization (SMO). Chi-square and information gain are used as the feature selection metrics. The highest classification performance achieved in this study is an F-measure of 0.895, obtained with the SMO classifier and the information gain metric on news classification. The study is significant because it compares the performance of classification algorithms and feature selection methods.
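
As a rough illustration of how a chi-square feature selection metric ranks terms before classification, the sketch below scores each term against one target class using the standard 2x2 contingency form of the statistic. It is not the authors' implementation: the abstract does not state which tools or preprocessing were used, and the function names (chi_square_scores, select_top_k) and the four-document toy corpus are hypothetical, chosen only for illustration.

```python
from collections import Counter


def chi_square_scores(docs, labels, target):
    """Chi-square score of each term for one class (term present/absent x target/other)."""
    n = len(docs)
    n_pos = sum(1 for y in labels if y == target)
    n_neg = n - n_pos
    df_pos, df_neg = Counter(), Counter()
    for tokens, y in zip(docs, labels):
        # Count document frequency, so each term counts once per document.
        (df_pos if y == target else df_neg).update(set(tokens))
    scores = {}
    for term in set(df_pos) | set(df_neg):
        a = df_pos[term]   # target-class documents containing the term
        b = df_neg[term]   # other documents containing the term
        c = n_pos - a      # target-class documents without the term
        d = n_neg - b      # other documents without the term
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def select_top_k(scores, k):
    """Keep the k highest-scoring terms; ties are broken alphabetically."""
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Hypothetical toy corpus (already tokenized); not data from the study.
docs = [
    ["maç", "gol", "takım"],
    ["gol", "hakem"],
    ["borsa", "dolar"],
    ["dolar", "faiz", "borsa"],
]
labels = ["spor", "spor", "ekonomi", "ekonomi"]

scores = chi_square_scores(docs, labels, "spor")
selected = select_top_k(scores, 3)
```

Only the selected terms would then be passed to a classifier, which is how feature selection can shorten running time: the classifier sees a much smaller vocabulary. Terms that occur in every document of one class and none of the other (here "gol", "borsa", "dolar") receive the highest scores.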
