Anatomizing Lexicon With Natural Language Tokenizer Toolkit 3


Author(s): Simran Kaur Jolly (Manav Rachna International Institute of Research and Studies, India) and Rashmi Agrawal (Manav Rachna International Institute of Research and Studies, India)
Copyright: 2019
Pages: 35
Source title: Extracting Knowledge From Opinion Mining
Source Author(s)/Editor(s): Rashmi Agrawal (Manav Rachna International Institute of Research and Studies, India) and Neha Gupta (Manav Rachna International Institute of Research and Studies, India)
DOI: 10.4018/978-1-5225-6117-0.ch011

Keywords: Data Mining / Data Mining and Databases / Information Science Reference / Library & Information Science


Abstract

NLTK is a toolkit and API platform, built in Python, for working with human natural language. It was created in 2001 as part of a computational linguistics course at the University of Pennsylvania and has been tested and developed ever since. An early release, version 1.4.3 (2005), was compatible with Python 2.4. At the time of writing, the latest release was version 3.2.5 (September 2017), which incorporated features such as Arabic stemmers, NIST evaluation, the MOSES tokenizer, the Stanford segmenter, the Treebank detokenizer, VerbNet, and VADER. The important packages of the system are 1) corpus builder, 2) tokenizer, 3) collocation, 4) tagging, 5) parsing, 6) metrics, and 7) probability distribution system. NLTK was built to meet four primary requirements: 1) Simplicity: a substantive framework of building blocks; 2) Consistency: a consistent interface; 3) Extensibility: the toolkit can be easily extended; and 4) Modularity: all modules are independent of each other.
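To make the package list above concrete, the following is a minimal sketch that touches four of the listed areas (tokenizer, probability distribution, collocation, and metrics). It is not taken from the chapter: the sample sentence and variable names are hypothetical, and it assumes NLTK 3.x is installed. It uses only components that need no additional corpus or model downloads.

```python
# Illustrative sketch, not from the chapter; the sentence and names are hypothetical.
from nltk.tokenize import TreebankWordTokenizer
from nltk.probability import FreqDist
from nltk.collocations import BigramCollocationFinder
from nltk.metrics.distance import edit_distance

sentence = "The tokenizer feeds the tagger, and the tagger feeds the parser."

# Tokenizer: rule-based Penn Treebank tokenization of a single sentence.
tokens = TreebankWordTokenizer().tokenize(sentence)

# Probability: frequency distribution over lowercased tokens.
lowered = [t.lower() for t in tokens]
freq = FreqDist(lowered)

# Collocation: count adjacent word pairs (bigrams) in the token stream.
finder = BigramCollocationFinder.from_words(lowered)

# Metrics: Levenshtein edit distance between two spellings.
distance = edit_distance("tokenizer", "tokeniser")

print(tokens)
print(freq.most_common(3))
print(finder.ngram_fd[("the", "tagger")])
print(distance)
```

Each step imports from a separate subpackage and consumes only plain Python lists or strings, which is one way to see the modularity and consistent-interface requirements in practice.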
