Anatomizing Lexicon With Natural Language Tokenizer Toolkit 3
Author(s): Simran Kaur Jolly (Manav Rachna International Institute of Research and Studies, India) and Rashmi Agrawal (Manav Rachna International Institute of Research and Studies, India)
Copyright: 2019
Pages: 35
Source title: Extracting Knowledge From Opinion Mining
Source Author(s)/Editor(s): Rashmi Agrawal (Manav Rachna International Institute of Research and Studies, India) and Neha Gupta (Manav Rachna International Institute of Research and Studies, India)
DOI: 10.4018/978-1-5225-6117-0.ch011
Abstract
The NLTK toolkit is an API platform, built in Python, for writing programs that work with natural (human) language. NLTK was created in 2001 as part of the Computational Linguistics Department at the University of Pennsylvania, and it has been tested and developed ever since. The first release, version 1.4.3, came out in 2005 and was compatible with Python 2.4. The latest version at the time of writing, NLTK 3.2.5 (September 2017), incorporated features such as Arabic stemmers, NIST evaluation, the MOSES tokenizer, the Stanford segmenter, a Treebank detokenizer, VerbNet, and VADER. The important packages of this system are 1) corpus builder, 2) tokenizer, 3) collocation, 4) tagging, 5) parsing, 6) metrics, and 7) probability distribution system. NLTK was built to meet four primary requirements: 1) Simplicity: a substantive framework of building blocks; 2) Consistency: a consistent interface; 3) Extensibility: the toolkit can be easily scaled; and 4) Modularity: all modules are independent of each other.
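As a rough illustration of how a few of the packages named above fit together, the following minimal sketch runs a sentence through a tokenizer, a frequency distribution, a collocation finder, and a distance metric. It is not taken from the chapter: the sample sentence and variable names are assumptions chosen for illustration, and only components that need no downloaded corpora or models are used, so tagging and parsing are left out.

```python
from nltk.tokenize import wordpunct_tokenize
from nltk.probability import FreqDist
from nltk.collocations import BigramCollocationFinder
from nltk.metrics import BigramAssocMeasures, edit_distance

# Hypothetical sample text, chosen only for this sketch.
text = "Natural language processing makes natural language data usable."

# Tokenizer: a regex-based tokenizer that splits words from punctuation.
tokens = wordpunct_tokenize(text)

# Keep alphabetic tokens only and lowercase them before counting.
words = [t.lower() for t in tokens if t.isalpha()]

# Probability: frequency distribution over the word tokens.
freq = FreqDist(words)

# Collocation: rank adjacent word pairs by raw frequency.
finder = BigramCollocationFinder.from_words(words)
top_bigram = finder.nbest(BigramAssocMeasures.raw_freq, 1)

# Metrics: Levenshtein edit distance between two spellings.
distance = edit_distance("tokenizer", "tokeniser")

print(tokens)
print(freq.most_common(2))
print(top_bigram)
print(distance)
```

Each step imports from a separate NLTK package and passes plain Python lists between them, which is one way to read the consistency and modularity requirements listed in the abstract.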