Hadoop Tools
Abstract
This chapter explains the additional tools that ship with the Hadoop distribution: Hadoop Streaming, Hadoop Archives, DistCp, Rumen, GridMix, and Scheduler Load Simulator. Hadoop Streaming is a utility that lets the user supply any executable or script as the mapper and the reducer. Hadoop Archives is used for archiving old files and directories. DistCp copies files within a cluster and across different clusters. Rumen extracts meaningful data from JobHistory files and analyzes it, which makes it useful for statistical analysis. GridMix is a benchmark for Hadoop: it takes a job trace, which can be generated by Rumen, and creates synthetic jobs with the same pattern as the trace. Scheduler Load Simulator simulates different loads and scheduling methods such as FIFO and the Fair Scheduler. The chapter covers each tool in turn and gives the syntax of its commands. After reading it, the reader should be able to use all of these tools effectively.
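To make the Hadoop Streaming description concrete, the following is a minimal word-count sketch in Python, not taken from the chapter itself. It assumes the usual streaming convention that the mapper and reducer read lines from standard input and write tab-separated key/value records to standard output, and that the reducer receives its input sorted by key. The function names `map_lines` and `reduce_lines`, and the `map`/`reduce` command-line argument, are illustrative choices rather than part of Hadoop.

```python
import sys
from itertools import groupby


def map_lines(lines):
    """Mapper step: emit one tab-separated (word, 1) record per word."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reduce_lines(lines):
    """Reducer step: sum the counts for each word.

    Assumes the input records are already sorted by key, so that all
    records for one word arrive consecutively.
    """
    pairs = (
        line.rstrip("\n").split("\t", 1)
        for line in lines
        if line.strip()  # skip blank lines
    )
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


# Hypothetical convention for this sketch: the role ("map" or "reduce")
# is passed as the first command-line argument of the script.
if __name__ == "__main__" and len(sys.argv) > 1:
    step = map_lines if sys.argv[1] == "map" else reduce_lines
    for record in step(sys.stdin):
        print(record)
```

A script like this would typically be handed to the streaming jar through its `-mapper` and `-reducer` options, together with `-input` and `-output` paths; the location of the jar depends on the installation. The same logic can be checked locally by sorting the mapper output and feeding it to the reducer.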