Hadoop History and Architecture

Copyright: 2019
Pages: 13
Source title: Big Data Processing With Hadoop
Source Author(s)/Editor(s): T. Revathi (Mepco Schlenk Engineering College, India), K. Muneeswaran (Mepco Schlenk Engineering College, India), and M. Blessa Binolin Pepsi (Mepco Schlenk Engineering College, India)
DOI: 10.4018/978-1-5225-3790-8.ch003

Keywords: Data Analysis & Statistics / Data Mining and Databases / Information Science Reference / Library & Information Science

Abstract

This chapter traces the evolution of Hadoop. Doug Cutting started a text search library called Lucene. After joining the Apache Software Foundation, he built on it to create a web crawler called Apache Nutch. The Google File System then served as the reference for the Nutch Distributed File System, and once Google's MapReduce features were integrated as well, Hadoop took shape. The chapter illustrates the whole path from Lucene to Apache Hadoop and explains the different versions of Hadoop. It describes how to download the software and how to verify the download. It then details the architecture of Hadoop: a Hadoop cluster is a set of commodity machines grouped together, and the chapter shows how those machines are arranged in racks. After reading this chapter, the reader will understand how Hadoop evolved and how its architecture is organized.