Synchronizing Execution of Big Data in Distributed and Parallelized Environments

Author(s): Gueyoung Jung (Xerox Research Center Webster, USA) and Tridib Mukherjee (Xerox Research Center India, India)
Copyright: 2014
Pages: 25
Source title: Big Data Management, Technologies, and Applications
Source Author(s)/Editor(s): Wen-Chen Hu (University of North Dakota, USA) and Naima Kaabouch (University of North Dakota, USA)
DOI: 10.4018/978-1-4666-4699-5.ch003

Keywords: Data Mining and Databases / Databases / Information Science Reference / Library & Information Science

Abstract

In the modern information era, the amount of data has exploded, and current trends indicate that it will continue to grow exponentially. This enormous volume of data, referred to as big data, has given rise to the problem of finding the “needle in the haystack” (i.e., extracting meaningful information from big data). Many researchers and practitioners are focusing on big data analytics to address this problem. One of the major issues is the computational requirement of big data analytics. In recent years, the proliferation of loosely coupled distributed computing infrastructures (e.g., modern public, private, and hybrid clouds, high-performance computing clusters, and grids) has made high computing capability available for large-scale computation, and the execution of big data analytics has accordingly gathered pace across organizations and enterprises. However, even with this computing capability, efficiently extracting valuable information from vast amounts of data remains a major challenge, one that demands unprecedented performance scalability. A key question is how to make maximal use of the computing capabilities of these loosely coupled distributed infrastructures to ensure fast and accurate execution of big data analytics. This chapter focuses on synchronous parallelization of big data analytics over a distributed system environment to optimize performance.
