Information Resources Management Association

Optimizing Semi-Stream CACHEJOIN for Near-Real-Time Data Warehousing

Author(s): M. Asif Naeem (School of Engineering, Computer and Mathematical Sciences, Auckland University of Technology, Auckland, New Zealand), Erum Mehmood (School of Science and Technology, University of Management and Technology, Lahore, Pakistan), M. G. Abbas Malik (Universal College of Learning, Palmerston North, New Zealand) and Noreen Jamil (National University FAST, Islamabad, Pakistan)
Copyright: 2020
Volume: 31
Issue: 1
Pages: 18
Source title: Journal of Database Management (JDM)
Editor(s)-in-Chief: Keng Siau (Missouri University of Science and Technology, USA)
DOI: 10.4018/JDM.2020010102

Abstract

Joining streaming data is a critical process in near-real-time data warehousing. For this purpose, the literature provides an adaptive semi-stream join algorithm called CACHEJOIN (Cache Join), which focuses on non-uniform stream data. However, this algorithm cannot exploit memory and CPU resources optimally, and its service rate is consequently suboptimal, because its two phases, the stream-probing (SP) phase and the disk-probing (DP) phase, execute sequentially. Building on the advantages of CACHEJOIN, this article presents two modifications of it. The first, called P-CACHEJOIN (Parallel Cache Join), enables parallel processing of the two phases of CACHEJOIN. This increases the number of joined stream records and therefore improves throughput considerably. The second, called OP-CACHEJOIN (Optimized Parallel Cache Join), loads stored data into memory in parallel while the DP phase is executing. This research empirically analyzes the performance of both approaches against the existing CACHEJOIN using a synthetic skewed dataset.
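
The difference between running the two phases one after the other and running them concurrently can be illustrated with a toy sketch. This is not the authors' implementation: the names `DISK`, `CACHE`, `stream_probe`, `disk_probe`, `join_sequential`, and `join_parallel` are hypothetical, the stored relation is a plain dictionary rather than disk partitions, and the DP phase is reduced to a per-tuple lookup. The sketch only shows the control structure described in the abstract: an SP phase that joins cached keys and forwards misses, and a DP phase that joins the forwarded tuples, either afterwards or in a second thread.

```python
import queue
import threading

# Hypothetical toy data: DISK stands in for the stored master relation,
# CACHE for the in-memory part holding its most frequently joined keys.
DISK = {k: f"master-{k}" for k in range(10)}
CACHE = {k: DISK[k] for k in (0, 1, 2)}

_DONE = object()  # sentinel marking the end of the stream


def stream_probe(stream, misses, out):
    """SP phase (simplified): join cached keys, forward the rest."""
    for key in stream:
        if key in CACHE:
            out.append((key, CACHE[key]))
        else:
            misses.put(key)
    misses.put(_DONE)


def disk_probe(misses, out):
    """DP phase (simplified): join forwarded tuples against DISK."""
    while True:
        key = misses.get()
        if key is _DONE:
            break
        if key in DISK:  # keys absent from the master data are dropped
            out.append((key, DISK[key]))


def join_sequential(stream):
    """Run SP to completion, then DP, as in the sequential baseline."""
    misses, out = queue.Queue(), []
    stream_probe(stream, misses, out)
    disk_probe(misses, out)
    return sorted(out)


def join_parallel(stream):
    """Run SP and DP in separate threads connected by a queue."""
    misses, sp_out, dp_out = queue.Queue(), [], []
    sp = threading.Thread(target=stream_probe, args=(stream, misses, sp_out))
    dp = threading.Thread(target=disk_probe, args=(misses, dp_out))
    sp.start()
    dp.start()
    sp.join()
    dp.join()
    return sorted(sp_out + dp_out)
```

Both functions produce the same join result for the same input; only the scheduling of the phases differs. In CPython the threads mainly illustrate the structure, since a real speedup would depend on the DP phase being dominated by I/O.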
