Optimizing Semi-Stream CACHEJOIN for Near-Real-Time Data Warehousing

Author(s): M. Asif Naeem (School of Engineering, Computer and Mathematical Sciences, Auckland University of Technology, Auckland, New Zealand), Erum Mehmood (School of Science and Technology, University of Management and Technology, Lahore, Pakistan), M. G. Abbas Malik (Universal College of Learning, Palmerston North, New Zealand) and Noreen Jamil (National University FAST, Islamabad, Pakistan)
Copyright: 2020
Volume: 31
Issue: 1
Pages: 18
Source title: Journal of Database Management (JDM)
Editor(s)-in-Chief: Keng Siau (City University of Hong Kong, Hong Kong SAR)
DOI: 10.4018/JDM.2020010102

Abstract

Joining streaming data is a critical process in near-real-time data warehousing. For this purpose, the literature provides an adaptive semi-stream join algorithm called CACHEJOIN (Cache Join), which targets non-uniform stream data. However, CACHEJOIN executes its two phases, the stream-probing (SP) phase and the disk-probing (DP) phase, sequentially; as a result, it cannot exploit memory and CPU resources optimally, and its service rate remains suboptimal. Building on the advantages of CACHEJOIN, this article presents two modifications to it. The first, P-CACHEJOIN (Parallel Cache Join), enables parallel processing of the two phases of CACHEJOIN. This increases the number of joined stream records and therefore improves throughput considerably. The second, OP-CACHEJOIN (Optimized Parallel Cache Join), loads stored data into memory in parallel while the DP phase is executing. The article empirically compares the performance of both approaches with that of the existing CACHEJOIN using a synthetic skewed dataset.
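
To make the two-phase structure concrete, the following is a minimal sketch of a sequential semi-stream join with a stream-probing pass followed by a disk-probing pass. It is not the authors' implementation. The abstract does not specify the data structures, so the in-memory cache of frequent master records, the queue of pending stream records, the partition lookup keyed on the oldest pending record, and all names (`two_phase_join`, `load_partition`, `cache_keys`, `partition_size`) are assumptions made for illustration.

```python
from bisect import bisect_left
from collections import deque


def load_partition(sorted_keys, relation, key, size):
    """Return up to `size` master records starting at `key`.

    Stands in for reading one partition of the stored relation from disk
    (hypothetical helper; the real access path is not given in the abstract).
    """
    start = bisect_left(sorted_keys, key)
    return {k: relation[k] for k in sorted_keys[start:start + size]}


def two_phase_join(stream, relation, cache_keys, partition_size=2):
    """Join (key, payload) stream records with `relation` in two sequential phases."""
    cache = {k: relation[k] for k in cache_keys}  # assumed: frequently used master records
    sorted_keys = sorted(relation)
    pending = deque()  # stream records that missed the cache
    joined = []
    disk_reads = 0

    # Stream-probing (SP) phase: probe the in-memory cache only.
    for key, payload in stream:
        if key in cache:
            joined.append((key, payload, cache[key]))
        else:
            pending.append((key, payload))

    # Disk-probing (DP) phase: load one partition per iteration,
    # chosen by the oldest pending record, and join everything it covers.
    while pending:
        oldest_key = pending[0][0]
        partition = load_partition(sorted_keys, relation, oldest_key, partition_size)
        disk_reads += 1
        still_pending = deque()
        for key, payload in pending:
            if key in partition:
                joined.append((key, payload, partition[key]))
            elif key != oldest_key:
                still_pending.append((key, payload))
            # A record with the oldest key and no match has no master record; drop it.
        pending = still_pending

    return joined, disk_reads


if __name__ == "__main__":
    master = {1: "a", 2: "b", 3: "c", 4: "d", 5: "e"}
    records = [(1, "x"), (3, "y"), (1, "z"), (4, "w"), (9, "q")]
    print(two_phase_join(records, master, cache_keys=[1]))
```

In this sketch the DP loop starts only after the SP loop has finished, which mirrors the sequential execution that the abstract identifies as the cause of the suboptimal service rate.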
