
Data Management in Scientific Workflows

Author(s): Ewa Deelman (University of Southern California, USA) and Ann Chervenak (University of Southern California, USA)
Copyright: 2012
Pages: 11
Source title: Data Intensive Distributed Computing: Challenges and Solutions for Large-scale Information Management
Source Author(s)/Editor(s): Tevfik Kosar (University at Buffalo, USA)
DOI: 10.4018/978-1-61520-971-2.ch008

Abstract

Scientific applications in fields such as astronomy, earthquake science, and gravitational-wave physics have embraced workflow technologies to do large-scale science. Workflows enable researchers to collaboratively design, manage, and obtain results from computations that involve hundreds of thousands of steps, access terabytes of data, and generate similar amounts of intermediate and final data products. Although workflow systems facilitate the automated generation of data products, many issues remain to be addressed, and they take different forms at different points in the workflow lifecycle. This chapter describes that lifecycle as consisting of four phases: workflow generation, where the analysis is defined; workflow planning, where the resources needed for execution are selected; workflow execution, where the actual computations take place; and the storing of results, metadata, and provenance. The authors discuss the data management issues that arise at each phase, describe challenge problems, and illustrate them in the context of real-life applications. They also discuss the challenges, possible solutions, and open issues faced when mapping and executing large-scale workflows on current cyberinfrastructure.
