Please use this identifier to cite or link to this item: http://hdl.handle.net/1959.3/80896
- Title
- A data placement strategy in scientific cloud workflows
- Author(s)
- Yuan, Dong; Yang, Yun; Liu, Xiao; Chen, Jinjun
- Abstract
- In scientific cloud workflows, large amounts of application data need to be stored in distributed data centres. To store these data effectively, a data manager must intelligently select the data centres in which they will reside. This does not apply, however, to data that must have a fixed location. When one task needs several datasets located in different data centres, the movement of large volumes of data becomes a challenge. In this paper, we propose a matrix-based k-means clustering strategy for data placement in scientific cloud workflows. The strategy contains two algorithms that group the existing datasets into k data centres during the workflow build-time stage and dynamically cluster newly generated datasets to the most appropriate data centres, based on dependencies, during the runtime stage. Simulations show that our algorithm can effectively reduce data movement during the workflow's execution.
- Publication type
- Journal article
- Research centre
- Swinburne University of Technology. Faculty of Information and Communication Technologies
- Source
- Future Generation Computer Systems, Vol. 26, no. 8 (Oct 2010), pp. 1200-1214
- Publication year
- 2010
- FOR Code(s)
- 0805 Distributed Computing
- Keyword(s)
- Cloud computing; Data management; Scientific workflow
- Publisher
- Elsevier
- ISSN
- 0167-739X
- Publisher URL
- http://dx.doi.org/10.1016/j.future.2010.02.004
- Copyright
- Copyright © 2010 Elsevier B.V. The accepted manuscript is reproduced in accordance with the copyright policy of the publisher.
- Research Projects
- Novel cloud computing based workflow technology for managing large numbers of process instances, Australian Research Council grant number LP0990393
- Peer reviewed



