AUTHORS: Kun Ma, Qiuchen Cheng, Bo Yang
ABSTRACT: Live data migration in the cloud moves blocks of data from an emigration node to several immigration nodes. However, choosing a live data migration strategy is an NP-hard problem, like task scheduling. Recently, in-stream processing has become an immediate need in many practical applications. Therefore, in this paper we explore a real-time live data migration strategy built on a stream processing framework. First, a migration cost and balance model is introduced as the metric for evaluating data migration strategies. Subsequently, a live data migration strategy based on particle swarm optimization is proposed. Afterwards, we implement this method using a stream processing framework. The experimental results show the best performance of our method in all
KEYWORDS: load balancing, stream processing, data migration, particle swarm optimization