Research Article

Comparative Study Load Balance Algorithms for Map Reduce Environment

by Hesham A. Hefny, Mohamed Helmy Khafagy, Ahmed M Wahdan
International Journal of Applied Information Systems
Foundation of Computer Science (FCS), NY, USA
Volume 7 - Number 11
Year of Publication: 2014
10.5120/ijais14-451261

Hesham A. Hefny, Mohamed Helmy Khafagy, Ahmed M Wahdan. Comparative Study Load Balance Algorithms for Map Reduce Environment. International Journal of Applied Information Systems. 7, 11 (November 2014), 41-50. DOI=10.5120/ijais14-451261

@article{ 10.5120/ijais14-451261,
author = { Hesham A. Hefny, Mohamed Helmy Khafagy, Ahmed M Wahdan },
title = { Comparative Study Load Balance Algorithms for Map Reduce Environment },
journal = { International Journal of Applied Information Systems },
issue_date = { November 2014 },
volume = { 7 },
number = { 11 },
month = { November },
year = { 2014 },
issn = { 2249-0868 },
pages = { 41-50 },
numpages = {9},
url = { https://www.ijais.org/archives/volume7/number11/699-1261/ },
doi = { 10.5120/ijais14-451261 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2023-07-05T18:55:54.284688+05:30
%A Hesham A. Hefny
%A Mohamed Helmy Khafagy
%A Ahmed M Wahdan
%T Comparative Study Load Balance Algorithms for Map Reduce Environment
%J International Journal of Applied Information Systems
%@ 2249-0868
%V 7
%N 11
%P 41-50
%D 2014
%I Foundation of Computer Science (FCS), NY, USA
Abstract

MapReduce is a widely used model for data-intensive parallel computing in shared-nothing clusters. One of its main issues is that its performance depends heavily on data distribution. MapReduce includes a simple load-balancing technique based on a FIFO job scheduler that serves jobs in their submission order, but this is insufficient in real-world cases because it ignores many factors that affect performance, such as heterogeneity and data skew. Load balancing is therefore important to ensure that all resources are utilized evenly and efficiently. There are two main schemes in load balancing: (a) static load balancing and (b) dynamic load balancing. The main aim of this work is to study and compare existing load-balancing algorithms and to illustrate their features.
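
The difference between the two schemes named in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the function names, the round-robin static rule, the "idle worker takes the next task" dynamic rule, and the task costs are all assumptions chosen for illustration. Tasks are handed out in submission order in both cases, and skew is modelled by a few tasks that cost far more than the rest.

```python
import heapq

def static_assign(task_costs, num_workers):
    """Static scheme (assumed rule): task i is bound to worker i % num_workers
    before execution, without looking at how loaded each worker is."""
    loads = [0] * num_workers
    for i, cost in enumerate(task_costs):
        loads[i % num_workers] += cost
    return loads

def dynamic_assign(task_costs, num_workers):
    """Dynamic scheme (assumed rule): tasks are still taken in submission
    order, but each one goes to the worker that becomes free first."""
    heap = [(0, w) for w in range(num_workers)]  # (current load, worker id)
    heapq.heapify(heap)
    loads = [0] * num_workers
    for cost in task_costs:
        load, w = heapq.heappop(heap)  # least-loaded worker so far
        loads[w] = load + cost
        heapq.heappush(heap, (loads[w], w))
    return loads

# Hypothetical skewed workload: every third task is eight times as expensive.
task_costs = [8, 1, 1, 8, 1, 1, 8, 1, 1]
static_loads = static_assign(task_costs, 3)
dynamic_loads = dynamic_assign(task_costs, 3)

print("static :", static_loads, "finish time", max(static_loads))
print("dynamic:", dynamic_loads, "finish time", max(dynamic_loads))
```

With this made-up workload the static rule places all three expensive tasks on the same worker, while the dynamic rule spreads them out, so the most loaded worker finishes much earlier. Real schedulers also have to account for node heterogeneity and data locality, which this sketch deliberately ignores.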

Index Terms

Computer Science
Information Sciences

Keywords

Static Load Balance, Map Reduce, Dynamic Load Balance, Comparative Study