On Cloud computational models and the heterogeneity challenge

Abstract

Cloud computing is by far the most cost-effective technology for hosting Internet-scale services and applications. The MapReduce model, in particular, is widely used in Cloud infrastructures to meet the demands of large-scale data- and computation-intensive applications. Despite its success, the implications of MapReduce for the management of Cloud workloads and cluster resources remain largely unstudied. In this article, we show that dealing with the heterogeneity of workloads and machine capabilities is a key challenge. In today’s Cloud environments, workloads vary in size, length, resource requirements, and arrival rate. Machines likewise vary in CPU, memory, I/O speed, and network bandwidth capacity. Together, these variations pose difficult challenges in, among other areas, job scheduling, task and data placement, resource sharing, and resource allocation. We analyze the heterogeneity challenge in these specific problem domains and survey representative state-of-the-art works that try to address them. We find that although advances have been made that partially address some of the outlined challenges, even more open challenges remain to be explored, and the topic at large is ripe for scientific contributions.

Author information

Corresponding author

Correspondence to Raouf Boutaba.

About this article

Cite this article

Boutaba, R., Cheng, L. & Zhang, Q. On Cloud computational models and the heterogeneity challenge. J Internet Serv Appl 3, 77–86 (2012). https://doi.org/10.1007/s13174-011-0054-7

Keywords

  • Cloud computing
  • MapReduce
  • Heterogeneity
  • Scheduling
  • Resource allocation