Volume 3 Supplement 1

Special Issue on the Future of Middleware (FOME'11)

Challenges in very large distributed systems


Many modern distributed systems are required to scale in terms of their support for processes, resources, and users. Moreover, a system is often required to operate across the Internet and across different administrative domains. These scalability requirements lead to a number of well-known challenges in which distribution transparency needs to be traded off against loss of performance. We concentrate on two major challenges for which we claim there is no easy solution. These challenges originate from the fact that users and systems are becoming increasingly integrated, effectively leading us to large-scale socio-technical distributed systems. We identify the design of such integrated systems as one challenge, in particular when it comes to placing humans in the loop as a necessity for the proper operation of the system as a whole. As users are so tightly integrated into the overall design, and systems naturally expand through composition, we will face problems with respect to long-term management, which we identify as another major challenge.


Author information


Corresponding author

Correspondence to Maarten van Steen.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

van Steen, M., Pierre, G. & Voulgaris, S. Challenges in very large distributed systems. J Internet Serv Appl 3, 59–66 (2012).
