R&D challenges and solutions for highly complex distributed systems: a middleware perspective


Highly complex distributed systems (HCDSs) are characterized by a large number of mission-critical, heterogeneous, interdependent subsystems that execute concurrently with diverse, often conflicting, quality-of-service (QoS) requirements. Creating, integrating, and assuring these properties in HCDSs is hard, and expecting application developers to perform these activities without significant support is unrealistic. As a result, the computing and communication foundation for HCDSs is increasingly based on middleware. This article examines key R&D challenges that impede the ability of researchers and developers to manage HCDS software complexity. For each challenge, the article surveys state-of-the-art middleware solutions and describes open issues and promising directions for future research.



Author information



Corresponding author

Correspondence to Douglas C. Schmidt.

About this article

Cite this article

White, J., Dougherty, B., Schantz, R. et al. R&D challenges and solutions for highly complex distributed systems: a middleware perspective. J Internet Serv Appl 3, 5–13 (2012).


  • Highly complex distributed systems
  • Middleware
  • Quality of service