Analytical modelling and simulation of small scale, typical and highly available Beowulf clusters with breakdowns and repairs


Ever E., Gemikonakli O., Chakka R.

Simulation Modelling Practice and Theory, vol.17, no.2, pp.327-347, 2009 (SCI-Expanded, Scopus)

  • Publication Type: Article
  • Volume: 17 Issue: 2
  • Publication Date: 2009
  • DOI Number: 10.1016/j.simpat.2008.08.016
  • Journal Name: Simulation Modelling Practice and Theory
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
  • Page Numbers: pp.327-347
  • Keywords: Clustering, High performance computing, Markov processes, Performability, Queuing theory
  • Middle East Technical University Northern Cyprus Campus Affiliated: Yes

Abstract

Beowulf clusters are very popular because of the high computational power they can provide at reasonably low cost. However, the most pressing issues for today's cluster solutions are the need for high availability and high performance. Cluster systems are clearly prone to failures. Even if a failure is covered with some probability c, reconfiguration and/or rebooting delays are incurred before operation resumes. In this paper, performability models of both typical and highly available Beowulf multiprocessor systems are presented. The models developed offer a large degree of flexibility in evaluating the performability of typical and highly available Beowulf cluster systems. © 2008 Elsevier B.V. All rights reserved.