Stochastic Models for Fault Tolerance

As modern society relies on the fault-free operation of complex computing systems, system fault-tolerance has become an indispensable requirement. Therefore, we need mechanisms that guarantee correct service in cases where system components fail, be they software or hardware elements. Redundancy patterns are commonly used, for either redundancy in space or redundancy in time.

Wolter's book details methods of redundancy in time, which must be applied at the right moment. In particular, she addresses the so-called 'timeout selection problem', i.e., the question of choosing the right time to trigger fault-tolerance mechanisms such as restart, rejuvenation and checkpointing. Restart denotes a pure system restart, rejuvenation denotes a restart of the operating environment of a task, and checkpointing means saving the system state periodically and, upon failure, reinitializing the system from the most recent checkpoint. Her presentation includes a brief introduction to the methods, their detailed stochastic description, and aspects of their efficient implementation in real-world systems.
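To make the timeout selection problem concrete for restart, the following minimal sketch may help. It is an illustration under stated assumptions and is not taken from the book: a task's completion time is assumed to follow a known discrete distribution, attempts are assumed independent, and a restart is assumed to cost nothing beyond the time already spent. Under these assumptions the expected total completion time with timeout tau is E[min(T, tau)] / P(T <= tau). The function name and the example distribution are hypothetical.

```python
from typing import Sequence, Tuple


def expected_completion_time(
    outcomes: Sequence[Tuple[float, float]], timeout: float
) -> float:
    """Expected total time when a task is aborted and restarted after `timeout`.

    `outcomes` is a list of (duration, probability) pairs describing the
    completion time of a single attempt (an assumed, illustrative model).
    Attempts are assumed independent and restarts are assumed to be free.
    """
    # Probability that a single attempt finishes before the timeout fires.
    p_success = sum(p for d, p in outcomes if d <= timeout)
    if p_success == 0:
        return float("inf")  # the task can never finish within this timeout
    # Expected time spent in one attempt, successful or aborted.
    time_per_attempt = sum(p * min(d, timeout) for d, p in outcomes)
    # The number of attempts is geometric with mean 1 / p_success.
    return time_per_attempt / p_success


# Hypothetical task: usually 1 s, but occasionally stuck for 100 s.
outcomes = [(1.0, 0.9), (100.0, 0.1)]
candidates = [1.0, 5.0, 50.0, 100.0]
best_timeout = min(candidates, key=lambda t: expected_completion_time(outcomes, t))

for t in candidates:
    print(f"timeout={t:6.1f}  expected time={expected_completion_time(outcomes, t):.3f}")
print("best candidate timeout:", best_timeout)
```

In this toy distribution a short timeout wins because the slow attempts are rare but very long; with a distribution that has no such heavy tail, restarting early could just as well increase the expected completion time.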

The book is targeted at researchers and graduate students in system dependability, stochastic modeling and software reliability. Readers will find here an up-to-date overview of the key theoretical results, making this the only comprehensive text on stochastic models for restart-related problems.



Katinka Wolter is an assistant professor at Humboldt-University, Berlin, Germany, working with the Computer Architecture and Communication Group since April 2002. She is principal investigator of two research projects funded by the German research council and teaches courses on performance analysis of communication systems and dependability evaluation. Prior to her current position, she was a visiting researcher at Hewlett-Packard Labs in Palo Alto, CA, USA. Her research interests include dependability evaluation of service-oriented architectures and wireless computer networks, as well as stochastic models for representing data in those systems and stochastic models for improving dependability through restart-based techniques.
