Software fault tolerance

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Ira Leviton at 20:38, 16 August 2018. It may differ significantly from the current revision.

Software fault tolerance is the ability of computer software to continue its normal operation despite the presence of system or hardware faults.

Introduction

The only constant is change, and this is more true of software systems than of almost any other phenomenon.[1] Not all software changes in the same way, so software fault tolerance methods are designed to overcome execution errors by modifying variable values to create an acceptable program state.[2] The need to control software faults is one of the fastest-growing challenges facing the software industry today. Fault tolerance must therefore be a key consideration in the early stages of software development.

Several mechanisms exist for software fault tolerance, including:

  • Recovery blocks
  • N-version software
  • Self-checking software
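
The first two mechanisms can be illustrated with a minimal Python sketch. It is not a reference implementation: the helper names (recovery_block, n_version, accept_isqrt, and the three isqrt_* variants) are hypothetical and chosen only for this example. A recovery block tries alternates in order until one passes an acceptance test, while N-version software runs independently written versions and takes a majority vote. The alternates here are pure functions, so the sketch omits the state checkpoint and rollback that a recovery block would normally perform before trying the next alternate.

```python
import math
from collections import Counter


def recovery_block(alternates, acceptance_test, x):
    """Return the first alternate's result that passes the acceptance test."""
    for alternate in alternates:
        try:
            result = alternate(x)
        except Exception:
            continue  # a crash is treated like a failed acceptance test
        if acceptance_test(x, result):
            return result
    raise RuntimeError("all alternates failed the acceptance test")


def n_version(versions, x):
    """Run every version and return the result produced by a strict majority."""
    results = []
    for version in versions:
        try:
            results.append(version(x))
        except Exception:
            pass  # a crashed version simply casts no vote
    if not results:
        raise RuntimeError("every version failed")
    value, count = Counter(results).most_common(1)[0]
    if count <= len(versions) // 2:
        raise RuntimeError("no majority among versions")
    return value


# Hypothetical task: integer square root, with one deliberately faulty version.
def isqrt_buggy(n):
    return n // 2  # wrong for most inputs


def isqrt_loop(n):
    r = 0
    while (r + 1) * (r + 1) <= n:
        r += 1
    return r


def isqrt_math(n):
    return math.isqrt(n)


def accept_isqrt(n, r):
    """Acceptance test: r is the integer square root of n."""
    return r * r <= n < (r + 1) * (r + 1)


if __name__ == "__main__":
    print(recovery_block([isqrt_buggy, isqrt_loop], accept_isqrt, 17))
    print(n_version([isqrt_buggy, isqrt_loop, isqrt_math], 17))
```

In both calls the faulty version is masked: the recovery block rejects its result through the acceptance test and falls back to the second alternate, while the voter outvotes it. The voter in this sketch assumes results are hashable and compared for exact equality; real systems often need approximate comparison.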


References

  1. Eckhardt, D. E., "Fundamental Differences in the Reliability of N-Modular Redundancy and N-Version Programming", The Journal of Systems and Software, 8, 1988, pp. 313–318.
  2. Giguette, R., and Hassell, J., "Toward A Resourceful Method of Software Fault Tolerance", ACM Southeast Regional Conference, April 1999.
