Sunday 10 February 2013

Discuss the four aspects of fault tolerance

There are four aspects of fault tolerance:
        i.    Failure detection: The system must detect that a particular state combination has resulted, or will result, in a system failure.
       ii.    Damage assessment: The parts of the system state that have been affected by the failure must be identified.
      iii.    Fault recovery: The system must restore its state to a known ‘safe’ state. This may be achieved by correcting the damaged state or by restoring the system to a previously saved ‘safe’ state.
       iv.    Fault repair: This involves modifying the system so that the fault does not recur. In many cases, software failures are transient and due to a peculiar combination of system inputs. In such cases no repair is necessary, because normal processing can resume immediately after fault recovery. This is an important distinction between hardware and software faults.
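
The four aspects above can be sketched in a few lines of Python. This is only an illustrative sketch, not a definitive implementation: the tank-level state, the invariant `0 <= level <= capacity`, and the names `find_damage` and `run_step` are all assumptions made up for the example. Recovery here restores a saved checkpoint rather than correcting the damaged state in place.

```python
import copy


def find_damage(state):
    """Damage assessment: list the state fields that violate an invariant.

    The invariants (capacity > 0, 0 <= level <= capacity) are hypothetical.
    """
    damaged = []
    if state["capacity"] <= 0:
        damaged.append("capacity")
    if not 0 <= state["level"] <= state["capacity"]:
        damaged.append("level")
    return damaged


def run_step(state, checkpoint, delta):
    """Apply one update, then detect, assess, and recover if needed.

    Returns (new_state, new_checkpoint, damaged_fields).
    """
    state = dict(state)               # work on a copy of the current state
    state["level"] += delta           # the normal processing step

    damaged = find_damage(state)      # failure detection + damage assessment
    if damaged:
        # Fault recovery: fall back to the last known 'safe' state.
        state = copy.deepcopy(checkpoint)
    else:
        # The state is valid, so it becomes the new 'safe' state.
        checkpoint = copy.deepcopy(state)
    return state, checkpoint, damaged


safe = {"level": 5, "capacity": 10}
state, checkpoint, damaged = run_step(safe, safe, 3)         # valid update
state, checkpoint, damaged = run_step(state, checkpoint, 7)  # overflows, so it is rolled back
print(state, damaged)
```

No fault repair step appears in the sketch because the failure is treated as transient: it was caused by one unusual input, so processing simply continues from the restored state.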
