Fault tolerance is the ability of a computer system or component to keep operating when one of its parts fails: a backup component or procedure immediately takes its place with no loss of service. Fault tolerance can be provided in software, embedded in hardware, or provided by some combination of the two.
In the software implementation, the operating system provides an interface that allows a programmer to "checkpoint" critical data at pre-determined points within a transaction. In the hardware implementation (for example, with Stratus and its VOS operating system), the programmer does not need to be aware of the fault-tolerant capabilities of the machine.
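As a rough illustration of software checkpointing, the sketch below saves a transaction's progress after each step and resumes from the last saved point after a failure. It is a minimal sketch under stated assumptions, not how any particular operating system exposes checkpoints: the function names (`save_checkpoint`, `load_checkpoint`, `run_transaction`), the JSON file format, and the `fail_at` parameter used to simulate a failure are all hypothetical.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state atomically so a crash never leaves a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    # os.replace swaps the new file in as a single rename operation.
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return the last saved state, or a fresh state if none exists."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {"next_step": 0, "total": 0}


def run_transaction(path, amounts, fail_at=None):
    """Sum amounts, checkpointing after every step (hypothetical workload)."""
    state = load_checkpoint(path)
    for i in range(state["next_step"], len(amounts)):
        if fail_at is not None and i == fail_at:
            # Assumption: stands in for a real component failure.
            raise RuntimeError("simulated component failure")
        state["total"] += amounts[i]
        state["next_step"] = i + 1
        save_checkpoint(path, state)
    return state["total"]
```

If a run fails partway through, calling `run_transaction` again with the same checkpoint path picks up at the first step that was not checkpointed, so completed steps are not repeated.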
At a hardware level, fault tolerance is achieved by duplexing each hardware component. Disks are mirrored. Multiple processors are "lock-stepped" together and their outputs are compared for correctness. When an anomaly occurs, the faulty component is identified and taken out of service, but the machine continues to function as usual.
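The compare-and-retire idea can be modeled in software. The sketch below is a simplified model, not a description of any vendor's hardware: it assumes three replicas and uses majority voting to decide which one is faulty, since comparing only two outputs can reveal a mismatch but cannot by itself say which side is wrong. The function name `lockstep_step` and the replica names are hypothetical.

```python
from collections import Counter


def lockstep_step(replicas, value):
    """Run one step on every active replica and vote on the result.

    replicas maps a name to a function. Any replica whose output differs
    from the majority is removed from the dict (taken out of service).
    Returns the agreed output and the list of retired replica names.
    """
    outputs = {name: fn(value) for name, fn in replicas.items()}
    majority, count = Counter(outputs.values()).most_common(1)[0]
    if count <= len(outputs) // 2:
        # Without a strict majority the faulty replica cannot be identified.
        raise RuntimeError("no majority; cannot identify the faulty replica")
    faulty = [name for name, out in outputs.items() if out != majority]
    for name in faulty:
        del replicas[name]
    return majority, faulty


# Usage: replica "c" is deliberately wrong to simulate a hardware fault.
replicas = {
    "a": lambda x: x * 2,
    "b": lambda x: x * 2,
    "c": lambda x: x * 2 + 1,
}
result, retired = lockstep_step(replicas, 21)
```

After this call the remaining replicas keep serving later steps, which mirrors the machine continuing to function once the faulty component is out of service.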