Overview - 1.0 English

MicroBlaze Triple Modular Redundancy (TMR) Subsystem (PG268)

Document ID: PG268
Release Date: 2022-04-28
Version: 1.0 English

An increasing number of applications need a high-reliability processing function to meet dependability, safety, and security requirements. One example is operation in high-radiation environments, where integrated circuits will almost certainly experience radiation-induced single-event effects.

The Xilinx® Triple Modular Redundancy (TMR) solution for the Vivado® Design Suite is designed for these applications and provides all the building blocks necessary to implement a redundant, triplicated MicroBlaze™ subsystem. This processing subsystem is fault tolerant and continues to operate nominally after encountering an error. Together with the capability to detect and recover from errors, the implementation ensures the reliability of the entire subsystem.

In nominal operation, all three redundant blocks work correctly and their outputs are majority voted. When an error is detected in one of the blocks, the subsystem enters lockstep mode and keeps operating. Should a second error occur before recovery, it is detected and the subsystem halts with a fatal error.
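
The behavior described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the hardware implementation: the names `majority_vote`, `TmrModel`, `step`, and `recover` are hypothetical, the voter is assumed to be a bitwise 2-of-3 majority, and lockstep mode is assumed to compare the two blocks that remain after the first error.

```python
from enum import Enum


class Mode(Enum):
    NOMINAL = "nominal"    # three blocks agree, outputs are majority voted
    LOCKSTEP = "lockstep"  # one block has failed, the other two are compared
    FATAL = "fatal"        # a second error occurred before recovery


def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority of three equal-width words."""
    return (a & b) | (a & c) | (b & c)


class TmrModel:
    """Toy behavioral model (hypothetical, for illustration only)."""

    def __init__(self):
        self.mode = Mode.NOMINAL
        self.faulty = None  # index of the block excluded after the first error

    def step(self, outputs):
        """Process one cycle of three block outputs.

        Returns the subsystem output, or None once the model has halted.
        """
        if self.mode is Mode.FATAL:
            return None
        if self.mode is Mode.NOMINAL:
            voted = majority_vote(*outputs)
            bad = [i for i, value in enumerate(outputs) if value != voted]
            if len(bad) == 1:
                # First error: mask it by voting and drop to lockstep mode.
                self.faulty = bad[0]
                self.mode = Mode.LOCKSTEP
            elif len(bad) > 1:
                # No two blocks agree, so no trustworthy output exists.
                self.mode = Mode.FATAL
                return None
            return voted
        # LOCKSTEP: compare the two blocks that are still trusted.
        good = [value for i, value in enumerate(outputs) if i != self.faulty]
        if good[0] != good[1]:
            self.mode = Mode.FATAL
            return None
        return good[0]

    def recover(self):
        """Assumed software-triggered recovery back to nominal operation."""
        if self.mode is Mode.LOCKSTEP:
            self.mode = Mode.NOMINAL
            self.faulty = None
```

In this model a single disagreeing block is outvoted and the output stays correct, whereas a disagreement between the two remaining blocks can only be detected, not corrected, which is why the second error is fatal.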

An essential component of the implementation is the Vivado IP integrator automation, which greatly simplifies the creation of a TMR subsystem, a task that can otherwise be both time-consuming and error-prone.

Major advantages of the solution are that redundant sub-blocks can be physically separated, that error detection can be tested, and that rapid software-controlled error recovery is possible.

In addition to Triple Modular Redundancy, the solution also supports dual lockstep mode, which provides fault detection for applications where fault tolerance is not required. The lockstep implementation can optionally be configured with a temporal delay to improve detection of certain common-mode errors, such as clock glitches or voltage spikes.
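
To see why a temporal delay can help with common-mode errors, consider the following sketch. It assumes the usual delayed-lockstep arrangement, in which the second block runs a fixed number of cycles behind the first and the first block's outputs are delayed by the same amount before comparison; the source does not describe the mechanism, so this is an assumption. The helpers `run_core` and `mismatches`, and the glitch model (bit 0 of every output flipped in one wall-clock cycle), are hypothetical.

```python
def run_core(program, start, glitch_cycle, total_cycles):
    """Wall-clock output stream of one core.

    The core starts executing `program` at cycle `start`. A common-mode
    glitch at `glitch_cycle` flips bit 0 of whatever the core outputs in
    that cycle. Cycles with no output are recorded as None.
    """
    out = []
    for t in range(total_cycles):
        step = t - start
        if 0 <= step < len(program):
            value = program[step]
            if t == glitch_cycle:
                value ^= 1  # the same disturbance hits every core at cycle t
            out.append(value)
        else:
            out.append(None)
    return out


def mismatches(primary, shadow, delay):
    """Cycles at which the delayed primary output differs from the shadow."""
    return [
        t
        for t in range(delay, len(shadow))
        if primary[t - delay] is not None
        and shadow[t] is not None
        and primary[t - delay] != shadow[t]
    ]


program = [10, 11, 12, 13, 14, 15]  # fault-free output sequence
glitch = 3                          # wall-clock cycle of the disturbance

# No delay: both cores are corrupted identically, so the comparison passes.
undetected = mismatches(
    run_core(program, 0, glitch, 6), run_core(program, 0, glitch, 6), delay=0
)

# Delay of 2 cycles: the glitch corrupts different program steps in each core.
detected = mismatches(
    run_core(program, 0, glitch, 8), run_core(program, 2, glitch, 8), delay=2
)
```

Here `undetected` is empty, while `detected` lists the cycles at which the comparison fails, because the disturbance lands on a different program step in each core.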