As you code your design, be aware of the logic being inferred. Monitor the following conditions for additional pipelining considerations:
- Cones of logic with large fanin
For example, code that requires large buses or several combinational signals to compute an output.
- Blocks with restricted placement, slow clock-to-out, or large setup requirements
For example, block RAMs without output registers or arithmetic code that is not appropriately pipelined.
- Forced placement that causes long routes
For example, a pinout that forces a route across the chip might require pipelining to allow for high-speed operation.
- Logic composed of large XOR functions
Large XOR functions often have high switching rates, which can cause high dynamic power dissipation. Pipelining these functions can reduce switching, which lowers the power consumption of the described circuit.
In the following figure, the clock speed is limited by the following:
- Clock-to-out time of the source flip-flop
- Logic delay through four levels of logic
- Routing associated with the four function generators
- Setup time of the destination register
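The four contributions listed above add up to the minimum clock period of the path. As a rough sketch, with symbol names chosen here purely for illustration (they do not come from the figure), and with $T_{\text{route}}$ standing for the total routing delay along the path:

```latex
T_{\text{clk,min}} \approx T_{\text{cko}} + \sum_{i=1}^{4} T_{\text{logic},i} + T_{\text{route}} + T_{\text{su}},
\qquad
f_{\text{max}} = \frac{1}{T_{\text{clk,min}}}
```

Here $T_{\text{cko}}$ is the clock-to-out time of the source flip-flop, $T_{\text{logic},i}$ is the delay of the $i$-th logic level, and $T_{\text{su}}$ is the setup time of the destination register. Clock skew and uncertainty are ignored in this simplified form.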
Use one of the following methods to ensure that your design uses pipeline registers correctly:
- In your RTL code, add the registers before or after the logic to be retimed, preferably within the hierarchy.
- Use the Vivado synthesis global retiming or BLOCK_SYNTH.RETIMING option, which analyzes the timing of a path and moves the registers to improve timing, if possible.
- Alternatively, for more control, use the retiming_forward and retiming_backward synthesis attributes. You can add these attributes on specific registers to force the tool to retime through combinational logic regardless of the timing score of the logic. For more information on these attributes, see the Vivado Design Suite User Guide: Synthesis (UG901).
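The first method can be sketched as follows. This is a minimal illustration, not code from the guide: the module name, port names, and widths are hypothetical. It places two extra registers directly after a large-fanin cone, inside the same module as the logic, so that retiming has registers available to move back into the cone.

```verilog
// Hypothetical module: all names and widths are illustrative assumptions.
module wide_match (
    input  wire        clk,
    input  wire [63:0] data,
    input  wire [63:0] mask,
    input  wire [63:0] pattern,
    output reg         match
);
    // Large-fanin cone: 192 input bits feed a single output bit.
    wire match_comb = ((data & mask) == (pattern & mask));

    // Extra pipeline registers placed directly after the logic, in the same
    // level of hierarchy, so that retiming can redistribute them.
    reg match_p1;
    reg match_p2;

    always @(posedge clk) begin
        match_p1 <= match_comb;
        match_p2 <= match_p1;
        match    <= match_p2;   // final output register
    end
endmodule
```

The two added stages increase the latency of `match` by two clock cycles, so the surrounding design must tolerate that delay. Whether and how far the registers actually move depends on the retiming settings and on the timing of the path.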
The following figure shows the pipelining after adding extra registers.
The following figure shows the same data path as the Before Pipelining diagram, after pipelining and retiming. Because the flip-flop is contained in the same slice as the function generator, the clock speed is limited by the clock-to-out time of the source flip-flop, the logic delay through one level of logic, one routing delay, and the setup time of the destination register. In this example, the system clock runs faster after pipelining and retiming than in the original design.
The following code example shows how to use the retiming attributes to force the specific retiming shown in the Pipelining After Retiming figure.
(* retiming_backward = 3 *) reg reg1;
(* retiming_backward = 2 *) reg reg2;
(* retiming_backward = 1 *) reg reg3;
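To show where these declarations might sit, here is a hedged sketch of a complete module. Only `reg1`, `reg2`, `reg3`, and their attribute values come from the example above. The module name, ports, and the combinational expression are hypothetical stand-ins for the four-level cone in the figure. The sketch assumes that `reg1` is the register closest to the logic and that a larger attribute value lets a register cross more logic; check UG901 for the exact semantics.

```verilog
// Hypothetical module: everything except reg1/reg2/reg3 and their
// attribute values is an illustrative assumption.
module retime_example (
    input  wire        clk,
    input  wire [15:0] a,
    input  wire [15:0] b,
    input  wire [15:0] c,
    input  wire [15:0] d,
    output reg         q
);
    // Source registers that launch the path.
    reg [15:0] a_r, b_r, c_r, d_r;

    // Stand-in for a multi-level cone of logic; the real number of
    // logic levels depends on how synthesis maps this expression.
    wire result = ^((a_r + b_r) ^ (c_r & d_r));

    // Pipeline registers added after the logic. reg1 is nearest the cone,
    // so it carries the largest value and is asked to move back furthest.
    (* retiming_backward = 3 *) reg reg1;
    (* retiming_backward = 2 *) reg reg2;
    (* retiming_backward = 1 *) reg reg3;

    always @(posedge clk) begin
        a_r  <= a;
        b_r  <= b;
        c_r  <= c;
        d_r  <= d;
        reg1 <= result;
        reg2 <= reg1;
        reg3 <= reg2;
        q    <= reg3;   // destination register, left in place
    end
endmodule
```

With this arrangement, the intent is that each of the three added registers ends up between successive logic levels, leaving roughly one level of logic per stage, as in the Pipelining After Retiming figure.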