Reducing Control Sets - 2023.2 English

UltraFast Design Methodology Guide for FPGAs and SoCs (UG949)

Document ID
UG949
Release Date
2023-11-29
Version
2023.2 English
Note: This optimization technique is automatically applied by the report_qor_suggestions Tcl command.

Often not much consideration is given to control signals such as resets or clock enables. Many designers start HDL coding with "if reset" statements without deciding whether the reset is needed or not. While all registers support resets and clock enables, their use can significantly affect the end implementation in terms of maximum clock frequency, utilization, and power.

The first factor to consider is the number of control sets. A control set is the group of clock, enable, and set/reset signals used by a sequential cell. For example, two cells connected to the same clock have different control sets if only one cell has a reset or if only one cell has a clock enable. Constant or unused enable and set/reset register pins also contribute to forming control sets.

The second factor to consider is the targeted architecture. The number of control sets that can be packed together depends on the architecture:

  • A 7 series device slice (or half-CLB) comprises eight registers, which all share one clock, one set/reset, and one clock enable. Only one control set can be used per group of eight registers.
  • An UltraScale device half-CLB comprises two groups of four registers, which share one clock and one set/reset. In addition, each group of four registers has one clock enable and can ignore the set/reset. A constant set/reset signal is not routed and can be ignored. A constant enable signal is treated like a dynamic enable signal and needs to be routed. Under optimal conditions, up to two control sets can be used per group of eight registers.

CLB packing restrictions caused by control sets force the placer to move some registers, including their input LUT. In some cases, the registers are moved to less optimal locations. The additional distance can negatively impact not only utilization but also placement QoR and power consumption, due to logic spreading (longer net delays) and higher interconnect resources utilization. This is mainly of concern in designs with many low fanout control signals, such as clock enables that feed single registers.

Despite the higher UltraScale device CLB control set capacity, typical designs show a control set utilization similar to 7 series designs. Therefore, AMD recommendations are the same for both architectures.