Boundary Clock Nets - 2021.2 English

Versal ACAP Hardware, IP, and Platform Development Methodology Guide (UG1387)

Document ID: UG1387
Release Date: 2021-11-19
Version: 2021.2 English

After the first implementation, boundary clock net tracks are locked. The partition pin location constraints (PPLOCs) on the boundary clock nets are distributed to all clock regions covered by the reconfigurable partition (RP) Pblock.

Because the boundary clock net can drive both static and RP loads, its clock root can be placed anywhere in the device. Xilinx recommends applying the USER_CLOCK_ROOT constraint to the boundary clock net to set the CLOCK_ROOT location manually, for the following reasons:

If the loads of the boundary clock are located mainly in the static region, the clock root might be placed in the static region.
If the first implementation uses training logic in the RP Pblock, boundary clock nets might be locked down after the first implementation with an off-center clock root location.
Because the boundary clock net is distributed to all clock regions covered by the RP Pblock, the clock insertion delay for the boundary clock is relatively high compared with the internal RM clock nets.
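
As a minimal sketch of the recommendation above, the following XDC line pins the clock root of a boundary clock net to a chosen clock region. The net name static_region/boundary_clk and the clock region X2Y3 are hypothetical placeholders, not values from this guide. Choose a region near the center of the loads in your own design, and apply the constraint to the net segment driven directly by the clock buffer.

```tcl
# Hypothetical net name and clock region; replace both with values from your design.
# USER_CLOCK_ROOT is set on the net driven directly by the global clock buffer.
set_property USER_CLOCK_ROOT X2Y3 [get_nets static_region/boundary_clk]
```

Because boundary clock net tracks are locked after the first implementation, a constraint like this would need to be in place before that first run to take effect.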

Following is an example of a suboptimal use of boundary clock nets. The schematic shows that a boundary clock from the static region is driving a register in the RP.

Figure 1. Boundary Clock from Static Region to RP

The following timing report shows that the delay from the BUFGCE to the flip-flop is almost 7 ns. The net delay is high because the reconfigurable Pblock spans multiple super logic regions (SLRs), and the boundary clock deposits a node in every clock region covered by the Pblock.

Figure 2. Timing Report with High Net Delay for Boundary Clocks
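
One way to check for this situation in your own design is to read back the clock root and the clock path delay after placement and routing. The commands below are a sketch only. The net and cell names are hypothetical, and the commands assume an open implemented design in the Vivado Tcl console.

```tcl
# Hypothetical names; replace with the boundary clock net and RP register in your design.
set clk_net [get_nets static_region/boundary_clk]
set ce_reg  [get_cells rp_inst/ce_reg]

# Read back the clock root location assigned to the boundary clock net.
puts "Clock root: [get_property CLOCK_ROOT $clk_net]"

# List the clock roots of the global clocks in the design.
report_clock_utilization -clock_roots_only

# Report the worst path ending at the register, with the clock path expanded
# so that the delay of the clock net after the BUFGCE is visible.
report_timing -to $ce_reg -max_paths 1 -path_type full_clock_expanded
```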

The following figure shows the clock driver in magenta and the boundary clock net in yellow, which spans all clock regions covered by the RP Pblock.

Figure 3. Device View with Highlighted Boundary Clock Net

For the preceding example, you can use the following methods to avoid suboptimal use of boundary clock nets:

Eliminate the boundary clock net for the CE register by moving the BUFGCE that drives the register into the RM. The CE register clock then becomes an RM clock that does not span the entire RM clock region, which reduces insertion delay.
Set the CE_TYPE property to HARDSYNC on the gated BUFGCE. This disables the CE timing arc and enables an internal synchronizer to synchronize the CE signal to the BUFGCE input clock. For more information on this property, see the Versal ACAP Clocking Resources Architecture Manual (AM003).
Important: Because it takes three to four clock cycles to synchronize the CE signal, you must ensure that the design can handle this additional latency caused by enabling the BUFGCE-driven clock.
Reduce the RP Pblock size to reduce the clock insertion delay for the CE register clock.
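
The second and third methods above can be sketched as constraints. The instance name rp_inst/gated_bufgce, the Pblock name pblock_rp, and the clock region range are hypothetical and only illustrate the form of the commands. The sketch also assumes that CE_TYPE is applied as a cell property in XDC rather than as a parameter in the RTL instantiation.

```tcl
# Method 2: enable the internal CE synchronizer on the gated BUFGCE.
# Hypothetical instance name; replace with the BUFGCE in your design.
set_property CE_TYPE HARDSYNC [get_cells rp_inst/gated_bufgce]

# Method 3: shrink the RP Pblock so that the boundary clock covers fewer clock regions.
# Hypothetical Pblock name and clock region range.
resize_pblock [get_pblocks pblock_rp] -remove {CLOCKREGION_X0Y4:CLOCKREGION_X3Y4}
```

With HARDSYNC, the design must still tolerate the three to four clock cycles of CE synchronization latency noted above. A Pblock change would need to be made before the first implementation, because the boundary clock net tracks are locked afterward.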