Controlling Delays

Versal Adaptive SoC SelectIO Resources Architecture Manual (AM010)

Document ID: AM010
Release Date: 2024-03-18
Revision: 1.5 English

Input and output delays can be changed through the PL, as shown in the following table.
Table 1. Controlling Input and Output Delays
RX_EN_VTC/TX_EN_VTC  LD  CE  INC  Effect on Delay Line
0                    0   0   0    Stays the same
0                    0   0   1    Stays the same
0                    0   1   0    Decrement by 1 tap
0                    0   1   1    Increment by 1 tap
0                    1   0   0    Load value from CNTVALUEIN
0                    1   0   1    Load value from CNTVALUEIN
0                    1   1   0    Not supported
0                    1   1   1    Add value from CNTVALUEIN to the current CNTVALUEOUT value
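
As an illustration only, the table can be read as a small behavioral model of one delay line. The following Python sketch is not AMD code: the function name next_tap_count, the 0 to 511 range check (inferred from the 9-bit per-NIBBLESLICE count fields), and the choice to raise an error rather than model wrap or saturation are assumptions made for the example.

```python
MAX_TAPS = 511  # assumption: 9-bit tap count per NIBBLESLICE


def next_tap_count(current: int, cntvaluein: int, ld: int, ce: int, inc: int) -> int:
    """Return the tap count after one update, following Table 1 (EN_VTC = 0)."""
    code = (ld, ce, inc)
    if code in ((0, 0, 0), (0, 0, 1)):
        new = current                      # stays the same
    elif code == (0, 1, 0):
        new = current - 1                  # decrement by 1 tap
    elif code == (0, 1, 1):
        new = current + 1                  # increment by 1 tap
    elif code in ((1, 0, 0), (1, 0, 1)):
        new = cntvaluein                   # load value from CNTVALUEIN
    elif code == (1, 1, 1):
        new = current + cntvaluein         # add CNTVALUEIN to current value
    else:
        raise ValueError("{LD, CE, INC} = 110 is not supported")
    # Assumption: the model rejects out-of-range results instead of guessing
    # how the hardware behaves at the ends of the delay line.
    if not 0 <= new <= MAX_TAPS:
        raise ValueError(f"tap count {new} is outside 0..{MAX_TAPS}")
    return new
```

For example, next_tap_count(100, 20, ld=1, ce=1, inc=1) returns 120, and the same call with ld=1, ce=0 returns 20.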

The following table shows how the delay-related signals are mapped to each NIBBLESLICE.

Table 2. Delay Control Signals NIBBLESLICE Mapping
Port               NIBBLESLICE[5]      NIBBLESLICE[4]      NIBBLESLICE[3]      NIBBLESLICE[2]      NIBBLESLICE[1]      NIBBLESLICE[0]      Description
CE[5:0]            CE[5]               CE[4]               CE[3]               CE[2]               CE[1]               CE[0]               Control signal
INC[5:0]           INC[5]              INC[4]              INC[3]              INC[2]              INC[1]              INC[0]              Control signal
LD[5:0]            LD[5]               LD[4]               LD[3]               LD[2]               LD[1]               LD[0]               Control signal
RXTX_SEL[5:0]      RXTX_SEL[5]         RXTX_SEL[4]         RXTX_SEL[3]         RXTX_SEL[2]         RXTX_SEL[1]         RXTX_SEL[0]         Selects between the input and output delay line to apply delay line updates or report their tap value through CNTVALUEOUT
CNTVALUEIN[53:0]   CNTVALUEIN[53:45]   CNTVALUEIN[44:36]   CNTVALUEIN[35:27]   CNTVALUEIN[26:18]   CNTVALUEIN[17:9]    CNTVALUEIN[8:0]     Number of taps to be applied to the delay line
CNTVALUEOUT[53:0]  CNTVALUEOUT[53:45]  CNTVALUEOUT[44:36]  CNTVALUEOUT[35:27]  CNTVALUEOUT[26:18]  CNTVALUEOUT[17:9]   CNTVALUEOUT[8:0]    Number of taps currently used by delay line
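
The mapping above places the 9-bit tap count for NIBBLESLICE[x] at bits [9x+8:9x] of the 54-bit CNTVALUEIN and CNTVALUEOUT buses. The following Python sketch shows that indexing; the helper names cntvalue_field, pack_cntvaluein, and unpack_cntvalueout are hypothetical and exist only for this example.

```python
TAP_BITS = 9     # width of each per-NIBBLESLICE field in Table 2
NUM_SLICES = 6   # NIBBLESLICE[5:0]
TAP_MASK = (1 << TAP_BITS) - 1


def cntvalue_field(x: int) -> tuple:
    """Return (msb, lsb) of the CNTVALUEIN/CNTVALUEOUT field for NIBBLESLICE[x]."""
    if not 0 <= x < NUM_SLICES:
        raise ValueError("NIBBLESLICE index must be 0..5")
    lsb = TAP_BITS * x
    return lsb + TAP_BITS - 1, lsb


def pack_cntvaluein(taps_per_slice: list) -> int:
    """Pack six tap counts (index 0 = NIBBLESLICE[0]) into one 54-bit value."""
    if len(taps_per_slice) != NUM_SLICES:
        raise ValueError("expected six tap counts")
    bus = 0
    for x, taps in enumerate(taps_per_slice):
        if not 0 <= taps <= TAP_MASK:
            raise ValueError(f"tap count {taps} does not fit in {TAP_BITS} bits")
        bus |= taps << (TAP_BITS * x)
    return bus


def unpack_cntvalueout(bus: int) -> list:
    """Split a 54-bit CNTVALUEOUT value into six per-NIBBLESLICE tap counts."""
    return [(bus >> (TAP_BITS * x)) & TAP_MASK for x in range(NUM_SLICES)]
```

For example, cntvalue_field(2) returns (26, 18), which matches the CNTVALUEIN[26:18] field used for NIBBLESLICE[2].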

Table 3. Delay Attributes
Attribute          Description
CASCADE_<0-5>      Doubles the available delay in a NIBBLESLICE by cascading the input and output delays in the NIBBLESLICE. Only applicable for RX.
CRSE_DLY_EN        Enables coarse delays
DELAY_VALUE_<0-5>  Sets the initial delay for the input and output delays in a NIBBLESLICE. If CASCADE_x = TRUE, the max delay available in DELAY_VALUE_x doubles.

DELAY_VALUE_x sets the initial delay (in ps) of both the input and output delay in NIBBLESLICE[x]. While DELAY_VALUE_x is set in terms of time, the delay is ultimately applied to the delay lines in terms of taps. This makes DELAY_VALUE_x unique in that it is the only way to set a time value through an attribute for any of the delay lines.

When changing the value of input or output delays, consider the following:

  • DLY_RDY must be 1.
  • To avoid glitches, input and output delays can only be changed once every three CTRL_CLK cycles.
  • Input and output delay changes take effect one CTRL_CLK cycle after being reflected in CNTVALUEOUT.
  • Delays, regardless of the type, are always manifested in terms of taps.
  • If a nonzero DELAY_VALUE_x is set, the following equation estimates the number of taps required for a new time-based value of an input or output delay. Here, delay_old is the previous delay (ps) and delay_new is the desired delay (ps). The equation is not valid when DELAY_VALUE_x is zero: CNTVALUEIN can still be used to load taps, but the approximate time value of each tap is not known, so a new time-based delay cannot be calculated. BISC uses align_delay to compensate for the internal skew between the clock and data insertion delays of input paths to the first capture flip-flops. More information on align_delay can be found beneath the following waveforms and in Built-in Self-Calibration.
    CNTVALUEIN[NIBBLESLICE[x]] = delay_new * ((CNTVALUEOUT[NIBBLESLICE[x]] - align_delay) / delay_old)
  • Updating input and output delays through the register interface unit (RIU) takes one additional CTRL_CLK cycle compared to updating delays through the PL. When updating input or output delays through the RIU, RX_EN_VTC and TX_EN_VTC must be set to 1, LD must be set to 1, CE must be set to 0, and INC is a don't care. Note that only input and output delays can be updated through the PL.
  • If TBYTE_CTRL_# = PHY_WREN, the tristate NIBBLESLICE is capable of applying a delay to the tristate signal. To change the amount of delay applied in the tristate NIBBLESLICE, use the TRISTATE_ODLY register within the RIU.
Important: Jumps greater than eight taps in output delays might result in data glitching. If a jump greater than eight taps in an output delay is desired, gate the TX datapath through PHY_WREN (if TX_GATING = ENABLE) or stop the XPLL.CLKOUTPHY that is connected to XPHY.PLL_CLK.
Important: If voltage and temperature compensation (VTC) is desired on an output delay, the input delay in the same NIBBLESLICE as the output delay must be set to the same tap value as the output delay.
Important: Bit alignment is not guaranteed on output delays if the delays are equal to or greater than 1.5 UI.
Important: For interfaces with REFCLK_FREQUENCY below 500 MHz, DELAY_VALUE_<0-5> and VTC are not supported.
Important: NIBBLESLICE[0] must be used in nibbles configured as TX-only for proper output delay calibration.
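
The tap-estimation equation in the list above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name estimate_cntvaluein and the rounding to the nearest whole tap are assumptions, and the numbers in the usage note are invented for the example.

```python
def estimate_cntvaluein(delay_new_ps: float,
                        delay_old_ps: float,
                        cntvalueout_taps: int,
                        align_delay_taps: int = 0) -> int:
    """Estimate the taps to load for a new time-based delay.

    Implements:
        CNTVALUEIN = delay_new * ((CNTVALUEOUT - align_delay) / delay_old)
    align_delay_taps can be left at 0 for output delays.
    """
    if delay_old_ps <= 0:
        # The equation is only valid when a nonzero DELAY_VALUE_x was set.
        raise ValueError("delay_old_ps must be greater than zero")
    taps_per_ps = (cntvalueout_taps - align_delay_taps) / delay_old_ps
    # Assumption: round to the nearest tap; the source does not specify rounding.
    return round(delay_new_ps * taps_per_ps)
```

As a made-up example, if CNTVALUEOUT reports 110 taps for a 500 ps delay with an align_delay of 10 taps, estimate_cntvaluein(750, 500, 110, 10) evaluates 750 * (100 / 500) and returns 150.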

The following waveform shows how to update a single input or output delay. The waveform updates NIBBLESLICE[2] but applies to any NIBBLESLICE (the bus indices for each signal other than CTRL_CLK change accordingly).

Figure 1. Updating a Single Input or Output Delay Line
Start
The goal of this waveform is to update the input delay, implying RXTX_SEL[2] = 0, through CNTVALUEIN[26:18]. {LD[2], CE[2], INC[2]} = 10x tells the delay line to use CNTVALUEIN to update the delay line, which is the starting point of this waveform.
A
A three-cycle wait is required between updates to the same delay line (in this case, the input delay of NIBBLESLICE[2]) to prevent glitching.
B
The reported delay might not match the actual delay being applied. This only affects a delay line that is changed, so the other delay lines in the nibble are still reported accurately.
C
The three-cycle wait from A has been met, so from C onward the input delay (of NIBBLESLICE[2]) can be changed without glitching. Note that B overlaps with C for one cycle.
After C
The input delay (of NIBBLESLICE[2]) is now reported accurately, and the delay line update is considered complete.
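
The three-cycle spacing rule described above can be checked in a testbench or planning script before a sequence of updates is issued. The following Python sketch is one possible check under stated assumptions: updates are described as (cycle, nibbleslice, rxtx_sel) tuples, "three-cycle wait" is interpreted as a difference of at least three CTRL_CLK cycles between updates to the same delay line, and the function name find_spacing_violations is hypothetical.

```python
MIN_UPDATE_SPACING = 3  # CTRL_CLK cycles between updates to the same delay line


def find_spacing_violations(updates):
    """Return the updates that follow a previous update to the same line too soon.

    updates: iterable of (cycle, nibbleslice, rxtx_sel) tuples, where
    rxtx_sel = 0 selects the input delay and 1 selects the output delay.
    """
    last_cycle = {}   # (nibbleslice, rxtx_sel) -> cycle of most recent update
    violations = []
    for cycle, nibbleslice, rxtx_sel in sorted(updates):
        line = (nibbleslice, rxtx_sel)
        if line in last_cycle and cycle - last_cycle[line] < MIN_UPDATE_SPACING:
            violations.append((cycle, nibbleslice, rxtx_sel))
        # The latest update is recorded even when it violates the rule, so the
        # next update is measured from the most recent change.
        last_cycle[line] = cycle
    return violations
```

Different delay lines are tracked independently, so an update to the output delay of NIBBLESLICE[2] one cycle after an update to its input delay is not flagged by this check.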

The following waveform shows how to update cascaded delays. The waveform updates NIBBLESLICE[2] but applies to any NIBBLESLICE (the bus indices for each signal other than CTRL_CLK change accordingly).

Figure 2. Updating a Cascaded Delay Line
Start
The goal of this waveform is to update a cascaded delay line, which is composed of the input delay and output delay of a NIBBLESLICE. To update a cascaded delay line, update both the input and output delays that compose the cascaded delay. Each delay line should be loaded with half of the total desired delay (New delay/2). In this case, the input delay (RXTX_SEL[2] = 0) is updated first, followed by the output delay (RXTX_SEL[2] = 1). Like the waveform in Figure 1, this waveform updates the delay through CNTVALUEIN[26:18]. {LD[2], CE[2], INC[2]} = 10x tells the delay line to use CNTVALUEIN to update the delay line, which is the starting point of this waveform.
A (input delay/output delay)
A three-cycle wait is required between updates to the same delay line (in this case the <input/output> delay of NIBBLESLICE[2]) to prevent glitching.
B (input delay/output delay)
The reported <input/output> delay might not match the actual delay being applied. This only affects a delay line that is changed, so the other delay lines in the nibble are still reported accurately. Note, however, that during the three cycles shared between A (input delay) and B (output delay), both the input and output delays of NIBBLESLICE[2] can report a delay that does not match the actual delay being applied.
C (input delay/output delay)
The three-cycle wait from A has been met, so from C onward, the <input/output> delay (of NIBBLESLICE[2]) can be changed without glitching.
After C (input delay)
The input delay (of NIBBLESLICE[2]) is now reported accurately, and its delay update is complete. However, the output delay is not finished updating yet.
After C (output delay)
The input and output delays (of NIBBLESLICE[2]) are now reported accurately, and the cascaded delay line update is considered complete.
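
The half-and-half loading described for Figure 2 can be sketched as follows. This Python example is illustrative only: the function names are hypothetical, the 511-tap limit per delay line is inferred from the 9-bit CNTVALUEIN field, and giving the extra tap to the input delay when the total is odd is an assumption that the source does not address.

```python
MAX_TAPS_PER_LINE = 511  # assumption: 9-bit CNTVALUEIN field per NIBBLESLICE


def split_cascaded_delay(total_taps: int) -> tuple:
    """Split a cascaded delay into (input_taps, output_taps), half in each line."""
    if total_taps < 0:
        raise ValueError("total_taps must not be negative")
    output_taps = total_taps // 2
    # Assumption: with an odd total, the remaining tap goes to the input delay.
    input_taps = total_taps - output_taps
    if input_taps > MAX_TAPS_PER_LINE:
        raise ValueError("each half must fit in a single delay line")
    return input_taps, output_taps


def cascaded_load_sequence(total_taps: int) -> list:
    """Return (rxtx_sel, taps) loads in the Figure 2 order: input, then output."""
    input_taps, output_taps = split_cascaded_delay(total_taps)
    return [(0, input_taps), (1, output_taps)]
```

For a total of 200 taps, cascaded_load_sequence(200) returns [(0, 100), (1, 100)]: the input delay (RXTX_SEL = 0) is loaded with 100 taps first, followed by the output delay (RXTX_SEL = 1).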

Align_delay, the result of the first step of BISC, can be estimated by setting DELAY_VALUE_x to zero. Reading CNTVALUEOUT of the input delay (RXTX_SEL[x] = 0) reports align_delay for NIBBLESLICE[x]. This can be used as an estimate of align_delay for other NIBBLESLICEs within the nibble. A few other points on align_delay:

  • It does not appear in simulation.
  • It is calculated on a per-NIBBLESLICE basis.
  • It is calculated once and does not change unless the XPHY is reset. Updating an input delay does not change the value of align_delay. However, if the update would adjust the delay line to fewer taps than align_delay, the delay is loaded, and align_delay's original value is preserved. For example, if align_delay is 10 taps, and 5 taps are loaded through CNTVALUEIN, CNTVALUEOUT would reflect 5 taps (note that these taps would not be compensated for by VTC because they are less than align_delay). If the delay line is later updated to 20 taps, VTC would compensate for 10 taps, because align_delay would still be its original value of 10 taps (so 20 taps – 10 taps = 10 taps compensated for by VTC).
  • It only exists for input delays and can be considered zero for output delays or when BISC is not used (SELF_CALIBRATE = DISABLE).
  • When an input delay is reported through CNTVALUEOUT, it includes align_delay. For example, if DELAY_VALUE_x ends up being 100 taps and align_delay is 10 taps, CNTVALUEOUT would report 110 taps for that NIBBLESLICE.
    • Note that because CNTVALUEOUT reports the total taps in a delay line, it initially reports the taps from align_delay + DELAY_VALUE_x. However, if the user updates the input delay line (e.g., loading a delay through the PL), the update is treated as the total number of taps desired. Thus, CNTVALUEOUT would report the same number of taps from the user's delay update with align_delay being a part of those taps (but not in addition to them). See the example in Figure 3 for more clarity.
  • It is not compensated for by VTC because align_delay tracks the strobe path. As VT conditions change, the strobe propagation delay also changes.
  • VTC compensates for the taps in the delay line minus align_delay.
  • VTC will not compensate below align_delay. For example, if align_delay is 10 taps, VTC will never compensate for less than 10 taps. In this sense, align_delay acts as a "floor" for VTC.
  • In some cases, particularly full-bank designs, align_delay might reach its ceiling and not operate properly. To remedy this, have the strobe/capture clock enter on XPHY 4.
    Important: If all XPHY nibbles in a bank comprise an interface (a "full-bank design"), align_delay can be as large as 300 taps. Because align_delay is stored in the input delay of each NIBBLESLICE, align_delay reduces the available delay of input delays.
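
The tap accounting described in the list above can be expressed as two small helpers. This Python sketch only restates the arithmetic from the examples in the text; the function names are hypothetical, and it does not model how VTC adjusts the compensated taps over time.

```python
def initial_cntvalueout(delay_value_taps: int, align_delay_taps: int) -> int:
    """Taps initially reported for an input delay: DELAY_VALUE_x taps plus align_delay."""
    return delay_value_taps + align_delay_taps


def vtc_compensated_taps(total_taps: int, align_delay_taps: int) -> int:
    """Taps that VTC compensates: the delay line total minus align_delay.

    If the line holds fewer taps than align_delay, nothing is compensated.
    """
    return max(total_taps - align_delay_taps, 0)
```

Using the numbers from the text with an align_delay of 10 taps: initial_cntvalueout(100, 10) gives 110, vtc_compensated_taps(5, 10) gives 0, and vtc_compensated_taps(20, 10) gives 10.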

The following figure illustrates the relationship between align_delay, DELAY_VALUE_#, and VTC. It starts on the left side with the input delay being composed of 100 taps from DELAY_VALUE_# and 10 taps from align_delay, for a total of 110 taps. If the user updates the delay line, the taps from align_delay are preserved as part of the update, as shown in the rightmost part of the figure.

Figure 3. Align Delay Value VTC Relationship

To find the approximate time value (ps) of a single tap for a given nibble using one NIBBLESLICE:

  1. Set DELAY_VALUE_x to a nonzero value.
  2. Read CNTVALUEOUT of the output delay (RXTX_SEL[x] = 1) of NIBBLESLICE[x]. Because align_delay does not apply to output delays, this value contains only the taps implemented by DELAY_VALUE_x. The approximate value of a single tap (ps/tap) for that nibble is then:
    Time value of a single tap = DELAY_VALUE_x / Output delay value of NIBBLESLICE[x]

Another way to find the approximate time value (ps) of a single tap for a given nibble is:

  1. Set DELAY_VALUE_a to zero and DELAY_VALUE_b to a nonzero value. CNTVALUEOUT of the input delays (RXTX_SEL[a,b] = 0) will be:
    • CNTVALUEOUT[NIBBLESLICE[a]] = align_delay
    • CNTVALUEOUT[NIBBLESLICE[b]] = DELAY_VALUE_b (in taps) + align_delay
  2. The approximate value of a single tap (ps/tap) for that nibble is then:
    Time value of a single tap = DELAY_VALUE_b / (CNTVALUEOUT[NIBBLESLICE[b]] - CNTVALUEOUT[NIBBLESLICE[a]])
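
Both estimation methods above reduce to dividing a programmed delay in ps by the number of taps that implement it, so that the result is in ps/tap. The following Python sketch shows that arithmetic; the function names are hypothetical and the numbers in the usage note are invented for illustration.

```python
def ps_per_tap_from_output(delay_value_ps: float, output_taps: int) -> float:
    """Method 1: DELAY_VALUE_x (ps) divided by the output delay CNTVALUEOUT (taps)."""
    if output_taps <= 0:
        raise ValueError("output_taps must be greater than zero")
    return delay_value_ps / output_taps


def ps_per_tap_from_inputs(delay_value_b_ps: float,
                           cntvalueout_b_taps: int,
                           cntvalueout_a_taps: int) -> float:
    """Method 2: DELAY_VALUE_b (ps) divided by the tap difference of two input delays.

    NIBBLESLICE[a] has DELAY_VALUE_a = 0, so its CNTVALUEOUT is align_delay only.
    Subtracting it removes align_delay from the NIBBLESLICE[b] reading.
    """
    delay_taps = cntvalueout_b_taps - cntvalueout_a_taps
    if delay_taps <= 0:
        raise ValueError("NIBBLESLICE[b] must report more taps than NIBBLESLICE[a]")
    return delay_value_b_ps / delay_taps
```

As a made-up example, with DELAY_VALUE_b = 500 ps, CNTVALUEOUT[NIBBLESLICE[b]] = 110, and CNTVALUEOUT[NIBBLESLICE[a]] = 10, ps_per_tap_from_inputs(500, 110, 10) returns 5.0 ps per tap, the same value that ps_per_tap_from_output(500, 100) returns.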

Cascaded delays, implying CASCADE_x = TRUE, render the TX datapath of NIBBLESLICE[x] inoperable. When using cascaded delays, consider the following:

  • AMD recommends storing half of the total delay in the input delay and the other half in the output delay.
  • The insertion delay between clock and data is not fully compensated for in NIBBLESLICEs with CASCADE_x = TRUE. This typically results in a ~65 ps difference between the clock and data. To account for this, take ~65 ps/2 and add that value to both the p-quarter and n-quarter delays.