Write Leveling - 1.0 English

Versal Adaptive SoC Programmable Network on Chip and Integrated Memory Controller 1.0 LogiCORE IP Product Guide (PG313)

Document ID: PG313
Release Date: 2023-11-01
Version: 1.0 English

Calibration Overview

The NoC memory modules use a fly-by topology on clocks, address, commands, and control signals to improve signal integrity. This topology causes a skew between DQS and CK at each memory device on the module. Write leveling is a feature in NoCs that allows the controller to adjust each write DQS phase independently with respect to the clock (CK) forwarded to the DDR4 device to compensate for this skew and meet the tDQSS specification.

During write leveling, DQS is driven by the adaptive SoC NoC IP and DQ is driven by the DRAM device to provide feedback. To start write leveling, an MRS command is sent to the DRAM to enable the feedback feature, and another MRS command is sent at the end to disable write leveling. The following figure shows the block diagram for the write leveling implementation.

Figure 1. Write Leveling Block Diagram

Write leveling calibration logic is divided into three parts as shown in the flowchart.

Figure 2. Write Leveling Calibration Flow Chart

Detecting the Rising Edge of CK

The XPHY is set up for write leveling by setting various attributes in the RIU. WL_TRAIN is set to decouple the DQS and DQ when driving out the DQS. This allows the XPHY to capture the returning DQ from the DRAM. Because the DQ is returned without the returning DQS strobe for capture, the RX_GATE is set to 0 in the XPHY to disable DQS gate operation.

DQS is delayed with ODELAY and coarse delay (WL_DLY_CRSE [12:9] applies to all bits in a nibble) provided in the RIU WL_DLY_RNKx register. The WL_DLY_FINE [8:0] location in the RIU is used to store the ODELAY value for write leveling for a given nibble (used by the XPHY when switching ranks).
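The field layout described above can be illustrated with a short sketch. It is not taken from the IP. The helper names `pack_wl_dly` and `unpack_wl_dly` are hypothetical, and only the bit positions (coarse delay in [12:9], fine delay in [8:0]) come from the text.

```python
# Bit positions from the text: WL_DLY_CRSE is [12:9], WL_DLY_FINE is [8:0].
# The helper names below are hypothetical and exist only for illustration.
WL_DLY_FINE_MASK = 0x1FF  # 9 bits, [8:0]
WL_DLY_CRSE_SHIFT = 9
WL_DLY_CRSE_MASK = 0xF    # 4 bits, [12:9]


def pack_wl_dly(coarse: int, fine: int) -> int:
    """Combine a coarse and a fine delay into one WL_DLY_RNKx-style value."""
    if not 0 <= coarse <= WL_DLY_CRSE_MASK:
        raise ValueError("coarse delay does not fit in 4 bits")
    if not 0 <= fine <= WL_DLY_FINE_MASK:
        raise ValueError("fine delay does not fit in 9 bits")
    return (coarse << WL_DLY_CRSE_SHIFT) | fine


def unpack_wl_dly(value: int) -> tuple[int, int]:
    """Split a WL_DLY_RNKx-style value into (coarse, fine)."""
    coarse = (value >> WL_DLY_CRSE_SHIFT) & WL_DLY_CRSE_MASK
    fine = value & WL_DLY_FINE_MASK
    return coarse, fine
```

For example, `pack_wl_dly(3, 0x45)` produces a value whose upper field holds 3 and whose lower nine bits hold 0x45, and `unpack_wl_dly` reverses the operation.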

A train of DQS pulses is output by the adaptive SoC to the DRAM to detect the relationship of CK and DQS at the DDR4 memory device. DQS is delayed using the coarse taps in unit tap increments until a 0 to 1 transition is detected on the feedback DQ input. After each coarse tap increment, the algorithm searches for the pattern 0X1 to detect the rising edge of CK.

If the algorithm never sees the pattern 0X1 using the coarse taps, the ODELAY of the DQS is set to an offset value (initially 45°, stored as BRAM_WRLVL_OFFSET_RANK*_BYTE*) and the coarse taps are checked again starting from 0. The algorithm might need this step if the noise region is close to 90° or there is a large amount of duty cycle distortion (DCD). If the transition is still not found, the offset is halved and the algorithm tries again. If the pattern 0X1 is still not seen after all ODELAY and coarse tap settings have been tried, a write leveling calibration error is issued.
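The coarse search with the offset fallback can be sketched as a small software model. This is only an illustration under stated assumptions, not the calibration firmware. The `sample` callback, the function name, and the choice to keep halving the offset until it reaches zero are all hypothetical. The pattern 0X1 is modeled here as three consecutive coarse-tap samples where the first is 0, the last is 1, and the middle value is ignored.

```python
from typing import Callable


def find_rising_edge(
    sample: Callable[[int, int], int],  # hypothetical: (coarse, odelay) -> DQ feedback bit
    max_coarse: int,
    initial_offset: int,
) -> tuple[int, int]:
    """Return (coarse tap of the last stable 0, ODELAY offset used).

    Assumption: the search first runs with no ODELAY offset, then with
    initial_offset, then with successively halved offsets.
    """
    offsets = [0]
    offset = initial_offset
    while offset > 0:
        offsets.append(offset)
        offset //= 2

    for odelay in offsets:
        history: list[int] = []
        for coarse in range(max_coarse + 1):
            history.append(sample(coarse, odelay))
            # Pattern 0X1: a 0, any value, then a 1 on consecutive coarse taps.
            if len(history) >= 3 and history[-3] == 0 and history[-1] == 1:
                # Revert to the coarse tap where the 0 was seen.
                return coarse - 2, odelay

    raise RuntimeError("rising edge of CK not found (CAL_ERROR 0x9)")
```

In this model, a feedback function that returns 0 for coarse taps 0 through 3 and 1 from tap 4 onward yields coarse tap 2 with no offset, because the pattern is first matched on taps 2, 3, and 4.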

The number of ODELAY taps used is determined by the initial alignment of the DQS and CK and the size of this noise region as shown in the figure.

Figure 3. Worst Case ODELAY Taps (Maximum and Minimum)

After the rising edge of CK is found, the coarse tap is reverted to the last stable 0 seen just before the rising edge, and this coarse tap value is saved as BRAM_WRLVL_CRSE_STG1_RANK*_BYTE*.

Stable 0 Confirmation Before Rising Edge of CK

A stable 0 before the rising edge of CK must be confirmed before the noise window detection phase starts. This confirmation ensures that the fine tap increments start to the left of the leftmost edge of the noise region.

To confirm the stable 0, the algorithm increments a few fine taps. If every one of these fine tap settings still returns a stable 0, DQS is in the stable 0 region before the noise of the 0 to 1 transition. If a stable 0 is not found this way, the coarse tap is decremented by one and the confirmation is repeated. If a stable 0 is still not found after decrementing the coarse tap, an error is reported.
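The confirmation step can be modeled in a few lines. The sketch below is an assumption-laden illustration: the `sample` callback, the function name, and the `num_fine_checks` parameter are hypothetical, and the text does not state how many fine taps are actually checked.

```python
from typing import Callable


def confirm_stable_zero(
    sample: Callable[[int, int], int],  # hypothetical: (coarse, fine) -> DQ feedback bit
    coarse: int,
    num_fine_checks: int,
) -> int:
    """Return the coarse tap at which a stable 0 is confirmed.

    The starting coarse tap is tried first. If any of the checked fine
    taps returns a non-zero value, one lower coarse tap is tried once.
    """
    for candidate in (coarse, coarse - 1):
        if candidate < 0:
            break  # cannot decrement below coarse tap 0
        if all(sample(candidate, fine) == 0 for fine in range(num_fine_checks)):
            return candidate

    raise RuntimeError("stable 0 not confirmed (CAL_ERROR 0xA)")
```

Only a single coarse decrement is attempted, matching the description above. A model that retried further would hide the failure that error code 0xA is meant to expose.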

Noise Window Detection and Centering

The fine taps are incremented until a non-zero value is returned. This tap is recorded as the left edge of the uncertain (noise) region (BRAM_WRLVL_FINE_LEFT_RANK*_BYTE*). The fine taps are then incremented further until all samples taken return a 1. This tap is recorded as the right edge of the uncertain region (BRAM_WRLVL_FINE_RIGHT_RANK*_BYTE*). The various write leveling regions are shown in the following figure.
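The edge search over the fine taps can be sketched as follows. This is a simplified model rather than the actual calibration code. The `sample_many` callback (which returns several DQ feedback samples for one fine tap setting) and the function name are hypothetical.

```python
from typing import Callable, Sequence


def find_noise_window(
    sample_many: Callable[[int], Sequence[int]],  # hypothetical: fine tap -> DQ samples
    max_fine: int,
) -> tuple[int, int]:
    """Return (left_edge, right_edge) of the noise region in fine taps."""
    left_edge = None
    for fine in range(max_fine + 1):
        samples = sample_many(fine)
        # Left edge: first tap where any sample is non-zero.
        if left_edge is None and any(samples):
            left_edge = fine
        # Right edge: first tap at or after the left edge where all samples are 1.
        if left_edge is not None and all(samples):
            return left_edge, fine

    raise RuntimeError("noise window not found within the available taps (CAL_ERROR 0xB)")
```

If the feedback switches cleanly from all 0 to all 1 at a single tap, both edges fall on that tap and the noise width is zero in this model.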

Figure 4. Write Leveling Regions

After both edges of the noise region are found, DQS is centered in the noise region. The final fine tap is computed as the midpoint of the uncertain region: odelay - MIN_VALID_CNT - ((right_edge_taps - left_edge_taps) / 2).
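A worked example may make the centering expression easier to follow. The sketch below rests on two assumptions that the text does not confirm: that `odelay` is the fine tap value at which the search stopped, MIN_VALID_CNT taps beyond the right edge, and that the division is an integer division. The function name is hypothetical.

```python
def final_fine_tap(
    odelay: int,
    min_valid_cnt: int,
    left_edge_taps: int,
    right_edge_taps: int,
) -> int:
    """Evaluate odelay - MIN_VALID_CNT - ((right_edge_taps - left_edge_taps) / 2).

    Assumption: integer (floor) division is used for the half-width term.
    """
    half_width = (right_edge_taps - left_edge_taps) // 2
    return odelay - min_valid_cnt - half_width


# Example values (illustrative only): the noise region spans taps 10 to 15,
# and the search stopped 4 taps past the right edge.
center = final_fine_tap(odelay=19, min_valid_cnt=4, left_edge_taps=10, right_edge_taps=15)
```

With these example values, `odelay - min_valid_cnt` lands back on the right edge (tap 15), and subtracting the half-width of 2 gives tap 13, which lies inside the 10 to 15 noise region.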

After the final ODELAY setting is found, the value is loaded into the WL_DLY_RNKx[8:0] field in the RIU and saved as BRAM_WRLVL_FINE_FINAL_RANK*_BYTE*. The same value is also loaded into the ODELAY registers for the DQ and the DM so that they match the DQS.

After write leveling, an MRS command is sent to the DRAM to disable the write leveling feature, WL_TRAIN is set back to its default OFF setting, and the DQS gate is turned back on to allow capture of the DQ with the returning DQS strobe.

CAL_ERROR Decode for Write Leveling Calibration

The status of Write Leveling can also be determined by decoding the CAL_ERROR result according to the following table.

Table 1. CAL_ERROR Decode for Write Leveling Calibration

| Error Code | Description | Recommended Debug Step |
|---|---|---|
| 0x9 | Write leveling failed to find rising edge using coarse and fine offset. | For failures on the second rank of a multi-rank DIMM, check if the DIMM uses mirroring and make sure the design generated matches what the DIMM expects. Check the pinout and connections of the address/control bus, specifically A7 which is used to set the write leveling mode in the DRAM. |
| 0xA | Write leveling failed in stable 0 confirmation stage. | Check the BISC values in XSDB (for the nibbles associated with the DQS) to determine the 90° offset value in taps. |
| 0xB | Write leveling reached maximum taps to find noise width by incrementing DQS ODELAY. | Check the BISC values in XSDB (for the nibbles associated with the DQS) to determine the 90° offset value in taps. |
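
When CAL_ERROR values are post-processed in a script, the decode in Table 1 can be captured as a simple lookup. The sketch below is illustrative only. The dictionary and function names are hypothetical, and the descriptions are copied from Table 1.

```python
# Hypothetical lookup built from Table 1; not part of any vendor tool.
WRLVL_CAL_ERRORS = {
    0x9: "Write leveling failed to find rising edge using coarse and fine offset.",
    0xA: "Write leveling failed in stable 0 confirmation stage.",
    0xB: "Write leveling reached maximum taps to find noise width by incrementing DQS ODELAY.",
}


def decode_wrlvl_cal_error(code: int) -> str:
    """Return the Table 1 description for a write leveling CAL_ERROR code."""
    return WRLVL_CAL_ERRORS.get(
        code, f"0x{code:X}: not a write leveling error code listed in Table 1"
    )
```

Codes outside the three listed values fall through to a generic message, because this table covers only the write leveling stage.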

Table 2. Write Leveling Registers

| Register Name | Quantity | Description |
|---|---|---|
| Fx_WRLVL_CRSE_STG1 | Rank and byte | Coarse tap setting after finding the stable 1 |
| Fx_WRLVL_OFFSET | Rank and byte | 90° offset in fine taps |
| Fx_WRLVL_CRSE_FINAL | Rank and byte | Stable 0 in coarse taps |
| Fx_WRLVL_NOISE_FCRSE | Rank and byte | Output delay value for valid-to-noise |
| Fx_WRLVL_FINE_LEFT | Rank and byte | Fine tap setting for the left side of the noise window |
| Fx_WRLVL_FINE_RIGHT | Rank and byte | Fine tap setting for the right side of the noise window |
| Fx_WRLVL_FINE_FINAL | Rank and byte | Final fine tap setting |