Isolating the Data Error

Versal Adaptive SoC Soft DDR4 SDRAM Memory Controller LogiCORE IP Product Guide (PG353)

Document ID: PG353
Release Date: 2023-10-18
Version: 1.0 English

Whether you are using the Advanced Traffic Generator or the user design, the first step in debugging data errors is to isolate when and where they occur. To do this, the expected data and the actual data must be known and compared. When examining the data errors, identify the following:

  • Are the errors bit or byte errors?
    • Are errors seen on data bits belonging to certain DQS groups?
    • Are errors seen on specific DQ bits?
  • Is the data shifted, swapped, or garbage?
  • Are errors seen on accesses to certain addresses, banks, or ranks of memory?
    • For designs that can support multiple varieties of DIMM modules, all possible address and bank bit combinations should be supported.
  • Do the errors only occur for certain data patterns or sequences?
    • This can indicate a shorted or open connection on the PCB. It can also indicate a simultaneous switching output (SSO) or crosstalk issue.
  • Determine the frequency and reproducibility of the error
    • Does the error occur on every calibration/reset?
    • Does the error occur at specific temperature or voltage conditions?
  • Determine if the error is correctable
    • Try rewriting, rereading, resetting, and recalibrating to see whether the error clears.
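
The comparison of expected and actual data described above can be sketched in a short script. This is only an illustrative sketch, not part of the IP or its tools: the function name `classify_errors`, the 64-bit data width, and the mapping of DQ bits [8n+7:8n] to DQS group n are assumptions. The real DQ-to-DQS grouping depends on the memory device width and the board pinout, so adjust `dq_width` and `bits_per_group` to match the design.

```python
from collections import Counter


def classify_errors(expected, actual, dq_width=64, bits_per_group=8):
    """Tally failing DQ bits and DQS groups between expected and actual data.

    expected, actual: equal-length sequences of ints, one value per data beat.
    dq_width: number of DQ bits per beat (assumed 64 for this sketch).
    bits_per_group: DQ bits per DQS group (assumed 8; hypothetical mapping
        where DQ[8n+7:8n] belongs to group n).
    """
    if len(expected) != len(actual):
        raise ValueError("expected and actual must have the same length")

    bit_errors = Counter()    # DQ bit index -> number of beats where it failed
    group_errors = Counter()  # DQS group index -> number of beats where it failed
    failing_beats = []        # indices of beats with at least one bad bit
    mask = (1 << dq_width) - 1

    for beat, (exp, act) in enumerate(zip(expected, actual)):
        diff = (exp ^ act) & mask  # 1 bits mark mismatching DQ positions
        if diff == 0:
            continue
        failing_beats.append(beat)
        groups_hit = set()
        for bit in range(dq_width):
            if (diff >> bit) & 1:
                bit_errors[bit] += 1
                groups_hit.add(bit // bits_per_group)
        for group in groups_hit:
            group_errors[group] += 1

    return bit_errors, group_errors, failing_beats


# Made-up example data: beats 1 and 2 each have DQ bit 8 flipped.
expected = [0x00000000FFFFFFFF, 0xA5A5A5A5A5A5A5A5, 0x0123456789ABCDEF]
actual = [0x00000000FFFFFFFF, 0xA5A5A5A5A5A5A4A5, 0x0123456789ABCCEF]

bits, groups, beats = classify_errors(expected, actual)
print("failing DQ bits:", dict(bits))
print("failing DQS groups:", dict(groups))
print("failing beats:", beats)
```

In this made-up data set, every failing beat differs only in DQ bit 8, which points to a single-bit problem rather than a byte-wide one. If the counts were spread across all bits of one group, the DQS group would be the more likely suspect. Logging the address, bank, and rank alongside each failing beat extends the same tally to the address-related questions in the list.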

The next step is to isolate whether the data corruption is due to writes or reads.