Hardware Issues - 1.0 English

AXI High Bandwidth Memory Controller LogiCORE IP Product Guide (PG276)

Document ID: PG276
Release Date: 2022-11-02
Version: 1.0 English

If there are issues with the AXI HBM Controller IP in hardware, such as the HBM status in the Vivado Hardware Manager not showing Complete or apb_complete_x never asserting, review the following:

Hardware Manager Status Not Enabled
If the HBM status in the Vivado Hardware Manager shows as Not Enabled, it is likely that there is a connection problem between the Vivado Hardware Manager and the AXI HBM Controller IP.

Ensure that the Enable Hardware Debug Interface option is selected in the Reliability Options tab of the AXI HBM Controller IP configuration options in the Vivado IDE, as shown in the following figure.

Figure 1. Enable Hardware Debug Interface

Ensure that there is a valid debug hub clock connected to the debug hub. The debug hub is a core generated by the Vivado place and route tools whenever any debug cores are detected in the design; see AR 72607 for more information.
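As a minimal sketch, constraints of the following form connect a clock to the debug hub. It assumes the automatically inserted debug hub core has the default name dbg_hub. The net name clk_100_free_run and the 100 MHz frequency are hypothetical placeholders, not values taken from this guide.

```tcl
# Hypothetical net name: replace clk_100_free_run with a free-running
# clock net that exists in your design.
connect_debug_port dbg_hub/clk [get_nets clk_100_free_run]

# Tell the debug hub the frequency of the connected clock (assumed 100 MHz here).
set_property C_CLK_INPUT_FREQ_HZ 100000000 [get_debug_cores dbg_hub]
set_property C_ENABLE_CLK_DIVIDER false [get_debug_cores dbg_hub]
```

These lines belong in an XDC file that is applied to the synthesized design, so that the clock is connected before implementation runs.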

Check the resets and reset polarities for the clock generators and APB ports in the design.

Hardware Manager Status Config Error
If the HBM status in the Vivado Hardware Manager is Config Error, this might be caused by any of the following issues:
  • A clocking issue.
  • A timing issue.
  • The IP is using the incorrect MEM files for the hardware configuration.
  • There is inadequate power for the HBM stacks.
  • Error correction is enabled with the ECC Initialization option.

A Config Error usually means that the Vivado Hardware Manager can read data from the HBM, but something is preventing the IP from working correctly or the debug hub cannot reliably access the APB registers.

Check the IP configuration in the tools and verify the clock sources on the board. Ensure that the APB clocks are connected in the design and that the frequency of each connected clock matches the IP setting. Review the timing report to make sure that the design has closed timing with no violations. Use a local low-frequency clock for the HBM debug hub. If both HBM stacks are used, set false paths between the two APB clocks:

set_false_path -from [get_clocks *APB_0_PCLK] -to [get_clocks *APB_1_PCLK]
set_false_path -from [get_clocks *APB_1_PCLK] -to [get_clocks *APB_0_PCLK]
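To confirm that the APB clocks are defined as expected and that timing has closed, commands along the following lines could be run in the Tcl console with the implemented design open. This is a sketch only: the clock name pattern is taken from the constraints above, and the report file name is a hypothetical placeholder.

```tcl
# List the clocks that match the APB naming pattern used in the constraints above.
report_clocks [get_clocks *APB_*_PCLK]

# Print the worst setup slack in the design; a negative value means timing has not closed.
set worst_path [get_timing_paths -setup -max_paths 1 -nworst 1]
puts "Worst setup slack: [get_property SLACK $worst_path] ns"

# Write the full timing summary to a file for review (hypothetical file name).
report_timing_summary -file hbm_timing_summary.rpt

# Show how each pair of clocks is constrained, including the false paths between the APB clocks.
report_clock_interaction
```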

Check the status of each memory controller in the Vivado Hardware Manager by expanding the HBM Core Properties for each MC, and note which ones report CTRLR_READY and CTRLR_INIT_DONE as asserted. If decreasing the frequency, disabling a stack, or disabling more memory controllers causes more of them to pass, it is likely that there is a power issue on the board.
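The same status information can also be inspected from the Tcl console. The following sketch assumes that a hardware target is open and the device has been programmed. Because property names can differ between Vivado versions, it dumps every property of each HBM object rather than querying specific names.

```tcl
# Print all properties of every HBM object detected by the Hardware Manager.
foreach hbm [get_hw_hbms] {
    puts "==== $hbm ===="
    report_property $hbm
}
```

Comparing this output after changing the frequency or the number of enabled memory controllers makes it easier to see which controllers pass or fail.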

If the results are inconsistent from power cycle to power cycle, and the power supply on the board is adequate, it is likely that there is a timing issue on the debug hub or APB paths.

Review the HBM sections of the UltraScale+™ Xilinx® Power Estimator (XPE) and verify the power solution on the board can deliver adequate power for the targeted workloads. See the XPE web page for more information.

Review the Virtex UltraScale+ FPGA Data Sheet: DC and AC Switching Characteristics (DS923) to make sure the HBM power rails are within specification.

If ECC is enabled with the ECC Initialization option, the memory takes a few additional seconds to initialize after the interface has completed calibration. Because the Hardware Manager is already querying the HBM for its configuration status during this time, a Config Error might be reported. Right-click the HBM and select Refresh; if this is the cause, the Config Error is replaced with Complete.
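The refresh can also be issued from the Tcl console. The following is a minimal sketch that assumes a hardware target is open and that a few seconds have passed since configuration, so that ECC initialization has had time to finish.

```tcl
# Re-read the status of every HBM object after ECC initialization has completed.
foreach hbm [get_hw_hbms] {
    refresh_hw_hbm $hbm
}
```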

The apb_complete_x signal never asserts
If apb_complete_x never asserts, check the status in the Vivado Hardware Manager. This could be a timing, clocking, power, or configuration issue depending on the status and behavior.

The apb_complete_x signal asserts, but Vivado Hardware Manager reports Config Error or Not Enabled
If the apb_complete_x signals assert in the design when monitored by other logic or an ILA, the HBM stacks have completed calibration and are ready for traffic, but there is an issue with the debug hub connection. This could be either a clocking issue or a timing constraint issue. In most circumstances it is appropriate to clock the debug hub from the APB0 clock if routing is not a challenge. If routing is already congested, try generating another local clock that is used only for the debug hub.
Note: This error might also be caused by a timing issue with the user logic in the design. It cannot be guaranteed that the user logic connected to the AXI ports is operating correctly while the HBM stacks report a ready status.