Data Bandwidth and Performance Tuning - 3.0 English

Versal ACAP CPM DMA and Bridge Mode for PCI Express Product Guide (PG347)

Document ID: PG347
Release Date: 2022-06-15
Version: 3.0 English

The CPM provides several primary data interfaces; which ones are available depends on the CPM subsystem functional mode in use. The following table lists the data interfaces that can serve as the primary data transfer interface in each functional mode.

Table 1. Available Data Interface for each CPM Subsystem Functional Mode
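Functional Mode | CPM_PCIE_NOC_0/1 | NOC_CPM_PCIE_0 | CPM_PL_AXI_0/1 | AXI4 ST C2H/H2C
----------------|------------------|----------------|----------------|----------------
CPM4 QDMA       | Yes (both)       | No             | N/A            | Yes
CPM5 QDMA       | Yes (both)       | No             | Yes (both)     | Yes
CPM4 AXI Bridge | Yes (only one)   | Yes            | N/A            | No
CPM5 AXI Bridge | Yes (only one)   | Yes            | Yes (only one) | No
CPM4 XDMA       | Yes              | No             | N/A            | Yes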
Note: Certain data interfaces may be unavailable depending on the feature set selected for a particular functional mode. For details on these restrictions, refer to the port descriptions in the associated CPM subsystem section.
Note: Some data interfaces are shared by more than one feature set. Therefore, even when a particular mode does not use certain data interfaces, those interfaces may still be enabled and visible at the CPM boundary for other uses.
The raw capacity of each AXI4 data interface is the product of its data width and its clock frequency. The net bandwidth depends on the packet overhead incurred for the packet type being transferred.
  • CPM_PCIE_NOC and NOC_CPM_PCIE: Fixed 128-bit width at the CPM_TOPSW_CLK frequency. The maximum frequency depends on the device speed grade.
  • CPM_PL_AXI: Configurable 64/128/256/512-bit width at the cpm_pl_axi0_clk or cpm_pl_axi1_clk pin frequency (up to 250 MHz).
  • AXI4 ST C2H/H2C: Configurable 64/128/256/512-bit width at the dma_intrfc_clk pin frequency (up to 250 MHz).
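
The width-times-frequency rule can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name is hypothetical, and the result is raw capacity before any packet overhead. For CPM_PCIE_NOC and NOC_CPM_PCIE, pass 128 bits and whatever CPM_TOPSW_CLK frequency applies to the device speed grade (no value is assumed here).

```python
def axi_raw_capacity_gbps(data_width_bits: int, clock_mhz: float) -> float:
    """Raw capacity in Gb/s: data width (bits) multiplied by clock frequency.

    Hypothetical helper for illustration; it does not account for packet
    overhead, so the net bandwidth will be lower.
    """
    return data_width_bits * clock_mhz * 1e6 / 1e9


# CPM_PL_AXI or AXI4 ST C2H/H2C at the 250 MHz ceiling quoted above.
widest = axi_raw_capacity_gbps(512, 250.0)    # 128.0 Gb/s (16 GB/s)
narrowest = axi_raw_capacity_gbps(64, 250.0)  # 16.0 Gb/s (2 GB/s)
```
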

The raw capacity of a PCIe link is the product of the number of PCIe lanes (x1/x2/x4/x8/x16) and the per-lane link speed (Gen1/Gen2/Gen3/Gen4/Gen5). Link overhead comes from link layer encoding and Ordered Sets, CRC fields, packet framing, TLP headers and prefixes, and data bus alignment.
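
As a rough illustration of the lanes-times-speed calculation, the sketch below uses the per-lane transfer rates and line encodings defined by the PCIe specifications (8b/10b for Gen1 and Gen2, 128b/130b for Gen3 through Gen5). The function and table names are hypothetical. Only the encoding overhead is modeled; Ordered Sets, CRC, framing, TLP headers, and bus alignment reduce the usable bandwidth further and are not estimated here.

```python
# Per-lane transfer rate (GT/s) and line-encoding efficiency per generation.
PCIE_GEN = {
    1: (2.5, 8 / 10),
    2: (5.0, 8 / 10),
    3: (8.0, 128 / 130),
    4: (16.0, 128 / 130),
    5: (32.0, 128 / 130),
}
VALID_LANES = (1, 2, 4, 8, 16)


def pcie_raw_gtps(gen: int, lanes: int) -> float:
    """Raw link rate in GT/s: lanes multiplied by the per-lane rate."""
    if lanes not in VALID_LANES:
        raise ValueError(f"unsupported lane count: {lanes}")
    rate, _ = PCIE_GEN[gen]
    return rate * lanes


def pcie_after_encoding_gbps(gen: int, lanes: int) -> float:
    """Link rate in Gb/s after removing line-encoding overhead only."""
    _, efficiency = PCIE_GEN[gen]
    return pcie_raw_gtps(gen, lanes) * efficiency


gen4_x16_raw = pcie_raw_gtps(4, 16)                 # 256.0 GT/s
gen4_x16_encoded = pcie_after_encoding_gbps(4, 16)  # roughly 252 Gb/s
```
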

If a particular PCIe link configuration has more bandwidth than the data bus capacity of a single AXI4 interface, more than one AXI4 interface must be used to sustain the maximum link throughput. This can be achieved in several ways. Some examples and considerations:
  • Load balance data transfers by allocating half of the enabled DMA queues or DMA channels to interface #0 and the other half to interface #1.
  • Share the available PCIe link bandwidth among different types of transfers. DMA streaming uses the AXI4 ST C2H/H2C interface, while DMA Memory Mapped uses the CPM_PCIE_NOC or CPM_PL_AXI interfaces.
  • AXI Bridge functional mode cannot use more than one interface, so on its own it may not sustain the full PCIe link bandwidth in some link and device configurations. Therefore, this functional mode shall be restricted to control and status accesses only and is not intended to be used as a primary data mover. It can, however, be paired with one of the DMA functional modes to make use of the remaining available link bandwidth.
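
The sizing and load-balancing ideas above can be sketched as a small planning aid. Everything here is an assumption for illustration: the function names are hypothetical, the bandwidth figures are example inputs (a link of about 252 Gb/s against a 512-bit interface at 250 MHz, or 128 Gb/s raw), and the round-robin split is just one way to give each interface half of the queues. How queues are actually bound to an interface in hardware is outside the scope of this sketch.

```python
import math


def interfaces_needed(link_gbps: float, interface_gbps: float) -> int:
    """Minimum number of equal-capacity interfaces to cover the link rate."""
    return math.ceil(link_gbps / interface_gbps)


def split_queues(queue_ids, num_interfaces: int = 2):
    """Distribute queue IDs evenly across interfaces in round-robin order.

    Returns one list of queue IDs per interface; with two interfaces each
    list holds half of the queues (give or take one for odd counts).
    """
    groups = [[] for _ in range(num_interfaces)]
    for index, queue_id in enumerate(queue_ids):
        groups[index % num_interfaces].append(queue_id)
    return groups


# Example inputs only: raw figures, before packet overhead on either side.
count = interfaces_needed(link_gbps=252.0, interface_gbps=128.0)
plan = split_queues(range(8), num_interfaces=count)
```

With these example inputs, two interfaces are required and the eight queues are divided into even-numbered and odd-numbered groups.
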

Users must also analyze the potential for head-of-line blocking and the request and response buffer sizes of each interface, and ensure that data transfers initiated within a system do not cause cyclic dependencies between interfaces or between different transfers. The PCIe and AXI specifications define data types, IDs, and request/response ordering requirements, and the CPM upholds those requirements. For the CPM_PCIE_NOC and NOC_CPM_PCIE interfaces, refer to Versal ACAP Programmable Network on Chip and Integrated Memory Controller LogiCORE IP Product Guide (PG313) for more details. The CPM_PL_AXI_0/1 and AXI4 ST C2H/H2C interfaces connect directly to the user PL region, which gives users the flexibility to attach their own data buffers and interconnect as required.
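
One way to reason about cyclic dependencies during system planning is to write down which transfers wait on which others and check that graph for cycles. The sketch below is a generic depth-first search, not a CPM tool: the function name and the transfer labels are hypothetical, and building an accurate dependency map for a real design remains the user's responsibility.

```python
def has_cyclic_dependency(depends_on: dict) -> bool:
    """Return True if the dependency map contains a cycle.

    depends_on maps each transfer (or interface) name to the names it must
    wait for. A standard three-state depth-first search is used.
    """
    UNVISITED, IN_PROGRESS, DONE = 0, 1, 2
    state = {}

    def visit(node) -> bool:
        state[node] = IN_PROGRESS
        for successor in depends_on.get(node, ()):
            status = state.get(successor, UNVISITED)
            if status == IN_PROGRESS:
                return True  # reached a node still on the current path
            if status == UNVISITED and visit(successor):
                return True
        state[node] = DONE
        return False

    return any(
        state.get(node, UNVISITED) == UNVISITED and visit(node)
        for node in depends_on
    )


# Hypothetical labels: A waits on B, and B waits on A, which is a cycle.
blocked = has_cyclic_dependency({"transfer_a": ["transfer_b"],
                                 "transfer_b": ["transfer_a"]})
safe = has_cyclic_dependency({"transfer_a": ["transfer_b"],
                              "transfer_b": []})
```
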