1. Software sets up the SRC-Q in host memory with a buffer address pointing to the AXI domain and the appropriate buffer size.
2. Software sets up the DST-Q in host memory with a buffer address in the host and the appropriate buffer size.
3. Software sets up the STAS-Q and STAD-Q in host memory.
4. When the DMA is enabled, it fetches the SRC and DST elements over PCIe.
5. The DMA fetches the buffer pointed to by the SRC elements over AXI (through AXI read transactions) and writes it to the address provided in the DST-Q in host memory (upstream memory write).
6. On completion of the DMA transfer (on encountering EOP), the STAS-Q and STAD-Q are updated in host memory.
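The six steps above can be sketched as a small software model. This is an illustration only, not the hardware descriptor format: the `QueueElement` and `StatusElement` layouts, the `run_axi_to_pcie_dma` function, and the rule that each packet lands in exactly one DST buffer are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class QueueElement:
    """Hypothetical SRC-Q/DST-Q element: a buffer address and a size."""
    addr: int
    size: int
    eop: bool = False  # end-of-packet flag, only meaningful on SRC elements


@dataclass
class StatusElement:
    """Hypothetical STAS-Q/STAD-Q element posted when a packet completes."""
    completed: bool
    byte_count: int


def run_axi_to_pcie_dma(axi_mem, host_mem, src_q, dst_q):
    """Model steps 4 to 6: walk the queues, copy AXI -> host, post status."""
    stas_q, stad_q = [], []
    dst_index = 0   # which DST element is being filled
    written = 0     # bytes already written into that DST buffer
    for src in src_q:
        if dst_index >= len(dst_q):
            raise ValueError("DST-Q exhausted before SRC-Q")
        dst = dst_q[dst_index]
        if src.addr + src.size > len(axi_mem):
            raise ValueError("SRC buffer lies outside AXI memory")
        if written + src.size > dst.size:
            raise ValueError("DST buffer too small for packet")
        start = dst.addr + written
        if start + src.size > len(host_mem):
            raise ValueError("DST buffer lies outside host memory")
        # Step 5: AXI read followed by an upstream write into host memory.
        host_mem[start:start + src.size] = axi_mem[src.addr:src.addr + src.size]
        written += src.size
        if src.eop:
            # Step 6: on EOP, both status queues are updated.
            stas_q.append(StatusElement(True, written))
            stad_q.append(StatusElement(True, written))
            dst_index += 1
            written = 0
    return stas_q, stad_q


# Steps 1 to 3: software prepares the queues (here, plain Python lists).
axi_mem = bytearray(b"HELLO-PCIE-DMA!!")
host_mem = bytearray(32)
src_q = [QueueElement(0, 6), QueueElement(6, 4, eop=True)]
dst_q = [QueueElement(8, 16)]
stas_q, stad_q = run_axi_to_pcie_dma(axi_mem, host_mem, src_q, dst_q)
print(bytes(host_mem[8:18]), stas_q[0].byte_count)  # b'HELLO-PCIE' 10
```

In this model, two SRC elements form one packet, and the status entries are posted only when the element carrying the EOP flag has been copied.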
Each DMA channel provides scratchpad and doorbell registers. The doorbell register is useful for raising interrupts from the PCIe domain to the AXI domain, because PCIe does not support downstream interrupts. The scratchpad registers can be used to pass information when a host CPU interrupts an AXI CPU.
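One way to picture the doorbell and scratchpad pairing is as a simple mailbox: the host places a message in the scratchpad registers and then rings the doorbell, and the AXI-side handler reads the message back. The sketch below is a software model under assumptions. The `DmaChannelMailbox` class, the count of four scratchpad registers, and the 32-bit register width are hypothetical and are not taken from the register map.

```python
class DmaChannelMailbox:
    """Hypothetical model of one channel's scratchpad and doorbell registers."""

    def __init__(self, num_scratchpads=4, on_doorbell=None):
        self.scratchpad = [0] * num_scratchpads  # assumed register count
        self._on_doorbell = on_doorbell          # stands in for the AXI-side ISR

    def host_send(self, words):
        """Host side: write the payload first, then ring the doorbell."""
        if len(words) > len(self.scratchpad):
            raise ValueError("message does not fit in the scratchpad registers")
        for i, word in enumerate(words):
            self.scratchpad[i] = word & 0xFFFFFFFF  # assumed 32-bit registers
        if self._on_doorbell is not None:
            self._on_doorbell(self)  # doorbell write raises the AXI interrupt


received = []


def axi_cpu_handler(mailbox):
    """AXI side: on the doorbell interrupt, read the scratchpad contents."""
    received.append(list(mailbox.scratchpad))


mailbox = DmaChannelMailbox(on_doorbell=axi_cpu_handler)
mailbox.host_send([0xCAFE, 0x10])
print(received)
```

Writing the payload before ringing the doorbell ensures the handler never observes a partially written message in this model.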
Xilinx strongly recommends using the integrated DMA controller in the PS PCIe to exercise PCIe traffic.
Deadlock situations can occur when the PS PCIe shares the path between the CCI and the FPD Main Switch with an external master also targeting the PS PCIe interface. Refer to This Figure to see this path.
When the Zynq UltraScale+ MPSoC is used as an Endpoint, an external DMA such as the FPD DMA or a PL DMA IP connected to S_AXI_HP[0:3]_FPD can be used to exercise PCIe traffic. This is because these masters do not route traffic through the CCI to the FPD.
Do not use PL DMA IP connected to S_AXI_HPC[0:1], S_AXI_LPD, or any other PS masters like LPD DMA to exercise PCIe traffic because these masters use the shared path between the CCI and the FPD Main Switch.
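The Endpoint guidance above can be captured as a lookup that a design-review script might apply to a list of masters. This is a minimal sketch: the function name `endpoint_master_check`, the result strings, and the expanded port names (for example `S_AXI_HP0_FPD` for S_AXI_HP[0:3]_FPD, and `S_AXI_HPC0` for S_AXI_HPC[0:1]) are assumptions chosen for illustration.

```python
# Masters that do not use the shared CCI-to-FPD-Main-Switch path.
SAFE_ENDPOINT_MASTERS = {"FPD_DMA"} | {f"S_AXI_HP{i}_FPD" for i in range(4)}

# Masters named in the text as using the shared path (deadlock risk).
UNSAFE_ENDPOINT_MASTERS = {"S_AXI_HPC0", "S_AXI_HPC1", "S_AXI_LPD", "LPD_DMA"}


def endpoint_master_check(master):
    """Classify a master for PCIe traffic when the device is an Endpoint."""
    if master in SAFE_ENDPOINT_MASTERS:
        return "ok"
    if master in UNSAFE_ENDPOINT_MASTERS:
        return "deadlock-risk"
    # Masters not listed here are not classified by this sketch.
    return "unknown"


for name in ("S_AXI_HP2_FPD", "S_AXI_HPC0", "LPD_DMA", "FPD_DMA"):
    print(name, endpoint_master_check(name))
```

The "unknown" result is deliberate. The text treats other PS masters as unsafe, so an unlisted master should be reviewed against the interconnect figure before it is used.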
When the Zynq UltraScale+ MPSoC is used as a Root Port, Xilinx recommends that PCIe link partners (Endpoints) access only the PS-DDR. They should not access any other memory on the Root Port, such as OCM or PL memory. Xilinx also recommends that the GPU (if enabled) not access the programmable logic, because the shared path between the CCI and the FPD Main Switch can result in a deadlock situation.
For more information, see Xilinx Answer 72341.