Choosing a Programmable Logic Interface

Zynq UltraScale+ Device Technical Reference Manual (UG1085)

Document ID: UG1085
Release Date: 2022-09-15
Revision: 2.3 English

This section discusses the options for connecting programmable logic (PL) to the processing system (PS). Table: PL Interface Comparison gives a qualitative overview of the data transfer use cases, followed by a detailed discussion of selected use cases.

Table 35-5: PL Interface Comparison

| Method | Benefits | Considerations | Application |
|---|---|---|---|
| APU/RPU Programmed I/O | Simple software. Least PL resources. Simple PL slaves. | Low bandwidth demand. | Control functions. |
| FPD DMA, LPD DMA | Least PL resources. Multiple channels. Simple PL slaves. Coherency (LPD DMA only). | FPD DMA is not coherent. LPD DMA is optionally coherent. | FPD DMA for data movement between PS-DDR and PL. LPD DMA for data movement between OCM and PL and safety use-cases. |
| S_AXI_HPC{0,1}_FPD DMA | High throughput. Multiple interfaces. AXI FIFO interface with QoS-400 traffic shaping. Hardware assisted coherency; no cache flush/invalidate in software driver. Virtualization support with SMMU in path. | More complex PL master design. PL design to drive AxCACHE as needed for coherency. Impacts the CCI and degrades APU and other masters accessing memory via the CCI. | Coherent, high-performance DMA for large datasets. |
| S_AXI_HP{0:3}_FPD DMA | High throughput. Multiple interfaces. AXI FIFO interface with QoS-400 traffic shaping. Virtualization support with SMMU in path. | Software driver to handle cache flush/invalidate. More complex PL master design. | Non-coherent, high-performance DMA for large datasets. |
| S_AXI_ACP_FPD DMA | Lowest latency to L2 cache. I/O (one-way) cache coherency. Option to allocate into L2 cache. | Limited to 16B and 64B transactions, which constrains the PL DMA design. Shares APU MPCore interconnect bandwidth. More complex PL master design. | PL logic tightly coupled with APU. Medium granularity CPU offload. |
| S_AXI_ACE_FPD DMA | Optional cache coherency. APU can snoop into PL cached masters (two-way coherency). | Burst length limited to 64B when CCI snoops PL master. For ACE-Lite, long bursts from PL to PS may hang the APU MPCore due to the direct path from CCI to DDR memory, impacting others waiting for memory. Complex PL design that requires support for ACE. | Cached accelerators in PL. System cache in PL using block RAM. |
| S_AXI_LPD DMA | Fastest, low latency path to the OCM and TCM. Optional CCI coherency. SMMU in datapath provides option for virtualization. PL access to LPD when FPD is powered off. | | Safety applications. |
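As a rough illustration of how the qualitative guidance in Table 35-5 might drive a first-pass choice, the following Python sketch maps a handful of requirements to a method. The `Requirement` fields, the `suggest_interface` function, and the priority order of the checks are assumptions made for this example and are not part of UG1085. A real design also has to weigh the Considerations column.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirement:
    """Hypothetical requirement flags; none of these names come from UG1085."""
    pl_is_master: bool = False         # PL implements its own AXI master / DMA engine
    bulk_data: bool = False            # more than occasional register accesses
    coherent: bool = False             # hardware-managed coherency is wanted
    tightly_coupled_apu: bool = False  # lowest latency to L2, CPU offload
    pl_has_cache: bool = False         # PL master keeps a cache the APU must snoop
    targets_ocm_or_tcm: bool = False   # data in OCM/TCM, or FPD may be powered off


def suggest_interface(req: Requirement) -> str:
    """Return the Table 35-5 method that matches the flags (assumed priority order)."""
    if not req.pl_is_master:
        # Simple PL slaves: the PS (CPU or PS DMA) moves the data.
        if not req.bulk_data:
            return "APU/RPU Programmed I/O"
        # Per the table, only the LPD DMA offers (optional) coherency.
        if req.targets_ocm_or_tcm or req.coherent:
            return "LPD DMA"
        return "FPD DMA"
    if req.pl_has_cache:
        return "S_AXI_ACE_FPD"
    if req.tightly_coupled_apu:
        return "S_AXI_ACP_FPD"
    if req.targets_ocm_or_tcm:
        return "S_AXI_LPD"
    # High-throughput PL master: coherent traffic goes through HPC, otherwise HP.
    return "S_AXI_HPC{0,1}_FPD" if req.coherent else "S_AXI_HP{0:3}_FPD"


if __name__ == "__main__":
    print(suggest_interface(Requirement(pl_is_master=True, coherent=True)))
```

The checks for a PL master are ordered from the most specialized interface (ACE, ACP) to the general-purpose ones (HPC, HP), which is one plausible reading of the table rather than a rule stated in it.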

The data movement use-cases in Table: PL Interface Comparison are described in the following sections.
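The S_AXI_ACP_FPD entry notes that the interface is limited to 16B and 64B transactions, which shapes the PL DMA design. A minimal Python sketch of how a PL DMA might break a buffer into such transactions follows. The function name `split_for_acp` is hypothetical, and the alignment rules used here (64B transactions start on a 64B boundary, 16B transactions on a 16B boundary, no partial transactions) are assumptions for illustration; the ACP description in the manual defines the actual constraints.

```python
from typing import List, Tuple


def split_for_acp(addr: int, length: int) -> List[Tuple[int, int]]:
    """Split [addr, addr + length) into (address, size) chunks of 64 or 16 bytes.

    Assumption: only aligned 64-byte and aligned 16-byte transactions are legal,
    so both the start address and the length must be multiples of 16.
    """
    if addr < 0 or length < 0:
        raise ValueError("address and length must be non-negative")
    if addr % 16 or length % 16:
        raise ValueError("address and length must be multiples of 16 bytes")

    chunks = []
    end = addr + length
    while addr < end:
        # Prefer a full 64-byte transaction when aligned and enough data remains.
        size = 64 if addr % 64 == 0 and end - addr >= 64 else 16
        chunks.append((addr, size))
        addr += size
    return chunks


if __name__ == "__main__":
    for chunk_addr, chunk_size in split_for_acp(0x1030, 96):
        print(hex(chunk_addr), chunk_size)
```

For a 96-byte buffer starting at 0x1030, this yields one 16B transaction to reach the 64B boundary, one 64B transaction, and a trailing 16B transaction.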