Addressing Floorplanning Impact on Performance - 2021.2 English

Versal ACAP System Integration and Validation Methodology Guide (UG1388)

Document ID
UG1388
Release Date
2021-11-19
Version
2021.2 English

Before you begin floorplanning, ensure that key resources are available on your Versal device. These resources can include the AI Engine array interface (or shim tile), block RAM, UltraRAM, DSP, NoC, and DDRMC.

The following figure shows the XCVC1902 in the Vivado IDE Device window with key resources highlighted. The figure also shows how the fabric memory resources, such as block RAM and UltraRAM, are distributed across the device. Aligning the floorplan with the connectivity of the input and output streams of logic is crucial to achieving better system performance. Following are some examples of input and output streams of logic:

  • AI Engine-PL-NoC-DDRMC
  • AI Engine-PL
  • PS-NoC-DDRMC
Figure 1. Versal ACAP Key Resources

The following figure shows an example of floorplanning the AI Engine and PL. In this design, there is AI Engine-PL dataflow, and the PL does not use UltraRAM resources. If only block RAM resources are used as part of the dataflow interacting with the AI Engine array interface, the AI Engine streams can be floorplanned in a region where the PL is mapped to a portion of the device without UltraRAM columns.

Figure 2. AI Engine and PL Floorplanning

Designs can use GMIOs in the AI Engine to read from and write to the DDRMC through the NoC. Only specific columns in the AI Engine array support GMIOs. The following figure shows an XCVC1902 in the Device window with 16 GMIO-capable columns. Following are considerations for accessing GMIO-capable columns:

  • Supply proper QoS requirements, allowing the NoC compiler to maximize bandwidth allocated to the paths.
  • Allow the aiecompiler to select GMIO columns and the NoC compiler to select the appropriate DDRMC location based on QoS settings.
  • If needed, constrain GMIOs to appropriate columns above the vertical NoCs (VNoCs), and also constrain DDRMC below the VNoCs to minimize latency through the NoC.
Figure 3. GMIO-Capable Columns

Following is an example of how to floorplan a design with AI Engine and PL kernels:

  • AI Engine array interface Pblock (JSON file)

    Determine the width for the DDRMCs and create a <constraints>.json file to be passed on to the aiecompiler.

  • Pblock (XDC file)

    Determine the size of the VNoCs and use the standard Vivado Design Suite XDC-based Pblock approach. For more information, see the Vivado Design Suite Properties Reference Guide (UG912) and UltraFast Design Methodology Guide for Xilinx FPGAs and SoCs (UG949).
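The XDC-based Pblock approach mentioned above can be sketched as a small Python helper that emits the constraint lines. This is a minimal sketch, not a recommended floorplan: the Pblock name, the cell pattern, and the clock region range are hypothetical placeholders that must be replaced with values from your own design and device. The emitted commands (create_pblock, resize_pblock, add_cells_to_pblock) are the standard Vivado Pblock commands.

```python
def pblock_xdc(name, cell_patterns, region):
    """Return XDC text that creates a Pblock, sizes it, and assigns cells."""
    lines = [
        f"create_pblock {name}",
        # Size the Pblock; the region string is passed through unchanged.
        f"resize_pblock [get_pblocks {name}] -add {{{region}}}",
    ]
    for pattern in cell_patterns:
        # Assign every cell that matches the pattern to the Pblock.
        lines.append(
            f"add_cells_to_pblock [get_pblocks {name}] [get_cells {{{pattern}}}]"
        )
    return "\n".join(lines) + "\n"


# Hypothetical values for illustration only; they are not taken from the guide.
xdc_text = pblock_xdc(
    name="pblock_pl_kernels",
    cell_patterns=["design_i/rx_kernel_*"],
    region="CLOCKREGION_X2Y1:CLOCKREGION_X4Y2",
)
print(xdc_text)
```

The printed text can be saved to an XDC file and added to the Vivado project as a constraints source.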

Following is an example snippet of a <constraints>.json with the Pblock specified. This file is given as an argument to the AI Engine compiler with the --constraints option.

{
	"GlobalConstraints": {
		"areaGroup": {
			"name": "unique_area_group_name",
			"exclude": true/false,
			"nodeGroup": ["rx.*"],
			"tileGroup": ["(col,row):(col,row)"],
			"shimGroup": ["col:col"]
		}
	}
}

In this example:

nodeGroup
Lists all the kernels and PLIOs to assign to the Pblock. The pattern rx.* covers all instances whose names start with rx.
tileGroup
Gives the Pblock range on the AI Engine array for placing the kernels.
shimGroup
Gives the Pblock range on the AI Engine array interface for placing the array interface instance, which can be PLIOs.
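
The constraints file described above can also be generated from a script, which guarantees well-formed JSON. The following is a minimal sketch under stated assumptions: the key names mirror the snippet above (with the top-level key spelled GlobalConstraints), while the area group name, tile range, and shim range are hypothetical values chosen only for illustration.

```python
import json


def area_group(name, node_patterns, tile_ranges, shim_ranges, exclude=False):
    """Build the areaGroup constraint structure as a plain dictionary."""
    return {
        "GlobalConstraints": {
            "areaGroup": {
                "name": name,
                "exclude": exclude,
                "nodeGroup": node_patterns,  # kernels and PLIOs in the Pblock
                "tileGroup": tile_ranges,    # "(col,row):(col,row)" ranges
                "shimGroup": shim_ranges,    # "col:col" ranges
            }
        }
    }


# Hypothetical ranges for illustration only; choose values for your device.
constraints = area_group(
    name="rx_area_group",
    node_patterns=["rx.*"],
    tile_ranges=["(6,0):(9,3)"],
    shim_ranges=["6:9"],
)
constraints_json = json.dumps(constraints, indent=2)
print(constraints_json)
```

Save the printed text to a .json file and pass that file to the AI Engine compiler with the --constraints option.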

Following are additional considerations:

  • If there is any data flow from the AI Engine to the PL, look at the DSP, block RAM, and UltraRAM columns, and use the AI Engine channels, which allow you to maximize the resources.

    For example, a PL module that does not use DSPs does not need the AI Engine channels that are closest to a DSP column.

  • Make sure to insert the pipeline stages on the macro based on the way the stages are distributed across the device.

    For example, a design commonly sends data through AI Engine-PL channels and stores it in memory, but there might not be enough memories aligned to those channels. In that case, pipeline stages are needed on the paths to memories that are farther away.

  • Align the floorplan based on the connectivity of the input stream of logic and output stream of logic.

    For example, align the floorplan to PL-NoC-DDR for the output stream and to DDR-NoC-PL-AI Engine for the input stream.