Configuring input_plio/output_plio - 2023.2 English

AI Engine Kernel and Graph Programming Guide (UG1079)

Document ID: UG1079
Release Date: 2023-12-04
Version: 2023.2 English

An input_plio/output_plio object can be configured to make external stream connections that cross the AI Engine to programmable logic (PL) boundary. This situation arises when a hardware platform is designed separately and the PL blocks are already instantiated inside the platform. The hardware design is exported from the Vivado tools as an XSA package, which should be specified when creating a new project in the AMD Vitis™ tools using that platform. The XSA contains a logical architecture interface specification that identifies which AI Engine I/O ports the platform supports. The following is an example interface specification containing stream ports, as seen from the AI Engine.

Table 1. Example Logical Architecture Port Specification
AI Engine Port   Annotation   Type     Direction   Data Width   Clock Frequency (MHz)
S00_AXIS         Weight0      stream   slave       32           300
S01_AXIS         Datain0      stream   slave       32           300
M00_AXIS         Dataout0     stream   master      32           300

This interface specification describes how the platform exports two stream input ports (slave ports on the AI Engine array interface) and one stream output port (master port on the AI Engine array interface). An input_plio/output_plio attribute specification is used to represent these interface ports and connect them to their respective destination or source kernel ports in the data flow graph.
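The mapping from port direction to PLIO kind described above can be sketched in plain C++. This is only an illustration: the PortSpec struct and the plio_kind and example_ports helpers are hypothetical names introduced here and are not part of the ADF API. The rows are copied from Table 1.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical plain-data mirror of one row of the logical architecture
// table. Not an ADF type; used only to illustrate the mapping.
struct PortSpec {
    std::string aie_port;    // e.g. "S00_AXIS"
    std::string annotation;  // logical name, e.g. "Weight0"
    std::string direction;   // "slave" or "master", seen from the AI Engine
    int width_bits;          // data width in bits
    double freq_mhz;         // clock frequency in MHz
};

// A slave port on the array interface is a stream input to the AI Engine,
// so it corresponds to input_plio. A master port corresponds to output_plio.
inline std::string plio_kind(const PortSpec& p) {
    assert(p.direction == "slave" || p.direction == "master");
    return p.direction == "slave" ? "input_plio" : "output_plio";
}

// The three rows of the example specification.
inline std::vector<PortSpec> example_ports() {
    return {
        {"S00_AXIS", "Weight0",  "slave",  32, 300.0},
        {"S01_AXIS", "Datain0",  "slave",  32, 300.0},
        {"M00_AXIS", "Dataout0", "master", 32, 300.0},
    };
}
```

With these rows, Weight0 and Datain0 map to input_plio and Dataout0 maps to output_plio.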

The following example shows how the ports listed in the previous table can be declared as input_plio/output_plio attributes in a program to read input data from a file or write output data to a file. The width and frequency of each input_plio/output_plio port are also provided in the PLIO constructor.

input_plio wts  = input_plio::create("Weight0", adf::plio_32_bits, "inputwts.txt", 300);
input_plio din  = input_plio::create("Datain0", adf::plio_32_bits, "din.txt", 300);
output_plio out = output_plio::create("Dataout0", adf::plio_32_bits, "dout.txt", 300);

When simulated, the input weights and data are read from the two supplied files and the output data is produced in the designated output file in a streaming manner.

When a hardware platform is exported, all the AI Engine to PL stream connections are already routed to specific physical channels from the PL side.

Wide Stream Data Path PLIO

Typically, the AI Engine array runs at a higher clock frequency than the internal programmable logic. The aiecompiler can be given a compiler option --pl-freq to identify the frequency at which the PL blocks are expected to run. To balance the throughput between AI Engine and internal programmable logic, it is possible to design the PL blocks for a wider stream data path (64-bit, 128-bit), which is then sequentialized automatically into a 32-bit stream on the AI Engine stream network at the AI Engine to PL interface crossing.
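The throughput balance above comes down to simple arithmetic: a stream carries (width in bits) x (clock frequency) bits per unit time. The following sketch picks the narrowest PLIO width whose PL-side bandwidth matches one 32-bit AI Engine stream. The function name min_plio_width_bits is hypothetical, the clock values used with it are illustrative assumptions, and the model ignores any interface overhead.

```cpp
#include <cassert>
#include <initializer_list>

// Returns the smallest PLIO width (32, 64 or 128 bits) for which
//   width * pl_freq >= 32 * aie_freq,
// i.e. the PL side can keep up with one 32-bit AI Engine stream.
// Returns 0 if even a 128-bit path is too slow at the given PL clock.
// Simplified model: no protocol or interface overhead is considered.
inline int min_plio_width_bits(double aie_freq_mhz, double pl_freq_mhz) {
    assert(aie_freq_mhz > 0.0 && pl_freq_mhz > 0.0);
    const double aie_mbit_per_s = 32.0 * aie_freq_mhz;
    for (int width : {32, 64, 128}) {
        if (width * pl_freq_mhz >= aie_mbit_per_s) {
            return width;
        }
    }
    return 0;
}
```

For instance, assuming an AI Engine clock of 1000 MHz and a PL clock of 250 MHz, the function returns 128, because 128 x 250 equals 32 x 1000.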

The following example shows how wide stream input_plio/output_plio attributes can be used in a program to read input data from a file or write output data to a file.

output_plio pl_out = output_plio::create("TestLogicalNameOut", plio_128_bits, "data/output.txt");
input_plio  pl_in  = input_plio::create("TestLogicalNameIn", plio_128_bits, "data/input.txt");
...
connect(pl_in.out[0], kernel_first.in[0]);
connect(kernel_last.out[0], pl_out.in[0]);

In the previous example, two 128-bit PLIO attributes are declared: one for input and one for output. The input_plio and output_plio are then connected to the graph in the usual way. The data files specified in the input_plio/output_plio attributes are automatically opened for reading the input or writing the output, respectively.

When simulating input_plio/output_plio with data files, organize the data to match both the width of the PL interface and the data type of the connected port on the AI Engine kernel. For example, a data file representing a 32-bit PL interface to an AI Engine kernel expecting int16 should have two columns per row, where each column is a 16-bit value. A data file representing a 64-bit PL interface to an AI Engine kernel expecting cint16 should have four columns per row, where each column is a 16-bit real or imaginary value. The same 64-bit PL interface feeding an AI Engine kernel with an int32 port needs two columns per row, each a 32-bit real value. The following examples show the input file format for the two 64-bit scenarios.

64-bit PL interface feeding AI Engine kernel expecting cint16
input file:
0 0 0 0
1 1 1 1
2 2 2 2

64-bit PL interface feeding AI Engine kernel expecting int32
input file:
0 0
1 1
2 2
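The row layouts above follow one rule: the number of columns per row is the PLIO width divided by the width of one scalar value, where a cint16 contributes two 16-bit scalars (real and imaginary). A minimal sketch of a formatter based on that rule follows. The helper names columns_per_row and format_plio_rows are hypothetical and are not part of the AI Engine tools.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Columns per row = PLIO width / scalar width.
// For cint16, pass 16: real and imaginary parts are separate columns.
inline int columns_per_row(int plio_bits, int scalar_bits) {
    assert(scalar_bits > 0 && plio_bits % scalar_bits == 0);
    return plio_bits / scalar_bits;
}

// Formats a flat list of scalar values as text rows, one row per PLIO beat.
// A final partial row, if any, is emitted as is.
inline std::string format_plio_rows(const std::vector<std::int32_t>& values,
                                    int plio_bits, int scalar_bits) {
    const std::size_t cols =
        static_cast<std::size_t>(columns_per_row(plio_bits, scalar_bits));
    std::ostringstream out;
    for (std::size_t i = 0; i < values.size(); ++i) {
        out << values[i];
        const bool end_of_row =
            ((i + 1) % cols == 0) || (i + 1 == values.size());
        out << (end_of_row ? '\n' : ' ');
    }
    return out.str();
}
```

Formatting the values 0 0 0 0 1 1 1 1 for a 64-bit interface with 16-bit scalars yields the first two rows of the cint16 example. The same call with 32-bit scalars produces two columns per row.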

With these wide PLIO attribute specifications, the aiecompiler automatically generates the AI Engine array interface configuration to convert 64-bit or 128-bit data into a sequence of 32-bit words. The AXI4-Stream protocol followed by all PL IP blocks ensures that partial data can also be sent on a wider data path, with the appropriate strobe signals indicating which words are valid.
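The width conversion can be pictured as splitting each wide beat into consecutive 32-bit words. The sketch below is a software illustration only, limited to beats of up to 64 bits. The function name split_beat is hypothetical, and the word order (least-significant word first) is an assumption made for this example, not something stated in this section.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Splits one beat of width_bits (32 or 64) into 32-bit words.
// ASSUMPTION: the least-significant 32-bit word is emitted first.
// The actual conversion is configured in hardware by the aiecompiler.
inline std::vector<std::uint32_t> split_beat(std::uint64_t beat,
                                             int width_bits) {
    assert(width_bits == 32 || width_bits == 64);
    std::vector<std::uint32_t> words;
    for (int shift = 0; shift < width_bits; shift += 32) {
        // The cast keeps only the low 32 bits of the shifted value.
        words.push_back(static_cast<std::uint32_t>(beat >> shift));
    }
    return words;
}
```

Under that ordering assumption, the 64-bit beat 0x1122334455667788 becomes the words 0x55667788 and 0x11223344.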