A PLIO object can be configured to make external stream connections that cross the AI Engine to programmable logic (PL) boundary. This
situation arises when a hardware platform is designed separately and the PL blocks
are already instantiated inside the platform. This hardware design is exported from
the Vivado tools as an XSA package, which should
be specified when creating a new project in the
Vitis tools using that platform. The XSA contains a logical
architecture interface specification that identifies which AI Engine I/O ports the platform supports. The following is
an example interface specification containing stream ports (as seen from the
AI Engine perspective).
| AI Engine Port | Annotation | Type | Direction | Data Width | Clock Frequency (MHz) |
This interface specification describes how the platform
exports two stream input ports (slave ports on the AI Engine array interface) and one stream output port (a master port on
the AI Engine array interface). An
input_plio or output_plio attribute specification is used to represent and connect
these interface ports to their respective destination or source kernel ports in the data
flow graph.
The following example shows how the
input_plio and output_plio attributes corresponding to the
previous table can be used in a program to read input data from a file or write
output data to a file. The width and frequency of each
PLIO port are also provided
in the PLIO constructor. The constructor syntax is described in more detail in Adaptive Data Flow Graph Specification Reference.
input_plio wts = input_plio::create("Weight0", adf::plio_32_bits, "inputwts.txt", 300);
input_plio din = input_plio::create("Datain0", adf::plio_32_bits, "din.txt", 300);
output_plio out = output_plio::create("Dataout0", adf::plio_32_bits, "dout.txt", 300);
During compilation, the logical architecture should be
specified using the option
--logical-arch=<filename>. This option is automatically
populated by the Vitis tools. During simulation,
the input weights and data are read from the two supplied files, and the output data
is written to the designated output file in a streaming manner.
When a hardware platform is exported, all the AI Engine to PL stream connections are already routed to specific physical channels from the PL side.
Wide Stream Data Path PLIO
Typically, the AI Engine array runs at a
higher clock frequency than the internal programmable logic. The AI Engine compiler accepts the option
--pl-freq to identify the frequency at which
the PL blocks are expected to run. To balance the throughput between the AI Engine and the internal programmable logic, it is
possible to design the PL blocks for a wider stream data path (64-bit or 128-bit),
which is then serialized automatically into a 32-bit stream on the AI Engine stream network at the AI Engine to PL interface crossing.
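As a rough illustration of the balancing argument above, the following sketch picks the narrowest PL stream width whose bandwidth at the PL clock is at least that of a 32-bit AI Engine stream at the AI Engine clock. The helper name min_plio_width_bits, the candidate widths (32, 64, 128), and any clock values passed to it are assumptions made for illustration; they are not produced by or taken from the tools.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of any tool or library): returns the smallest
// PL stream width in bits whose bandwidth at pl_mhz is at least that of a
// 32-bit AI Engine stream clocked at aie_mhz.
// Returns 0 when even a 128-bit path cannot keep up.
inline unsigned min_plio_width_bits(std::uint64_t aie_mhz, std::uint64_t pl_mhz) {
    assert(aie_mhz > 0 && pl_mhz > 0);
    const std::uint64_t needed = 32 * aie_mhz;  // Mbit/s on the AI Engine side
    const unsigned widths[] = {32, 64, 128};    // candidate PL stream widths
    for (unsigned w : widths) {
        if (w * pl_mhz >= needed) {
            return w;
        }
    }
    return 0;
}
```

For instance, under these assumptions a PL block clocked at one quarter of the AI Engine frequency needs a 128-bit path to match a single 32-bit AI Engine stream.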
The following example shows how wide stream
input_plio and output_plio attributes can be used
in a program to read input data from a file or write output data to a file. The
constructor syntax is described in more detail in Adaptive Data Flow Graph Specification Reference.
output_plio pl_out = output_plio::create("TestLogicalNameOut", plio_128_bits, "data/output.txt");
input_plio pl_in = input_plio::create("TestLogicalNameIn", plio_128_bits, "data/input.txt");
...
connect< stream, window<16 * 4> > net0 (pl_in.out, kernel_first.in);
connect< window<16 * 4>, stream > net1 (kernel_last.out, pl_out.in);
In the previous example, two 128-bit PLIO attributes are declared: one for
input and one for output. The
input_plio and output_plio objects are then hooked up to the graph in the usual way.
Data files specified in the
PLIO attributes are then automatically opened for reading the input or writing the output.
When simulating with data files, the data should be organized to accommodate both the width of the PL
block and the data type of the connecting port on the AI Engine block. For example, a data file representing
a 32-bit PL interface to an AI Engine kernel expecting
int16 should be organized as two columns
per row, where each column represents a 16-bit value. As another example, a data
file representing a 64-bit PL interface to an AI Engine kernel expecting
cint16 should be organized as four columns per row, where each column represents a 16-bit
real or imaginary value. The same 64-bit PL interface feeding an AI Engine kernel with an
int32 port would need the data organized as two columns per row of
32-bit real values. The following examples show the format of the input file for the
previously mentioned scenarios.
64-bit PL interface feeding AI Engine kernel expecting cint16
input file:
0 0 0 0
1 1 1 1
2 2 2 2

64-bit PL interface feeding AI Engine kernel expecting int32
input file:
0 0
1 1
2 2
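The column rule described above (columns per row equals the PL interface width divided by the width of one printed value) can be sketched as follows. Both helpers, columns_per_row and format_plio_rows, are hypothetical names introduced only for this illustration; they are not part of the ADF API, and the sketch only produces text in the layout shown in the examples above.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: number of text columns per row in a PLIO data file.
// scalar_bits is the width of one printed value: 16 for int16 and for each
// real or imaginary half of a cint16, 32 for int32.
inline unsigned columns_per_row(unsigned plio_bits, unsigned scalar_bits) {
    assert(scalar_bits > 0 && plio_bits % scalar_bits == 0);
    return plio_bits / scalar_bits;
}

// Lay out a flat list of scalar values as rows of space-separated columns.
// A trailing partial row, if any, is written as-is.
inline std::string format_plio_rows(const std::vector<long>& values,
                                    unsigned plio_bits, unsigned scalar_bits) {
    const unsigned cols = columns_per_row(plio_bits, scalar_bits);
    std::ostringstream out;
    for (std::size_t i = 0; i < values.size(); ++i) {
        out << values[i];
        const bool end_of_row = ((i + 1) % cols == 0) || (i + 1 == values.size());
        out << (end_of_row ? '\n' : ' ');
    }
    return out.str();
}
```

Calling format_plio_rows with the values 0 0 1 1 2 2, a 64-bit interface, and 32-bit scalars yields three rows of two columns, matching the int32 layout above.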
With these wide PLIO attribute specifications, the AI Engine compiler automatically generates the AI Engine array interface configuration to convert 64-bit or 128-bit data into a sequence of 32-bit words. The AXI4-Stream protocol followed by all PL IP blocks ensures that partial data can also be sent on a wider data path, with the appropriate strobe signals describing which words are valid.
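The width conversion described above can be pictured with a small software model. This is only a conceptual sketch, not how the array interface is implemented: the WideBeat struct, the per-word valid mask, and the choice to emit the lowest-indexed word first are all assumptions made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Conceptual model of one 128-bit PL beat: four 32-bit words plus a mask
// saying which words carry valid data (bit i set means words[i] is valid).
// The per-word mask is a modeling simplification of the strobe signals.
struct WideBeat {
    std::array<std::uint32_t, 4> words;
    std::uint8_t word_valid;
};

// Emit the valid words of each beat as a sequence of 32-bit words.
// Assumption: within a beat, the lowest-indexed word is sent first.
inline std::vector<std::uint32_t> to_32bit_stream(const std::vector<WideBeat>& beats) {
    std::vector<std::uint32_t> out;
    for (const WideBeat& b : beats) {
        for (unsigned i = 0; i < 4; ++i) {
            if (b.word_valid & (1u << i)) {
                out.push_back(b.words[i]);
            }
        }
    }
    return out;
}
```

In this model, a full beat contributes four 32-bit words and a partial beat with only two valid words contributes two, which mirrors the idea that partial data can travel on the wider path.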