Interfaces for Vitis Kernel Flow - 2021.2 English

Vitis High-Level Synthesis User Guide (UG1399)

Document ID
UG1399
ft:locale
English (United States)
Release Date
2021-12-15
Version
2021.2 English

The Vitis kernel flow provides support for compiled kernel objects (.xo) for software control from a host application and by the Xilinx Run Time (XRT). As described in Kernel Properties in the Vitis Unified Software Platform Documentation , this flow has very specific interface requirements that Vitis HLS must meet.

Vitis HLS supports memory, stream, and register interface paradigms where each paradigm follows a certain interface protocol and uses the adapter to communicate with the external world.
  • Memory Paradigm (m_axi): the data is accessed by the kernel through memory such as DDR, HBM, PLRAM/BRAM/URAM

  • Stream Paradigm (axis): the data is streamed into the kernel from another streaming source, such as video processor or another kernel, and can also be streamed out of the kernel.

  • Register Paradigm (s_axilite): The data is accessed by the kernel through register interfaces and performed by software register reads/writes.

The Vitis kernel flow implements the following interfaces by default:

C-argument type Paradigm Interface protocol (I/O/Inout)
Scalar(pass by value) Register AXI4-Lite (s_axilite)
Array Memory AXI4 Memory Mapped (m_axi)
Pointer to array Memory m_axi
Pointer to scalar Register s_axilite
Reference Register s_axilite
hls::stream Stream AXI4-Stream (axis)

As you can see from the table above, a pointer to an array is implemented as an m_axi interface for data transfer, while a pointer to a scalar is implemented using the s_axilite interface. A scalar value passed as a constant does not need read access, while a pointer to a scalar value needs both read/write access. The s_axilite interface implements an additional internal protocol depending upon the C argument type. This internal implementation can be controlled using Port-Level I/O Protocols. However, you should not modify the default port protocols in the Vitis kernel flow unless necessary.

Note: Vitis HLS will not automatically infer the default interfaces for the member elements of a struct/class when the elements require different interface types. For example, when one element of a struct requires a stream interface while another member element requires an s_axilite interface. You must explicitly define an INTERFACE pragma for each element of the struct instead of relying on the default interface assignment. If no INTERFACE pragma or directive is defined Vitis HLS will issue the following error message:
ERROR: [HLS 214-312] Vitis mode requires explicit INTERFACE 
pragmas for structs in the interface. Please add one INTERFACE pragma for each struct 
member field for argument 'd' of function 'dut(A&)' (example.cpp:19:0)

The default execution mode for Vitis kernel flow is pipelined execution, which enables overlapping execution of a kernel to improve throughput. This is specified by the ap_ctrl_chain block control protocol on the s_axilite interface.

Tip: The Vitis environment supports kernels with all of the supported block control protocols as described in Block-Level Control Protocols.

The vadd function in the following code provides an example of interface synthesis.

#define VDATA_SIZE 16

typedef struct v_datatype { unsigned int data[VDATA_SIZE]; } v_dt;

extern "C" {
void vadd(const v_dt* in1, // Read-Only Vector 1
          const v_dt* in2, // Read-Only Vector 2
          v_dt* out_r, // Output Result for Addition
          const unsigned int size // Size in integer 
) {

   unsigned int vSize = ((size - 1) / VDATA_SIZE) + 1;

   // Auto-pipeline is going to apply pipeline to this loop
   vadd1:
   for (int i = 0; i < vSize; i++) {
      vadd2:
      for (int k = 0; k < VDATA_SIZE; k++) {
         out_r[i].data[k] = in1[i].data[k] + in2[i].data[k];
      }
   }
}
}

The vadd function includes:

  • Two pointer inputs: in1 and in2
  • A pointer: out_r that the results are written to
  • A scalar value size

With the default interface synthesis settings used by Vitis HLS for the Vitis kernel flow, the design is synthesized into an RTL block with the ports and interfaces shown in the following figure.

Figure 1. RTL Ports After Default Interface Synthesis

The tool creates three types of interface ports on the RTL design to handle the flow of both data and control.

  • Clock, Reset, and Interrupt ports: ap_clk and ap_rst_n and interrupt are added to the kernel.
  • AXI4-Lite interface: s_axi_control interface which contains the scalar arguments like size, and manages address offsets for the m_axi interface, and defines the block control protocol.
  • AXI4 memory mapped interface: m_axi_gmem interface which contains the pointer arguments: in1, in2, and out_r

Details of M_AXI Interfaces for Vitis

AXI4 memory-mapped (m_axi) interfaces allow kernels to read and write data in global memory (DDR, HBM, PLRAM), Memory-mapped interfaces are a convenient way of sharing data across different elements of the accelerated application, such as between the host and kernel, or between kernels on the accelerator card. The main advantages for m_axi interfaces are listed below:
  • The interface has independent read and write channels
  • It supports burst-based accesses with potential performance of ~19 GB/s
  • It provides a queue for outstanding transactions
Understanding Burst Access
AXI4 memory-mapped interfaces support high throughput bursts of up to 4K bytes with just a single address phase. With burst mode transfers, Vitis HLS reads or writes data using a single base address followed by multiple sequential data samples, which makes this mode capable of higher data throughput. Burst mode of operation is possible when you use the C memcpy function or a pipelined for loop. Refer to Controlling AXI4 Burst Behavior or AXI Burst Transfers for more information.
Automatic Port Widening and Port Width Alignment

As discussed in Automatic Port Width Resizing, Vitis HLS has the ability to automatically widen a port width to facilitate data transfers and improve burst access, if a burst access can be seen by the tool. Therefore all the preconditions needed for bursting, as described in AXI Burst Transfers, are also needed for port resizing.

In the Vitis Kernel flow automatic port width resizing is enabled by default with the following configuration commands (notice that one command is specified as bits and the other is specified as bytes):
config_interface -m_axi_max_widen_bitwidth 512
config_interface -m_axi_alignment_byte_size 64
Rules for Offset
Important: In the Vitis kernel flow the default mode of operation is offset=direct and default_slave_interface=s_axilite and should not be changed.

The correct specification of the offset will let the HLS kernel correctly integrate into the Vitis system. Refer to Offset and Modes of Operation for more information.

Bundle Interfaces - Performance vs. Resource Utilization

By default, Vitis HLS groups function arguments with compatible options into a single m_axi interface adapter as described in M_AXI Bundles. Bundling ports into a single interface helps save device resources by eliminating AXI4 logic, which can be necessary when working in congested designs.

However, a single interface bundle can limit the performance of the kernel because all the memory transfers have to go through a single interface. The m_axi interface has independent READ and WRITE channels, so a single interface can read and write simultaneously, though only at one location. Using multiple bundles lets you increase the bandwidth and throughput of the kernel by creating multiple interfaces to connect to memory banks.

Details of S_AXILITE Interfaces for Vitis

In C++, a function starts to process data when the function is called from a parent function. The function call is pushed onto the stack when called, and removed from the stack when processing is complete to return control to the calling function. This process ensures the parent knows the status of the child.

Since the host and kernel occupy two separate compute spaces in the Vitis kernel flow, the "stack" is managed by the Xilinx Run Time (XRT), and communication is managed through the s_axilite interface. The kernel is software controlled through XRT by reading and writing the control registers of an s_axilite interface as described in S_AXILITE Control Register Map. The interface provides the following features:

Control Protocols
The block control protocol defines control registers in the s_axilite interface that let you set control signals to manage execution and operation of the kernel.
Scalar Arguments
Scalar inputs on a kernel are typical, and can be thought of as programming constants or parameters. The host application transfers these values through the s_axilite interface.
Pointers to Scalar Arguments
Vitis HLS lets you read to or write from a pointer to a scalar value when assigned to an s_axilite interface. Pointers are assigned by default to m_axi interfaces, so this requires you to manually assign the pointer to the s_axilite using the INTERFACE pragma or directive:
int top(int *a, int *b) {
#pragma HLS interface s_axilite port=a
Rules for Offset
Note: The Vitis kernel flow determines the required offsets. Do not specify the offset option in that flow.
Rules for Bundle
The Vitis kernel flow supports only a single s_axilite interface, which means that all s_axilite interfaces must be bundled together.
  • When no bundle is specified the tool automatically creates a default bundle named Control.
  • If for some reason you want to manually specify the bundle name, you must apply the same bundle to all s_axilite interfaces to create a single bundle.

Details of AXIS Interfaces for Vitis

The AXI4-Stream protocol (AXIS) defines a single uni-directional channel for streaming data in a sequential manner. The AXI4-Stream interfaces can burst an unlimited amount of data, which significantly improves performance. Unlike the AXI4 memory-mapped interface which needs an address to read/write the memory, the AXIS interface simply passes data to another AXIS interface without needing an address, and so uses fewer device resources. Combined, these features make the streaming interface a light-weight high performance interface.

The AXI4-Stream works on an industry-standard ready/valid handshake between a producer and consumer, as shown in the figure below. The data transfer is started once the producer sends the TVALID signal, and the consumer responds by sending the TREADY signal. This handshake of data and control should continue until either TREADY or TVALID are set low, or the producer asserts the TLAST signal indicating it is the last data packet of the transfer.

Figure 2. AXI4-Stream Handshake
Important: The AXIS interface can only be assigned to the top-level arguments (ports) of a kernel or IP, and cannot be assigned to the arguments of functions internal to the design. Streaming channels used inside the HLS design should use hls::stream and not an AXIS interface.

You should define the streaming data type using hls::stream<T_data_type>, and use the ap_axis struct type to implement the AXIS interface. As explained in AXI4-Stream Interfaces the ap_axis struct lets you choose the implementation of the interface as with or without side-channels:

Tip: You should not define your own struct for modeling the AXIS signals (side channels, TLAST, TVALID). Instead you can overload the TDATA signal for implementing your data type .