An HLS function is first and foremost a function. It has a predetermined number of inputs and outputs, and every time it is invoked it consumes its inputs and produces the predetermined number of outputs. If an HLS function imported using xmcImportFunction hangs (for example, because it contains an infinite loop), Simulink also hangs, waiting indefinitely for the output of the imported block. This is because a function imported using xmcImportFunction runs on the same thread as Simulink.
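As an illustration of the "function" model described above, the following is a minimal sketch in plain C++ of a function with a fixed number of inputs and outputs and a bounded loop. The names add_scaled, kNumSamples, and example_invocation are hypothetical and chosen only for this example. The sketch does not use any HLS headers or Model Composer APIs.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical block size, chosen only for this illustration.
constexpr std::size_t kNumSamples = 4;

// A plain function in the style described above: two fixed-size inputs,
// one scalar input, and one fixed-size output. Every invocation reads all
// of its inputs, writes all of its outputs, and then returns.
void add_scaled(const int a[kNumSamples], const int b[kNumSamples],
                int gain, int out[kNumSamples]) {
    // The trip count is a compile-time constant, so the loop always
    // terminates and the whole output array is always filled.
    for (std::size_t i = 0; i < kNumSamples; ++i) {
        out[i] = a[i] + gain * b[i];
    }
}

// Example invocation: one call consumes the inputs and yields the outputs.
void example_invocation() {
    const int a[kNumSamples] = {1, 2, 3, 4};
    const int b[kNumSamples] = {10, 20, 30, 40};
    int out[kNumSamples] = {0, 0, 0, 0};
    add_scaled(a, b, 2, out);
    assert(out[0] == 21);  // 1 + 2 * 10
    assert(out[3] == 84);  // 4 + 2 * 40
}
```

Because the caller waits for the call to return, a function of this kind that never returned (for example, with an unbounded loop in place of the for loop) would stall its caller in the same way.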
HLS Kernels are IPs
When you import an HLS function into a design by itself, the HLS function will not operate as an IP with streaming ports. In Vitis Model Composer, you need to enclose the HLS function in a subsystem (perhaps along with other HLS blocks) and use the Interface Spec block to designate streaming ports for the design. You can then generate an HLS IP.
Unlike an imported HLS function, an HLS Kernel is a proper HLS IP that can be used in AMD Vitis™ HLS and synthesized directly. The following code snippet shows HLS kernel code with a streaming interface.
hls_kernel.cc
void hls_kernel_blk(
    hls::stream<ap_axis<64, 0, 0, 0> > & in_sample1,
    hls::stream<ap_axis<64, 0, 0, 0> > & in_sample2,
    hls::stream<ap_axis<64, 0, 0, 0> > & out0_itr1,
    hls::stream<ap_axis<64, 0, 0, 0> > & out1_itr1
)
{
    #pragma HLS PIPELINE II=1
    #pragma HLS INTERFACE ap_ctrl_none port=return
    #pragma HLS INTERFACE axis register both port=out1_itr1
    #pragma HLS INTERFACE axis register both port=out0_itr1
    #pragma HLS INTERFACE axis register both port=in_sample1
    #pragma HLS INTERFACE axis register both port=in_sample2
    ap_int64 in_samp0; // Iteration-1: 2 complex samples concatenated to 64-bit
    ap_int64 in_samp1; // Iteration-2: 2 complex samples concatenated to 64-bit
...
In this example, notice the function signature and also the HLS pragmas specifying the interface on the ports. This function has all the constructs required by the HLS IP. You can directly import the code above into Model Composer using the HLS Kernel block, and then simulate it.
An HLS Kernel block can accept variable-size signals, which allows it to connect to AI Engine blocks, and it also produces variable-size output signals. Unlike a block imported using xmcImportFunction, the HLS Kernel block runs on a separate thread from Simulink. As a result, a blocking read call in the HLS kernel code does not cause Simulink to hang when the input variable-size signal is empty; instead, the block produces an empty variable-size output signal.
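The empty-in, empty-out behavior described above can be pictured with a small conceptual model in plain C++. This is only a sketch under stated assumptions and is not how Model Composer or hls::stream is implemented: SampleQueue stands in for a stream, kernel_step stands in for one simulation step of the block, and both names are hypothetical. The model does not use threads; it only shows that the amount of output tracks the amount of input available.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Stand-in for a stream of samples. This is NOT hls::stream.
using SampleQueue = std::deque<long long>;

// Conceptual model of one step of a streaming block with two inputs.
// It consumes one sample from each input for as long as both inputs have
// data, and returns the results as a variable-size output.
std::vector<long long> kernel_step(SampleQueue& in0, SampleQueue& in1) {
    std::vector<long long> out;
    while (!in0.empty() && !in1.empty()) {
        out.push_back(in0.front() + in1.front());
        in0.pop_front();
        in1.pop_front();
    }
    return out;  // Empty when either input had no samples.
}

// Example: a step with data, followed by a step with empty inputs.
void example_steps() {
    SampleQueue a = {1, 2, 3};
    SampleQueue b = {10, 20, 30};
    const std::vector<long long> first = kernel_step(a, b);
    assert(first.size() == 3);
    assert(first[0] == 11);

    // Both queues are now drained, so the next step yields no output
    // instead of waiting for data.
    const std::vector<long long> second = kernel_step(a, b);
    assert(second.empty());
}
```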