Hardware System Architecture - 2023.2 English

Vitis Unified Software Platform Documentation: Application Acceleration Development (UG1393)

Document ID: UG1393
Release Date: 2023-12-13
Version: 2023.2 English

Both the system architecture and the accelerator architecture can be fully specified using an accelerator class derived from the VPP_ACC class. In the example code, the conv_acc class fully defines the convolution filter's system architecture, compute composition, number of CUs, and compute connectivity. The following system architecture is generated.

Figure 1. Hardware Architecture

The conv_acc class specifies that the krnl_conv() function is to be accelerated by wrapping it inside the compute() function. Because a video has three color channels that can be processed independently using the same functionality, the accelerator class also specifies three CUs. These CUs are replicated and connected to the memory system through data movers, as specified by the ACCESS_PATTERN macro. On the host side, the figure shows two separate threads, one for sending and one for receiving, which interact with the hardware through lower-level runtime drivers from VSC.
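The channel-level parallelism described above can be illustrated with a small host-only sketch. This is plain C++, not VSC code: `Plane`, `process_channel()`, and `process_frame()` are hypothetical names introduced only for illustration, and the per-pixel inversion is a placeholder for the real convolution. The loop runs the three channels one after another; in the generated hardware, the three CUs would handle them concurrently.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// One color plane of a frame (hypothetical type, not part of VSC).
using Plane = std::vector<std::uint8_t>;

// Hypothetical stand-in for the work one CU performs on one channel.
// It inverts each pixel; the real example runs krnl_conv() instead.
Plane process_channel(const Plane& src) {
    Plane dst(src.size());
    for (std::size_t i = 0; i < src.size(); ++i) {
        dst[i] = static_cast<std::uint8_t>(255 - src[i]);
    }
    return dst;
}

// Applies the same function to each of the three planes.
// No plane reads another plane's data, which is the property that
// allows three identical CUs to work on the channels independently.
std::array<Plane, 3> process_frame(const std::array<Plane, 3>& src) {
    std::array<Plane, 3> dst;
    for (std::size_t c = 0; c < src.size(); ++c) {
        dst[c] = process_channel(src[c]);
    }
    return dst;
}
```

Because each call to `process_channel()` touches only its own plane, the loop iterations could be distributed across parallel workers without any synchronization between them.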

The CU functionality must be defined through the compute() function in a separate .cpp file. In this example there is only one processing element (PE), so compute() simply wraps it. The following code snippet shows part of the .cpp file; the details of the sub-functions are omitted here. Refer to Supported Platforms and Startup Examples for more complete examples.

// the compute() function wraps the processing function in this example
void conv_acc::compute(
        char                 *coeffs,
        float                factor,
        short                bias,
        unsigned short       width,
        unsigned short       height,
        unsigned char        *src,
        unsigned char        *dst)
{
    krnl_conv(coeffs,factor,bias,width,height,src,dst);
}
 
// ... the processing function implements the CU functionality
void conv_acc::krnl_conv(
        char                 *coeffs,
        float                factor,
        short                bias,
        unsigned short       width,
        unsigned short       height,
        unsigned char        *src,
        unsigned char        *dst) {
    #pragma HLS DATAFLOW
 
    hls::stream<window,3>  window_stream; // Stream of windows with a depth of 3
 
    // Read incoming pixels and form valid HxV windows
    Window2D(width, height, src, window_stream);
 
    // Process incoming stream of pixels, and stream pixels out
    Filter2D(width, height, factor, bias, coeffs, window_stream, dst);
}
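Since the sub-functions are not shown, the following is a minimal software-only sketch of how a two-stage pipeline of this shape could behave. It is not the Window2D() or Filter2D() implementation and does not use HLS headers. The names `Window3x3`, `make_window()`, `apply_filter()`, `conv_model()`, and `kStreamDepth` are hypothetical, and several details are assumptions made for illustration: a 3x3 window, zero padding at the image borders, and an output computed as factor * sum + bias clamped to the range 0 to 255. A std::queue limited to three entries stands in for the depth-3 stream.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <queue>
#include <vector>

// Hypothetical 3x3 window type; the `window` type of the source is not shown.
struct Window3x3 {
    std::uint8_t pix[3][3];
};

// Assumed FIFO capacity, mirroring the depth of 3 given to the stream.
const std::size_t kStreamDepth = 3;

// Producer stage model: zero-padded 3x3 neighborhood around pixel (x, y).
Window3x3 make_window(const std::vector<std::uint8_t>& src,
                      int width, int height, int x, int y) {
    Window3x3 w{};
    for (int dy = -1; dy <= 1; ++dy) {
        for (int dx = -1; dx <= 1; ++dx) {
            const int yy = y + dy;
            const int xx = x + dx;
            const bool inside = yy >= 0 && yy < height && xx >= 0 && xx < width;
            w.pix[dy + 1][dx + 1] = inside ? src[yy * width + xx] : 0;
        }
    }
    return w;
}

// Consumer stage model: weighted sum, scaled, offset, clamped to 0..255.
std::uint8_t apply_filter(const Window3x3& w, const signed char* coeffs,
                          float factor, short bias) {
    int sum = 0;
    for (int r = 0; r < 3; ++r) {
        for (int c = 0; c < 3; ++c) {
            sum += coeffs[r * 3 + c] * w.pix[r][c];
        }
    }
    const int value = static_cast<int>(factor * sum) + bias;
    return static_cast<std::uint8_t>(std::min(255, std::max(0, value)));
}

// Interleaves the two stages so the FIFO never holds more than
// kStreamDepth windows: produce while there is room, otherwise consume.
std::vector<std::uint8_t> conv_model(const signed char* coeffs, float factor,
                                     short bias, int width, int height,
                                     const std::vector<std::uint8_t>& src) {
    const int total = width * height;
    std::vector<std::uint8_t> dst;
    dst.reserve(static_cast<std::size_t>(total));
    std::queue<Window3x3> fifo;
    int produced = 0;
    int consumed = 0;
    while (consumed < total) {
        if (produced < total && fifo.size() < kStreamDepth) {
            fifo.push(make_window(src, width, height,
                                  produced % width, produced / width));
            ++produced;
        } else {
            dst.push_back(apply_filter(fifo.front(), coeffs, factor, bias));
            fifo.pop();
            ++consumed;
        }
    }
    return dst;
}
```

With a kernel whose only nonzero coefficient is a 1 at the center, a factor of 1.0, and a bias of 0, `conv_model()` returns the input image unchanged, which is a quick sanity check for the model. In hardware the two stages run concurrently under the DATAFLOW pragma; the single loop here only imitates the back-pressure that a bounded stream imposes on the producer.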