Kernel Optimizations

Vitis Unified Software Platform Documentation: Application Acceleration Development (UG1393)

Document ID
English (United States)
Release Date
2021.2 English

Because the kernel is running in programmable logic on the target platform, optimizing your task to the environment is an important element of application design. Most of the optimization techniques discussed in C/C++ Kernels can be applied to OpenCL kernels. Instead of applying the HLS pragmas used for C/C++ kernels, you will use the __attribute__ keyword described in OpenCL Attributes. Following is an example:

// Process the whole image 
image_traverse: for (uint idx = 0, x = 0 , y = 0  ; idx < size ; ++idx, x+= DATA_SIZE)

The example above specifies that the for loop, image_traverse, should be pipelined to improve the performance of the kernel. The target II in this case is 1. For more information, refer to xcl_pipeline_loop.

In the following code example, the watermark function uses the opencl_unroll_hint attribute to let the Vitis compiler unroll the loop to reduce latency and improve performance. However, in this case the __attribute__ is only a suggestion that the compiler can ignore if needed. For details, refer to opencl_unroll_hint.

//Unrolling below loop to process all 16 pixels concurrently
watermark: for ( int i = 0 ; i < DATA_SIZE ; i++)

For more information, review the OpenCL Attributes topics to see what specific optimizations are supported for OpenCL kernels, and review the C/C++ Kernels content to see how these optimizations can be applied in your kernel design.