By default, the linker builds a single hardware instance from a kernel. If the host program executes the same kernel multiple times, for instance because of data processing requirements, the kernel must run on the hardware accelerator sequentially, which can limit overall application performance. However, you can customize the kernel linking stage to instantiate multiple hardware compute units (CUs) from a single kernel. This can improve performance because the host program can then make multiple overlapping kernel calls, which execute concurrently on separate compute units.
Multiple CUs of a kernel can be created by using the connectivity.nk option in the config file during linking. Edit a config file to include the needed options, and specify it on the v++ command line with the --config option, as described in v++ Command.
For example, for the vadd kernel, two hardware instances can be implemented in the config file as follows:
[connectivity]
#nk=<kernel name>:<number>:<cu_name>,<cu_name>...
nk=vadd:2
Where:
- <kernel name>: Specifies the name of the kernel to instantiate multiple times.
- <number>: The number of kernel instances, or CUs, to implement in hardware.
- <cu_name>,<cu_name>...: Specifies the instance names for the specified number of instances. This is optional; when it is not specified, each CU name defaults to the kernel name with a numeric suffix (for example, vadd_1). Notice that the delimiter between instance names is a comma.
In the example above, the number of CUs is specified, but not the cu_name. In this case, vadd_1 and vadd_2 will be added to the design.
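To name the CUs yourself instead of accepting the defaults, the optional third field of nk can be filled in. The following is a minimal sketch that writes such a config file from the shell, assuming the comma-delimited form described above. The file name vadd_named.cfg and the CU names vadd_a and vadd_b are hypothetical, chosen only for illustration.

```shell
# Write a config file that requests two CUs of the vadd kernel
# and names them explicitly.
# vadd_named.cfg, vadd_a and vadd_b are hypothetical names for this sketch.
cat > vadd_named.cfg <<'EOF'
[connectivity]
nk=vadd:2:vadd_a,vadd_b
EOF

# Print the generated file to confirm its contents.
cat vadd_named.cfg
```

The number of names listed should match the <number> field, so that every instance receives a name.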
The config file is then specified on the v++ command line:
v++ --config vadd_config.cfg ...
You can check the results by using the xclbinutil command to examine the contents of the xclbin file. Refer to xclbinutil Utility.
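As a rough sketch of that check, the commands below ask xclbinutil for a summary report of a linked xclbin, in which you can look for the CU instance names. The path vadd.xclbin is a hypothetical placeholder for the file produced by your own link step, and the guards around the call are only there so the snippet degrades gracefully when the tool or the file is absent.

```shell
# Hypothetical xclbin path; replace it with the output of your v++ link step.
XCLBIN="${XCLBIN:-vadd.xclbin}"

if ! command -v xclbinutil >/dev/null 2>&1; then
    STATUS="xclbinutil not found on PATH"
elif [ ! -f "$XCLBIN" ]; then
    STATUS="missing $XCLBIN"
elif xclbinutil --info --input "$XCLBIN"; then
    # --info prints a summary report of the xclbin contents.
    STATUS="inspected $XCLBIN"
else
    STATUS="xclbinutil failed on $XCLBIN"
fi

echo "$STATUS"
```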