Currently, Xilinx devices on Data Center accelerator cards use stacked silicon consisting of several Super Logic Regions (SLRs) to provide device resources, including global memory. For best performance, when you assign ports to global memory banks, as described in Mapping Kernel Ports to Memory, assign each kernel instance, or compute unit (CU), to the same SLR as the global memory it is connected to. This assignment is manual. More generally, if your platform or device supports multiple SLRs, you should assign CUs to specific SLRs to improve placement and timing results.
A CU can be assigned to an SLR using the connectivity.slr option in a config file. The syntax of the connectivity.slr option in the config file is as follows:
[connectivity]
#slr=<compute_unit_name>:<slr_ID>
slr=vadd_1:SLR2
slr=vadd_2:SLR3
<compute_unit_name> is an instance name of the CU as determined by the connectivity.nk option, described in Creating Multiple Instances of a Kernel, or is simply <kernel_name>_1 if multiple CUs are not specified.
<slr_ID> is the SLR number to which the CU is assigned, in the form SLR0, SLR1,...
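As a minimal sketch of the syntax above, the following Python helper builds a [connectivity] section from a mapping of CU names to SLR IDs and rejects IDs that are not of the form SLR0, SLR1, and so on. The helper name connectivity_slr_section is hypothetical and is not part of v++; it only formats text and does not check that the SLR exists on your platform.

```python
import re

# Hypothetical helper for illustration; not part of the v++ tool.
# Accepts SLR IDs of the form SLR0, SLR1, ...
SLR_ID = re.compile(r"SLR\d+")


def connectivity_slr_section(assignments):
    """Return config text with one slr= line per CU in `assignments`."""
    lines = ["[connectivity]"]
    for cu_name, slr_id in assignments.items():
        if not SLR_ID.fullmatch(slr_id):
            raise ValueError(f"bad SLR ID for {cu_name}: {slr_id!r}")
        # Each CU gets its own slr=<compute_unit_name>:<slr_ID> line.
        lines.append(f"slr={cu_name}:{slr_id}")
    return "\n".join(lines) + "\n"


# Same assignments as the vadd example above.
config_text = connectivity_slr_section({"vadd_1": "SLR2", "vadd_2": "SLR3"})
print(config_text, end="")
```

The returned text can be written to a config file such as config_slr.cfg. Generating one slr= line per CU mirrors the requirement that each CU be assigned separately.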
The assignment of a CU to an SLR must be specified for each CU separately, and is recommended when the platform contains multiple SLRs. If an assigned CU is connected to global memory located in another SLR, the tool will automatically insert SLR crossing registers to help with timing closure. In the absence of an SLR assignment, the v++ linker is free to assign the CU to any SLR.
After editing the config file to include the SLR assignments, you can use it during the v++ linking process by specifying the config file using the --config option:
v++ -l --config config_slr.cfg ...