Sub-Devices - 2021.2 English

Vitis Unified Software Platform Documentation: Application Acceleration Development (UG1393)

Document ID: UG1393
Release Date: 2022-03-29
Version: 2021.2 English

In the Vitis core development kit, a device can contain multiple kernel instances, either of a single kernel or of different kernels. Although the OpenCL API clCreateSubDevices allows the host code to divide a device into multiple sub-devices, the Vitis core development kit supports equally divided sub-devices (using CL_DEVICE_PARTITION_EQUALLY), each containing one kernel instance.
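As a rough illustration of what equal partitioning yields, the OpenCL specification describes CL_DEVICE_PARTITION_EQUALLY as splitting a device into as many sub-devices as can be created with n compute units each, leaving any remainder unused. The helper below is a minimal sketch of that arithmetic; the function name is hypothetical and is not part of OpenCL or XRT, and it assumes each kernel instance is exposed as one OpenCL compute unit.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an OpenCL or XRT API): number of sub-devices to
// expect from CL_DEVICE_PARTITION_EQUALLY when the parent device has
// `compute_units` compute units and each sub-device holds
// `units_per_sub_device` of them. Leftover compute units are not used.
inline std::uint32_t expected_sub_device_count(std::uint32_t compute_units,
                                               std::uint32_t units_per_sub_device) {
    assert(units_per_sub_device > 0 && "partition size must be at least 1");
    return compute_units / units_per_sub_device;  // integer division drops the remainder
}
```

With a partition size of 1, as used in this topic, the result equals the number of compute units, so under the stated assumption a device with four kernel instances would be expected to yield four sub-devices.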

The following example shows:

  1. Creating sub-devices by equal partition so that each sub-device executes one kernel instance.
  2. Iterating over the sub-device list and using a separate context and command queue to execute the kernel on each sub-device.
  3. Calling the function run_cu for each sub-device. For simplicity, the API calls for kernel execution (and the corresponding buffer handling) are not shown; they would be placed inside run_cu.
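  // Assumes the OpenCL header (for example <CL/cl.h>), <algorithm>, and <vector>
  // are included, and that device (cl_device_id) and kernel (cl_kernel) already exist.
  cl_uint num_devices = 0;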
  cl_device_partition_property props[3] = {CL_DEVICE_PARTITION_EQUALLY,1,0};
  
  // Get the number of sub-devices
  clCreateSubDevices(device,props,0,nullptr,&num_devices);  
  
  // Container to hold the sub-devices
  std::vector<cl_device_id> devices(num_devices);  

  // Second call of clCreateSubDevices    
  // We get sub-device handles in devices.data()
  clCreateSubDevices(device,props,num_devices,devices.data(),nullptr); 

  // Iterating over sub-devices
  std::for_each(devices.begin(),devices.end(),[kernel](cl_device_id sdev) {
      cl_int err = CL_SUCCESS; // Status code returned by the calls below

      // Context for sub-device
      auto context = clCreateContext(0,1,&sdev,nullptr,nullptr,&err);

      // Command-queue for sub-device
      auto queue = clCreateCommandQueue(context,sdev,
          CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE,&err);

      // Execute the kernel on the sub-device using local context and queue
      run_cu(context,queue,kernel); // Function not shown
  });
Important: As shown in the example, you must create a separate context for each sub-device. Though OpenCL supports a context that can hold multiple devices and sub-devices, XRT requires each device and sub-device to have a separate context.
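The example calls clCreateSubDevices twice, first to query the number of sub-devices and then to fetch their handles, and it does not check either return code. The following is a minimal sketch of that two-call idiom with status checks, written generically so it can be exercised without an OpenCL runtime. The names query_then_fetch and create are hypothetical; create stands in for a wrapper around clCreateSubDevices and is assumed to return 0 on success, as CL_SUCCESS does.

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper illustrating the "query the count, then fetch the
// handles" idiom. `create` is assumed to have the shape
//   int create(std::uint32_t capacity, Handle* out, std::uint32_t* count)
// and to return 0 on success.
template <typename Handle, typename CreateFn>
std::vector<Handle> query_then_fetch(CreateFn create) {
    std::uint32_t count = 0;

    // First call: no output buffer, only ask how many handles are available.
    if (create(0, static_cast<Handle*>(nullptr), &count) != 0)
        throw std::runtime_error("count query failed");

    std::vector<Handle> handles(count);
    if (count == 0)
        return handles;  // Nothing to fetch

    // Second call: pass a buffer of exactly `count` entries to be filled.
    if (create(count, handles.data(), static_cast<std::uint32_t*>(nullptr)) != 0)
        throw std::runtime_error("handle fetch failed");

    return handles;
}
```

In host code, create could be a lambda that forwards its three arguments to clCreateSubDevices(device, props, capacity, out, count), with Handle set to cl_device_id. Each returned sub-device would still need its own context, as the note above requires.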