The work-group size in the OpenCL API standard defines the size of the ND range space that can be handled by a single invocation of a kernel compute unit. When OpenCL kernels are submitted for execution on an OpenCL device, they execute within an index space, called an ND range, which can have 1, 2, or 3 dimensions.
OpenCL kernel functions are executed exactly one time for each point in the ND range index space. This unit of work for each point in the ND range is called a work-item. Unlike for loops in C, where loop iterations are executed sequentially and in order, an OpenCL runtime and device are free to execute work-items in parallel and in any order.
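The "one invocation per point" rule can be illustrated with a small host-side emulation in plain C. This is only a sketch, not OpenCL API code: `work_item` and `run_nd_range_2d` are hypothetical names introduced here, and the nested loops visit the points sequentially, whereas a real device may run them in parallel and in any order.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a kernel body: it records that the work-item
   at point (x, y) ran by incrementing that point's counter. */
static void work_item(size_t x, size_t y, size_t width, int *visits)
{
    visits[y * width + x] += 1;
}

/* Emulates a 2-D ND range of width * height points on the host.
   The loops below are sequential; an OpenCL device gives no such
   ordering guarantee. Returns the number of work-items executed. */
static size_t run_nd_range_2d(size_t width, size_t height, int *visits)
{
    size_t count = 0;

    assert(visits != NULL);
    for (size_t y = 0; y < height; ++y) {
        for (size_t x = 0; x < width; ++x) {
            work_item(x, y, width, visits);  /* one call per index point */
            ++count;
        }
    }
    return count;
}
```

Calling `run_nd_range_2d(4, 3, visits)` on a zero-initialized array of 12 counters executes 12 work-items and leaves every counter at 1, which is the property the kernel execution model guarantees regardless of ordering.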
Work-items are organized into work-groups, which are the units of work scheduled onto compute units. The optional WORK_GROUP_SIZE_HINT attribute is part of the OpenCL Language Specification, and is a hint to the compiler that indicates the work-group size value most likely to be specified by the local_work_size argument to clEnqueueNDRangeKernel. This allows the compiler to optimize the generated code according to the expected value.
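To make the relationship between the ND range, local_work_size, and work-groups concrete, the following plain C sketch shows how one dimension is split into work-groups. The names `work_item_pos`, `num_work_groups`, and `locate` are hypothetical helpers, not part of the OpenCL API. The sketch assumes a zero global offset and a global size that is an exact multiple of the local size.

```c
#include <assert.h>
#include <stddef.h>

/* Position of a work-item within one dimension of the ND range. */
typedef struct {
    size_t group_id;  /* which work-group the work-item belongs to */
    size_t local_id;  /* index of the work-item inside that work-group */
} work_item_pos;

/* Number of work-groups in one dimension.
   Assumption: global_size is evenly divisible by local_size. */
static size_t num_work_groups(size_t global_size, size_t local_size)
{
    assert(local_size > 0 && global_size % local_size == 0);
    return global_size / local_size;
}

/* Maps a global work-item ID to its work-group and local ID.
   Assumption: the global offset is zero. */
static work_item_pos locate(size_t global_id, size_t local_size)
{
    work_item_pos p;

    assert(local_size > 0);
    p.group_id = global_id / local_size;
    p.local_id = global_id % local_size;
    return p;
}
```

For example, a global size of 1024 with a local size of 64 gives 16 work-groups, and the work-item with global ID 130 is local ID 2 in work-group 2. The local size in this sketch is the value that work_group_size_hint is meant to anticipate.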
Place this attribute before the kernel definition, or before the primary function specified for the kernel:
__attribute__((work_group_size_hint(<X>, <Y>, <Z>)))
- <X>, <Y>, <Z>: Specifies the expected work-group size in each of the three dimensions, that is, the values most likely to be passed in the local_work_size argument. This describes the size of a single work-group, not the size of the full ND range.
The following example is a hint to the compiler that the kernel will most likely be executed with a work-group size of 1 in each dimension, that is, a single work-item per work-group.
__attribute__((work_group_size_hint(1, 1, 1))) __kernel void ...