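Suppose a kernel produces a different amount of data depending on its input. For example, the output size of a compression engine varies with the patterns and similarity in the input data. The host can still use clEnqueueMigrateMemObjects
to read back the entire output buffer, but this is not optimal because it can transfer more data than necessary. Ideally, the host program reads exactly as much data as the kernel wrote.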
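One approach is for the kernel to write the amount of output data at the start of the output buffer, before the output data itself. The host application can then call clEnqueueReadBuffer twice: the first call reads the amount of data returned, and the second uses that value to read exactly the amount of data the kernel returned.
// Blocking read (CL_TRUE): kernel_write_size must hold its final value on the
// host before it is passed by value as the size argument of the second read.
clEnqueueReadBuffer(command_queue, device_write_ptr, CL_TRUE, 0, sizeof(int) * 1,
                    &kernel_write_size, 0, nullptr, nullptr);
// Read exactly kernel_write_size bytes, starting at the data offset.
clEnqueueReadBuffer(command_queue, device_write_ptr, CL_FALSE, DATA_READ_OFFSET,
                    kernel_write_size, host_ptr, 0, nullptr, &data_read_event);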
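The two-step read can be tried out without a device. The following is a minimal host-only sketch, not real OpenCL code: mock_read and mock_kernel are hypothetical stand-ins for clEnqueueReadBuffer and the kernel, a std::vector plays the role of the device buffer, and the value chosen for DATA_READ_OFFSET is an assumption made only for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <utility>
#include <vector>

// Assumed layout: an int size header at offset 0, payload at DATA_READ_OFFSET.
// The value 64 is illustrative only.
constexpr std::size_t DATA_READ_OFFSET = 64;

// Hypothetical stand-in for a device-to-host read: copies `size` bytes starting
// at `offset` and returns the number of bytes "transferred".
std::size_t mock_read(const std::vector<std::uint8_t>& device_buf,
                      std::size_t offset, std::size_t size, void* dst) {
    if (size != 0) {
        std::memcpy(dst, device_buf.data() + offset, size);
    }
    return size;
}

// Hypothetical stand-in for the kernel: writes the payload size first,
// then the payload itself at DATA_READ_OFFSET.
void mock_kernel(std::vector<std::uint8_t>& device_buf,
                 const std::vector<std::uint8_t>& payload) {
    const int size = static_cast<int>(payload.size());
    std::memcpy(device_buf.data(), &size, sizeof(size));
    if (!payload.empty()) {
        std::memcpy(device_buf.data() + DATA_READ_OFFSET, payload.data(),
                    payload.size());
    }
}

struct ReadResult {
    std::vector<std::uint8_t> data;   // payload as seen by the host
    std::size_t bytes_transferred;    // header bytes + payload bytes
};

// Two-step read: fetch the header, then exactly the bytes the kernel reported.
ReadResult read_exact(const std::vector<std::uint8_t>& device_buf) {
    int kernel_write_size = 0;
    std::size_t moved = mock_read(device_buf, 0, sizeof(int), &kernel_write_size);
    // The size is known on the host here, before the second read is issued.
    std::vector<std::uint8_t> host(static_cast<std::size_t>(kernel_write_size));
    moved += mock_read(device_buf, DATA_READ_OFFSET, host.size(), host.data());
    return {std::move(host), moved};
}
```

With a 1024-byte buffer and a 5-byte payload, read_exact moves sizeof(int) + 5 bytes instead of the full 1024, which is the saving the two-step read is after.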
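clEnqueueMigrateMemObjects is recommended over clEnqueueReadBuffer or clEnqueueWriteBuffer, and you can take a similar approach with it by using sub-buffers, as shown in the following code sample.
Tip: This code sample shows only some of the commands and is intended as a conceptual demonstration.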
//Create a small sub-buffer to read the quantity of data
cl_buffer_region buffer_info_1={0,1*sizeof(int)};
cl_mem size_info = clCreateSubBuffer (device_write_ptr, CL_MEM_WRITE_ONLY,
CL_BUFFER_CREATE_TYPE_REGION, &buffer_info_1, &err);
// Map the sub-buffer into the host space
auto size_info_host_ptr = clEnqueueMapBuffer(queue, size_info,,,, );
// Read only the sub-buffer portion
clEnqueueMigrateMemObjects(queue, 1, &size_info, CL_MIGRATE_MEM_OBJECT_HOST,,,);
// Retrieve size information from the already mapped size_info_host_ptr
kernel_write_size = ...........
// Create sub-buffer to read the required amount of data
cl_buffer_region buffer_info_2={DATA_READ_OFFSET, kernel_write_size};
cl_mem buffer_seg = clCreateSubBuffer (device_write_ptr, CL_MEM_WRITE_ONLY,
CL_BUFFER_CREATE_TYPE_REGION, &buffer_info_2,&err);
// Map the sub-buffer into the host space
auto read_mem_host_ptr = clEnqueueMapBuffer(queue, buffer_seg,,,);
// Migrate the sub-buffer
clEnqueueMigrateMemObjects(queue, 1, &buffer_seg, CL_MIGRATE_MEM_OBJECT_HOST,,,);
// Now use the read data from already mapped read_mem_host_ptr
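One practical detail when choosing the two regions: OpenCL rejects a sub-buffer whose origin is not aligned to the device's CL_DEVICE_MEM_BASE_ADDR_ALIGN value (reported in bits), and it rejects a region of size zero. The sketch below shows one way to validate a region before calling clCreateSubBuffer. It is a host-only illustration under stated assumptions: BufferRegion is a hypothetical mirror of cl_buffer_region so the code builds without OpenCL headers, and the helper names are not part of any API.

```cpp
#include <cstddef>

// Hypothetical mirror of cl_buffer_region (origin and size, both in bytes).
// Real host code would use cl_buffer_region directly.
struct BufferRegion {
    std::size_t origin;
    std::size_t size;
};

// Region covering only the size header at the start of the buffer.
BufferRegion size_header_region() {
    return {0, sizeof(int)};
}

// True when `origin` is a multiple of the base address alignment.
// `align_bits` is expressed in bits, the unit of CL_DEVICE_MEM_BASE_ADDR_ALIGN.
bool origin_is_aligned(std::size_t origin, std::size_t align_bits) {
    const std::size_t align_bytes = align_bits / 8;
    return align_bytes == 0 || origin % align_bytes == 0;
}

// Fills `out` with the payload region and returns true, or returns false when
// the region is empty, misaligned, or extends past the parent buffer.
bool make_payload_region(std::size_t data_offset, std::size_t kernel_write_size,
                         std::size_t parent_size, std::size_t align_bits,
                         BufferRegion* out) {
    if (kernel_write_size == 0) {
        return false;  // a zero-length region cannot become a sub-buffer
    }
    if (!origin_is_aligned(data_offset, align_bits)) {
        return false;  // origin must respect the device alignment
    }
    if (data_offset > parent_size || kernel_write_size > parent_size - data_offset) {
        return false;  // region must lie inside the parent buffer
    }
    out->origin = data_offset;
    out->size = kernel_write_size;
    return true;
}
```

In real host code, align_bits would come from clGetDeviceInfo with CL_DEVICE_MEM_BASE_ADDR_ALIGN, and a rejected region is a cue to fall back to reading the whole buffer or to pick an aligned DATA_READ_OFFSET when laying out the kernel output.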