Partitioning Arrays to Improve Pipelining - 2021.2 English

Vitis High-Level Synthesis User Guide (UG1399)

Document ID: UG1399
Release Date: 2021-12-15
Version: 2021.2 English

A common issue when pipelining functions is the following message:

INFO: [SCHED 204-61] Pipelining loop 'SUM_LOOP'.
WARNING: [SCHED 204-69] Unable to schedule 'load' operation ('mem_load_2', 
bottleneck.c:62) on array 'mem' due to limited memory ports.
WARNING: [SCHED 204-69] The resource limit of core:RAM:mem:p0 is 1, current 
assignments: 
WARNING: [SCHED 204-69]     'load' operation ('mem_load', bottleneck.c:62) on array 
'mem',
WARNING: [SCHED 204-69] The resource limit of core:RAM:mem:p1 is 1, current 
assignments: 
WARNING: [SCHED 204-69]     'load' operation ('mem_load_1', bottleneck.c:62) on array 
'mem',
INFO: [SCHED 204-61] Pipelining result: Target II: 1, Final II: 2, Depth: 3.

In this example, Vitis HLS reports that it cannot reach the specified initiation interval (II) of 1 because it cannot schedule the load (read) operation mem_load_2 on the array mem: the memory does not have enough ports. The message shows that the first port (core:RAM:mem:p0) has a resource limit of 1 and is already used by the operation mem_load on line 62 of bottleneck.c. The second port of the block RAM (core:RAM:mem:p1) also has a resource limit of 1 and is already used by the operation mem_load_1. Because both ports are occupied, the third load cannot be scheduled in the same cycle, and Vitis HLS reports a final II of 2 instead of the desired 1.

This issue is typically caused by arrays. Arrays are implemented as block RAM, which has at most two data ports. This can limit the throughput of a read/write (or load/store) intensive algorithm. The bandwidth can be improved by splitting the array (a single block RAM resource) into multiple smaller arrays (multiple block RAMs), which effectively increases the number of ports.
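As an illustration, a loop of roughly the following shape would lead to this kind of message: each iteration needs three reads from mem, but a dual-port block RAM can serve only two per clock cycle. This is a sketch, not the actual contents of bottleneck.c; the function name sum_window, the size N, and the int element type are assumptions made for the example.

```c
#define N 128  /* assumed array size, for illustration only */

/* Hypothetical top-level function: sums a sliding window of three elements. */
int sum_window(const int mem[N]) {
    int sum = 0;
    int i;
SUM_LOOP:
    for (i = 2; i < N; ++i) {
#pragma HLS PIPELINE II=1
        /* Three loads from mem per iteration. A dual-port RAM allows only
           two accesses per cycle, so the requested II of 1 cannot be met. */
        sum += mem[i] + mem[i - 1] + mem[i - 2];
    }
    return sum;
}
```

A standard C compiler ignores the HLS pragma, so the function can still be checked in plain C simulation.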

Arrays are partitioned using the ARRAY_PARTITION directive. Vitis HLS provides three types of array partitioning, as shown in the following figure:

block
The original array is split into equally sized blocks of consecutive elements of the original array.
cyclic
The original array is split into equally sized blocks by interleaving the elements of the original array, so that consecutive elements are placed in different blocks in turn.
complete
The default operation is to split the array into its individual elements. This corresponds to resolving a memory into registers.
Figure 1. Array Partitioning
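
The three styles can also be described as a mapping from an index in the original array to a smaller array (called a bank here) and an offset inside it. The following sketch models that mapping in plain C. The type part_loc and the helper names are hypothetical and are not part of any Vitis HLS API. The block mapping assumes the smaller arrays are filled in order, so that only the last one can come up short.

```c
/* Hypothetical helper type: where an element of the original array lands. */
typedef struct {
    int bank;    /* which of the smaller arrays holds the element */
    int offset;  /* index of the element inside that smaller array */
} part_loc;

/* block: runs of consecutive elements stay together in one bank. */
part_loc block_loc(int i, int n, int factor) {
    int chunk = (n + factor - 1) / factor;  /* elements per bank, rounded up */
    part_loc loc = { i / chunk, i % chunk };
    return loc;
}

/* cyclic: elements are dealt out to the banks in turn. */
part_loc cyclic_loc(int i, int factor) {
    part_loc loc = { i % factor, i / factor };
    return loc;
}

/* complete: every element becomes its own single-element bank (a register). */
part_loc complete_loc(int i) {
    part_loc loc = { i, 0 };
    return loc;
}
```

For an 8-element array split into two banks, the block mapping places elements 0 to 3 in bank 0 and elements 4 to 7 in bank 1, while the cyclic mapping places the even-indexed elements in bank 0 and the odd-indexed elements in bank 1.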

For block and cyclic partitioning, the factor option specifies the number of arrays that are created. In the preceding figure, a factor of 2 is used; that is, the array is divided into two smaller arrays. If the number of elements in the array is not an integer multiple of the factor, the final array has fewer elements.
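
As a sketch of how the directive might be applied in source code, assume a loop like the SUM_LOOP named in the message above reads three neighboring elements of mem per iteration (an assumption; the source of bottleneck.c is not shown here). A cyclic partition with a factor of 2 places mem[i] and mem[i-2] in one smaller array and mem[i-1] in the other, so no smaller array sees more than two reads per cycle. The function name, N, and the element type are hypothetical, and the exact pragma option spelling should be checked against the pragma reference for the release in use.

```c
#define N 128  /* assumed array size, for illustration only */

/* Hypothetical top-level function with the array split into two banks. */
int sum_window_partitioned(const int mem[N]) {
    /* Assumed pragma form: cyclic partitioning of mem into 2 arrays. */
#pragma HLS ARRAY_PARTITION variable=mem cyclic factor=2
    int sum = 0;
    int i;
SUM_LOOP:
    for (i = 2; i < N; ++i) {
        /* mem[i] and mem[i-2] share one bank; mem[i-1] is in the other.
           Each bank needs at most two reads per iteration. */
        sum += mem[i] + mem[i - 1] + mem[i - 2];
    }
    return sum;
}
```

The partitioning changes only how the array is implemented in hardware; the C behavior of the function is unchanged.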

When partitioning multi-dimensional arrays, the dimension option specifies which dimension is partitioned. The figure below shows how the dimension option partitions the array in this example code:

void foo (...) {
  int my_array[10][6][4];
  ...
}

The examples in the figure demonstrate how partitioning dimension 3 results in 4 separate arrays and partitioning dimension 1 results in 10 separate arrays. If zero is specified as the dimension, all dimensions are partitioned.
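
A minimal sketch of the dimension option in source form follows. The function fill_and_sum and its loop bodies are hypothetical and exist only to give the array some use; the pragma line assumes complete partitioning of dimension 3, which matches the case of 4 separate arrays described above.

```c
/* Hypothetical function; only the array declaration comes from the text. */
int fill_and_sum(void) {
    int my_array[10][6][4];
    /* dim=3 splits the last dimension: 4 arrays, each of size 10 x 6.
       With dim=1 the result would be 10 arrays, each of size 6 x 4.
       With dim=0 all dimensions would be partitioned. */
#pragma HLS ARRAY_PARTITION variable=my_array complete dim=3
    int i, j, k;
    int sum = 0;

    for (i = 0; i < 10; ++i)
        for (j = 0; j < 6; ++j)
            for (k = 0; k < 4; ++k)
                my_array[i][j][k] = k;

    for (i = 0; i < 10; ++i)
        for (j = 0; j < 6; ++j) {
            /* After partitioning dimension 3, these four reads target
               four different arrays and can proceed in the same cycle. */
            sum += my_array[i][j][0] + my_array[i][j][1]
                 + my_array[i][j][2] + my_array[i][j][3];
        }
    return sum;
}
```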

Figure 2. Partitioning Array Dimensions