HLS Task Library - 2023.2 English

Vitis High-Level Synthesis User Guide (UG1399)

Document ID
UG1399
Release Date
2023-12-18
Version
2023.2 English
Important: To use hls::task objects in your code, include the header file hls_task.h.

The hls::task library provides a simple way of modeling data-driven processes. It allows static instantiation of tasks with only streaming I/O (hls::stream or hls::stream_of_blocks), eliminating the empty-stream checks that would otherwise be needed to model concurrent processes in C++.

Tip: hls::task also supports stable inputs for scalar values on s_axilite interfaces, or pointers to arrays on m_axi interfaces as described in Un-synchronized I/O in Data-Driven TLP.

The following is a simple example that can be found at simple_data_driven on GitHub:

void odds_and_evens(hls::stream<int> &in, hls::stream<int> &out1, hls::stream<int> &out2) {
    hls_thread_local hls::stream<int> s1; // channel connecting t1 and t2      
    hls_thread_local hls::stream<int> s2; // channel connecting t1 and t3

    // t1 infinitely runs splitter, with input in and outputs s1 and s2
    hls_thread_local hls::task t1(splitter, in, s1, s2);
    // t2 infinitely runs function odds, with input s1 and output out1    
    hls_thread_local hls::task t2(odds, s1, out1);       
    // t3 infinitely runs function evens, with input s2 and output out2
    hls_thread_local hls::task t3(evens, s2, out2);
}
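The leaf functions splitter, odds, and evens are not shown above. The following is a hypothetical sketch of what they might look like, written as a plain-C++ behavioral model with a minimal stand-in for hls::stream<int> so the dataflow logic can be checked outside the Vitis HLS toolchain. The routing-by-parity behavior is an assumption based on the function names; in real HLS code each task body would read and write hls::stream objects and be invoked repeatedly by its hls::task.

```cpp
#include <queue>

// Minimal behavioral stand-in for hls::stream<int>: a simple FIFO.
// (Only for off-tool modeling; not part of the HLS library.)
struct Stream {
    std::queue<int> q;
    void write(int v) { q.push(v); }
    int read() { int v = q.front(); q.pop(); return v; }
    bool empty() const { return q.empty(); }
};

// splitter: route each input value to s1 (odd) or s2 (even).
// An hls::task would run this body forever, one transaction at a
// time; here we drain whatever data is available.
void splitter(Stream &in, Stream &s1, Stream &s2) {
    while (!in.empty()) {
        int v = in.read();
        if (v % 2 != 0) s1.write(v);
        else            s2.write(v);
    }
}

// odds: forward odd values downstream (a real design could
// transform them here).
void odds(Stream &s1, Stream &out1) {
    while (!s1.empty()) out1.write(s1.read());
}

// evens: forward even values downstream.
void evens(Stream &s2, Stream &out2) {
    while (!s2.empty()) out2.write(s2.read());
}
```

Because each function only consumes its input stream and produces its output stream, the three processes can run concurrently once mapped to hls::task objects, with s1 and s2 acting as the inter-task channels.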

Notice that the top-level function, odds_and_evens, uses only streaming inputs and outputs; this is a purely streaming kernel. The top-level function includes the following:

  • s1 and s2 are thread-local streams (hls_thread_local) used as channels between the tasks: s1 connects t1 to t2, and s2 connects t1 to t3. These streams must be thread-local so that they are kept alive across calls to the top-level function.
  • t1, t2, and t3 are thread-local hls::task objects that execute the functions splitter, odds, and evens, respectively. The tasks run infinitely and process data as it arrives on their input streams. No synchronization is needed.

However, this type of stream-only model does have some restrictions, such as:

  • You cannot access non-local memory
  • Non-stream data, such as scalars, arrays, and pointers, can be passed in as arguments, provided these ports are declared stable via the STABLE pragma. Currently, top-level pointers with an m_axi interface can be passed only with the offset=off option
    Important: Scalars, arrays, and pointers are read at unknown intervals because there is no synchronization with hls::task. The code must therefore be written so that it does not depend on when these ports are read.
  • You must explicitly describe the parallelism in the design by the specification of parallel tasks
  • An hls::task must always be instantiated in a parallel context and cannot be nested in a sequential context
    • If you use hls::task in your top function, then the top function becomes a parallel context
    • You can instantiate an hls::task inside a dataflow region as it is a parallel context
    • In a sequential context, if you call a function that instantiates a task:
      • The function must be a parallel context such as a dataflow region
      • The task inputs and output streams must be produced and consumed by regular dataflow processes in the parallel context
    • Control-driven TLP can be at top, inside another control-driven, or inside a sequential region
    • Data-driven can be at top, inside another data-driven, or nested between control-driven tasks
    • Control-driven TLP cannot be inside a pipeline or directly inside a data-driven region. In the latter case, it must be inside a sequential region executed by a data-driven task
    • Data-driven TLP cannot be directly inside a sequential region, a pipeline, or a control-driven region
    • A sequential region or a pipeline can only be inside the body of a control-driven or data-driven task (not directly inside a control-driven or data-driven region)

The hls::task objects can be mixed freely with standard dataflow-style function calls, which can move data in and out of memories (DRAM and BRAM). Tasks also support splitting channels (hls::split) for one-to-many data distribution, to build pools of workers that process streams of data, and merging channels (hls::merge) for many-to-one data aggregation.
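To illustrate the one-to-many/many-to-one pattern, the following is a plain-C++ behavioral model of a two-worker pool: a round-robin split distributes input across two worker lanes and a round-robin merge collects the results in order. The function name, the lane count, and the doubling transform are illustrative assumptions, not part of the guide; in real HLS code the lanes would be hls::split and hls::merge channels connecting hls::task workers.

```cpp
#include <queue>
#include <vector>

// Behavioral model of a worker pool fed by a round-robin split
// channel and drained by a round-robin merge channel.
std::vector<int> worker_pool(const std::vector<int> &input) {
    std::queue<int> lane[2];  // models the two output ports of a round-robin split
    // Split: distribute the input round-robin across the two lanes.
    for (std::size_t i = 0; i < input.size(); ++i)
        lane[i % 2].push(input[i]);

    std::queue<int> done[2];  // per-worker result streams
    // Workers: each drains its own lane (illustrative transform: double).
    for (int w = 0; w < 2; ++w) {
        while (!lane[w].empty()) {
            done[w].push(lane[w].front() * 2);
            lane[w].pop();
        }
    }

    // Merge: collect results round-robin, restoring the input order.
    std::vector<int> out;
    for (std::size_t i = 0; i < input.size(); ++i) {
        out.push_back(done[i % 2].front());
        done[i % 2].pop();
    }
    return out;
}
```

Because the merge reads lanes in the same round-robin order as the split wrote them, results come out in the original input order even though the two workers run independently.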