The Vitis HLS tool synthesizes a C or C++ function into RTL code for implementation in the programmable logic (PL) region of a Versal ACAP, Zynq MPSoC, or Xilinx FPGA device. Vitis HLS is tightly integrated with both the Vivado Design Suite for synthesis, place, and route, and the Vitis core development kit for heterogeneous system-level design and application acceleration.
Vitis HLS can be used to develop and export:
- Vivado IP to be integrated into hardware designs using the Vivado Design Suite
- Vitis kernels for use in the Vitis application acceleration development flow
In the Vitis application acceleration flow, the Vitis HLS tool automates much of the code modifications required to implement and optimize the C/C++ code in programmable logic and to achieve low latency and high throughput. The inference of required pragmas to produce the right interface for your function arguments and to pipeline loops and functions within your code is the foundation of Vitis HLS.
In the Vivado IP flow, Vitis HLS also supports customization of your code to implement broader interface standards to achieve your design objectives. The RTL generated can be used as an IP directly within the Vivado tool or Model composer.
Here are the steps for the development of the C++ function.
- Architect the algorithm based on the Design Principles
- (C-Simulation) Verify the C/C++ Code with the C/C++ testbench
- (C-Synthesis) Generate the RTL using HLS
- (Co-Simulation) Verify the kernel generated with C++ outputs
- (Analyze) Review the HLS synthesis reports and co-simulation reports, analyze
- Re-run previous steps until performance goals are met.
Vitis HLS implements the solution based on the target flow, default tool configuration, design constraints, and any optimization pragmas or directives you specify. You can use optimization directives to modify and control the implementation of the internal logic and I/O ports, overriding the default behaviors of the tool.
Vitis HLS generates Vivado IP or Vitis kernel based on the target flow, default tool configuration, design constraints, and any optimization pragmas or directives you specify. You can use optimization directives to modify and control the implementation of the internal logic and I/O ports, overriding the default behaviors of the tool.
Here are some key areas related to coding and synthesizing the C++ functions in your HLS design with details covered in forthcoming sections:
- Hardware Interfaces
- The arguments of the top-level function in a Vitis HLS design are synthesized into interfaces and ports that group multiple signals to define the communication protocol between the HLS design and components external to the design. Vitis HLS defines interfaces automatically, using industry standards to specify the protocol used. The default interface protocols differ based on whether the HLS design is targeting for Vivado IP generation or the Vitis kernel. The default assignments of the interfaces can be overridden by using the INTERFACE pragma or directive.
- Controlling the Execution of HLS Design
- The execution mode of an HLS design is specified by the block-level control protocol. The HLS design can have control signals to start/stop the execution or it can be only driven when the data is available. As a designer, you do need to be aware of how your HLS design will be executed, as described in Execution Modes of HLS Designs
- Task-Level Parallelism
- To achieve high performance on the generated hardware, the HLS tool must
infer parallelism from sequential code and exploit it to achieve greater
performance. The Design Principles section introduces
the three main paradigms that need to be understood for writing good
software for FPGA platforms. Vitis HLS tool offers multiple types of task-level parallelism (TLP), either by
specifying the DATAFLOW pragma or explicitly creating parallelism using
hls::taskobject as described in Abstract Parallel Programming Model for HLS
- To achieve high performance on the generated hardware, the HLS tool must infer parallelism from sequential code and exploit it to achieve greater performance. The Design Principles section introduces the three main paradigms that need to be understood for writing good software for FPGA platforms. Vitis HLS tool offers multiple types of task-level parallelism (TLP), either by specifying the DATAFLOW pragma or explicitly creating parallelism using
- Memory Architecture
- The memory architecture is fixed in the CPU but the developer can create their own architecture to optimize the memory accesses for running applications on FPGA
- In C++ program, the arrays are fundamental data structures used to save or move the data around. In hardware, these arrays are implemented as memory or registers after synthesis. The memory can be implemented as local storage or global memory which is often DDR or HBM memory banks. Access to global memory has higher latency costs and can take many cycles while access to local memory is often quick and only takes one or more cycles.
- Often the memory is allocated/deallocated dynamically in a C++ program but this can't be synthesized in hardware. So the designer needs to be aware of the exact amount of memory required for the algorithm.
- The memory accesses should be optimized to reduce the overhead of global memory accesses. The redundant accesses, which means maximizing the use of consecutive accesses so that bursting can be inferred. The burst access hides the memory access latency and improves the memory bandwidth.
- Micro Level Optimization
- In C++ programs, there is a frequent need to implement repetitive algorithms that process blocks of data — for example, signal or image processing. Typically, the C/C++ source code tends to include several loops or several nested loops. Vitis HLS can unroll, or pipeline a loop or nested loops by inserting pragmas at appropriate levels in the source code. For more information, refer to the Loops Primer
- Once the algorithm is architected based on the design principles, inferring parallelism, you still need the right combination of micro-level HLS pragmas like PIPELINE, UNROLL, ARRAY_PARTITION, etc. These pragmas may not be intuitive to new users. Vitis HLS offers the PERFORMANCE pragma as a way to specify top-level performance goals for a given body of loop or nested loops. The tool will automatically infer the necessary lower-level pragmas to meet the goal. With PERFORMANCE pragma, fewer pragmas are needed to achieve good QoR and is an intuitive way to drive the tool.