Compiling for Customized Accelerator - 3.5 English

Vitis AI User Guide (UG1414)

Document ID: UG1414
Release Date: 2023-09-28
Version: 3.5 English

The XIR-based compiler works in the context of a framework-independent XIR graph generated from deep learning frameworks. The parser removes framework-specific attributes in the CNN models and transforms the models into XIR-based computing graphs. The compiler divides the computing graph into different subgraphs, leverages heterogeneous optimizations, and generates optimized machine code for subgraphs.

Figure 1. Compilation Flow

When the model contains operations that the DPU cannot support, the compiler creates subgraphs for those operations and maps them to the CPU. With an FPGA, you can instead create a dedicated IP to accelerate those operations and improve end-to-end performance. To enable customized accelerator IPs in the XIR-based toolchain, use the plugin mechanism to extend XIR and the compiler.

The interface class Plugin is declared in Plugin.hpp. Plugins are executed sequentially before the compiler starts to compile the graph for the DPU. First, a child subgraph is created for each operator, and the plugin picks the operators to accelerate. The plugin then merges them into larger subgraphs, maps them to the customized IP, and attaches the information the runtime (VART::Runner) needs, such as the instructions for the subgraphs.

Implementing a Plugin

  1. Implement Plugin::partition()

    In std::set<xir::Subgraph*> partition(xir::Graph* graph), pick the desired operations and merge them into device-level subgraphs using the following helper functions.

    • xir::Subgraph* filter_by_name(xir::Graph* graph, const std::string& name) returns the subgraph with a specific name.
    • std::set<xir::Subgraph*> filter_by_type(xir::Graph* graph, const std::string& type) returns subgraphs with a specific type.
    • std::set<xir::Subgraph*> filter_by_template(xir::Graph* graph, xir::GraphTemplate* temp) returns subgraphs with a specific structure.
      Figure 2. Filter by Templates
    • std::set<xir::Subgraph*> filter(xir::Graph* graph, std::function<std::set<xir::Subgraph*>(std::set<xir::Subgraph*>)> func) filters the subgraphs with a customized function. This helper is useful for finding all uncompiled subgraphs.

    To merge the child subgraphs, use the merge_subgraph() helper function. This function can only merge subgraphs at the same level. If the subgraphs in the list cannot all be merged into one subgraph, the helper function merges as many of them as possible.

  2. Specify the name, device, and runner for the subgraphs you picked in the Plugin::partition() function.
  3. Implement Plugin::compile(xir::Subgraph*). This function is called for all the subgraphs the partition() function returns. You can attach information on subgraphs for runtime.
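The three steps above can be sketched without the XIR headers. In the following minimal example, MockGraph, MockSubgraph, MockPlugin, mock_filter_by_type(), mock_merge_subgraph(), and run_plugin() are stand-ins invented for illustration, and the operator type, device name, and runner library name are hypothetical. It only mirrors the shape of the flow (pick, merge, label, compile) and should not be read as the XIR API.

```cpp
// Sketch only: every type and helper here is a stand-in, NOT the XIR API.
#include <cassert>
#include <map>
#include <memory>
#include <set>
#include <string>
#include <utility>
#include <vector>

struct MockSubgraph {
  std::string name;
  std::string op_type;                       // set only on per-operator children
  std::string device;                        // empty until a plugin maps it
  std::string runner;                        // runner the runtime should load
  std::vector<std::string> merged_ops;       // names of the operators merged in
  std::map<std::string, std::string> attrs;  // extra information for runtime
};

struct MockGraph {
  std::vector<std::unique_ptr<MockSubgraph>> children;

  // Adds one child subgraph for one operator.
  MockSubgraph* add_operator(const std::string& name, const std::string& type) {
    auto child = std::make_unique<MockSubgraph>();
    child->name = name;
    child->op_type = type;
    children.push_back(std::move(child));
    return children.back().get();
  }
};

// Stand-in for filter_by_type(): children whose operator has the given type.
std::set<MockSubgraph*> mock_filter_by_type(MockGraph* graph, const std::string& type) {
  std::set<MockSubgraph*> picked;
  for (const auto& child : graph->children) {
    if (child->op_type == type) picked.insert(child.get());
  }
  return picked;
}

// Stand-in for merge_subgraph(): replaces the picked children with one subgraph.
// Simplification: it always merges everything and ignores graph connectivity.
MockSubgraph* mock_merge_subgraph(MockGraph* graph, const std::set<MockSubgraph*>& picked) {
  auto merged = std::make_unique<MockSubgraph>();
  std::vector<std::unique_ptr<MockSubgraph>> kept;
  for (auto& child : graph->children) {
    if (picked.count(child.get()) != 0) {
      merged->merged_ops.push_back(child->name);
    } else {
      kept.push_back(std::move(child));
    }
  }
  MockSubgraph* result = merged.get();
  kept.push_back(std::move(merged));
  graph->children = std::move(kept);  // the picked children are destroyed here
  return result;
}

class MockPlugin {
 public:
  virtual ~MockPlugin() = default;
  virtual std::set<MockSubgraph*> partition(MockGraph* graph) = 0;
  virtual void compile(MockSubgraph* subgraph) = 0;
};

class SoftmaxPlugin : public MockPlugin {
 public:
  std::set<MockSubgraph*> partition(MockGraph* graph) override {
    assert(graph != nullptr);
    // Step 1: pick the operators to accelerate and merge them.
    auto picked = mock_filter_by_type(graph, "softmax");
    if (picked.empty()) return {};
    MockSubgraph* merged = mock_merge_subgraph(graph, picked);
    // Step 2: set the name, device, and runner (all hypothetical values).
    merged->name = "softmax_ip_subgraph";
    merged->device = "SOFTMAX_IP";
    merged->runner = "libmy_softmax_runner.so";
    return {merged};
  }

  // Step 3: attach whatever the runtime needs.
  void compile(MockSubgraph* subgraph) override {
    subgraph->attrs["instructions"] = "<placeholder machine code>";
  }
};

// Mimics the call order described above: partition(), then compile() per result.
std::set<MockSubgraph*> run_plugin(MockPlugin& plugin, MockGraph* graph) {
  auto subgraphs = plugin.partition(graph);
  for (MockSubgraph* subgraph : subgraphs) plugin.compile(subgraph);
  return subgraphs;
}
```

In a real plugin, the mock helpers would be replaced by the helper functions listed in step 1, and the name, device, and runner would be set through whatever interface xir::Subgraph provides for them.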

Building the Plugin

Create an extern "C" get_plugin() function that returns a new instance of your plugin, and build the implementations into a shared library.

extern "C" Plugin* get_plugin() { return new YOURPLUGIN(); }
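To show why the factory function uses C linkage, here is a self-contained sketch. The Plugin class below is a stand-in for the interface in Plugin.hpp, and name(), MyPlugin, PluginFactory, and instantiate() are hypothetical names added for this example. How the compiler actually manages the returned object is not specified here; taking ownership with std::unique_ptr is an assumption.

```cpp
// Sketch only: this Plugin class is a stand-in for the interface in Plugin.hpp.
// name() is a hypothetical method so that the example has something to call.
#include <cassert>
#include <memory>
#include <string>

class Plugin {
 public:
  virtual ~Plugin() = default;
  virtual std::string name() const = 0;
};

class MyPlugin : public Plugin {
 public:
  std::string name() const override { return "my_plugin"; }
};

extern "C" {
// C linkage keeps the symbol name unmangled, so a host can look it up by the
// plain string "get_plugin" after opening the shared library.
Plugin* get_plugin() { return new MyPlugin(); }

// Pointer type matching the factory above.
typedef Plugin* (*PluginFactory)();
}

// What a host might do once it has resolved the symbol: call the factory and
// take ownership of the returned object (ownership model is an assumption).
std::unique_ptr<Plugin> instantiate(PluginFactory factory) {
  assert(factory != nullptr);  // the symbol lookup must have succeeded
  return std::unique_ptr<Plugin>(factory());
}
```

With GCC or Clang, a file like this is typically built into a shared library using the -shared and -fPIC options.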

Using the Plugin

Pass your plugin libraries to the compiler with --options '{"plugins": "plugin0,plugin1"}' on the vai_c command line. To run a plugin, the compiler opens the library and creates an instance of your plugin through the extern function named get_plugin. If more than one plugin is specified, they are executed sequentially in the order given on the command line. Compilation for the DPU and CPU is executed after all the plugins have been executed.
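The ordering rule can be illustrated with a small sketch. The functions split_plugin_list() and compile_with_plugins() are hypothetical names for this example and do not describe how vai_c is implemented. Skipping empty entries is also an assumption.

```cpp
// Sketch only: illustrates the documented ordering, not the vai_c internals.
#include <cassert>
#include <functional>
#include <sstream>
#include <string>
#include <vector>

// Splits the value of the "plugins" option, for example "plugin0,plugin1",
// and keeps the entries in the order they were written.
std::vector<std::string> split_plugin_list(const std::string& value) {
  std::vector<std::string> names;
  std::istringstream stream(value);
  std::string name;
  while (std::getline(stream, name, ',')) {
    if (!name.empty()) names.push_back(name);  // assumption: skip empty entries
  }
  return names;
}

// Runs every plugin in list order, then the remaining DPU and CPU compilation.
void compile_with_plugins(const std::string& plugins_value,
                          const std::function<void(const std::string&)>& run_plugin,
                          const std::function<void()>& compile_dpu_and_cpu) {
  assert(run_plugin && compile_dpu_and_cpu);  // both callbacks are required
  for (const std::string& name : split_plugin_list(plugins_value)) {
    run_plugin(name);
  }
  compile_dpu_and_cpu();
}
```

Because plugins run in list order, a plugin listed later only sees the subgraphs that earlier plugins left unclaimed, so the order on the command line can change the result.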