Compiling for Customized Accelerator

Vitis AI User Guide (UG1414)

Document ID: UG1414
Release Date: 2021-02-03
Version: 1.3 English

The XIR-based compiler works on a framework-independent XIR graph generated from deep learning frameworks. The parser removes the framework-specific attributes from the CNN models and transforms the models into XIR-based computing graphs. The compiler divides each computing graph into subgraphs, applies heterogeneous optimizations, and generates optimized machine code for each subgraph.

Figure 1. Compilation Flow

When the model contains ops that the DPU cannot support, some subgraphs are created and mapped to the CPU. On an FPGA, you can instead create a dedicated IP to accelerate those ops and get better end-to-end performance. To enable customized accelerating IPs with the XIR-based toolchain, use the plugin pipeline to extend XIR and the compiler.

The interface class Plugin is declared in Plugin.hpp. Plugins are executed sequentially before the compiler starts to compile the graph for the DPU. Initially, a child subgraph is created for each operator, and the plugin picks the operators that it can accelerate. It merges them into larger subgraphs, maps them to the customized IP, and attaches the information that the runtime (vart::Runner) needs, such as the instructions, to the subgraphs.

Implementing a Plugin

Implement Plugin::partition()
In std::set<xir::Subgraph*> partition(xir::Graph* graph), pick the desired ops and merge them into device-level subgraphs. You can use the following helper functions:
- xir::Subgraph* filter_by_name(xir::Graph* graph, const std::string& name) returns the subgraph with a specific name.
- std::set<xir::Subgraph*> filter_by_type(xir::Graph* graph, const std::string& type) returns subgraphs with a specific type.
- std::set<xir::Subgraph*> filter_by_template(xir::Graph* graph, xir::GraphTemplate* temp) returns subgraphs with a specific structure.
  Figure 2. Filter by Templates
- std::set<xir::Subgraph*> filter(xir::Graph* graph, std::function<std::set<xir::Subgraph*>(std::set<xir::Subgraph*>)> func) allows you to filter the subgraphs with a custom function. You can use it, for instance, to find all uncompiled subgraphs.
To merge the child subgraphs that you get, use the helper function merge_subgraph(). This function can only merge subgraphs at the same level. If the list of subgraphs cannot be merged into a single subgraph, the helper function merges them as far as possible.
Specify the name, device, and runner for the subgraphs you picked in the Plugin::partition() function.
Implement Plugin::compile(xir::Subgraph*). This function is called for every subgraph returned by the partition() function. Here you can perform any custom processing and attach to the subgraph the information that the runtime needs.
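
The steps above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the real XIR API: everything in namespace mock (Graph, Subgraph, Plugin, filter_by_type, merge_subgraph, run_plugin) is a simplified stand-in invented for this example, and the op type "softmax", the device name, the runner library name, and the attribute keys are all assumptions. In a real plugin you would include the XIR and plugin headers instead and use the helper signatures they declare.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <set>
#include <string>
#include <vector>

namespace mock {  // hypothetical stand-ins, NOT the real xir:: classes

// Stand-in for xir::Subgraph: a named group of ops plus string attributes.
struct Subgraph {
  std::string name;
  std::vector<std::string> op_types;         // op types inside this subgraph
  std::map<std::string, std::string> attrs;  // e.g. "device", "runner"
};

// Stand-in for xir::Graph: starts with one child subgraph per operator.
struct Graph {
  std::vector<std::unique_ptr<Subgraph>> children;
  Subgraph* add_op(const std::string& name, const std::string& type) {
    children.push_back(std::make_unique<Subgraph>(Subgraph{name, {type}, {}}));
    return children.back().get();
  }
};

// Simplified filter_by_type: single-op children whose op has the given type.
std::set<Subgraph*> filter_by_type(Graph* graph, const std::string& type) {
  std::set<Subgraph*> picked;
  for (auto& child : graph->children) {
    if (child->op_types.size() == 1 && child->op_types.front() == type) {
      picked.insert(child.get());
    }
  }
  return picked;
}

// Simplified merge_subgraph: replaces the given children with one merged child.
// The mock ignores connectivity and always produces a single subgraph.
Subgraph* merge_subgraph(Graph* graph, const std::set<Subgraph*>& parts) {
  auto merged = std::make_unique<Subgraph>();
  auto& kids = graph->children;
  for (auto it = kids.begin(); it != kids.end();) {
    if (parts.count(it->get()) != 0) {
      merged->op_types.insert(merged->op_types.end(),
                              (*it)->op_types.begin(), (*it)->op_types.end());
      it = kids.erase(it);
    } else {
      ++it;
    }
  }
  kids.push_back(std::move(merged));
  return kids.back().get();
}

// Stand-in for the Plugin interface; the void return type is an assumption.
struct Plugin {
  virtual ~Plugin() = default;
  virtual std::set<Subgraph*> partition(Graph* graph) = 0;
  virtual void compile(Subgraph* subgraph) = 0;
};

class SoftmaxPlugin : public Plugin {
 public:
  std::set<Subgraph*> partition(Graph* graph) override {
    auto picked = filter_by_type(graph, "softmax");  // step 1: pick the ops
    if (picked.empty()) return {};
    Subgraph* merged = merge_subgraph(graph, picked);
    merged->name = "softmax_ip_subgraph";            // step 2: name, device, runner
    merged->attrs["device"] = "SOFTMAX_IP";
    merged->attrs["runner"] = "libsoftmax_runner.so";
    return {merged};
  }
  void compile(Subgraph* subgraph) override {
    // step 3: attach whatever the runtime needs; a placeholder string here
    subgraph->attrs["instructions"] = "placeholder";
  }
};

// Mimics the order described above: partition first, then compile each result.
std::set<Subgraph*> run_plugin(Plugin& plugin, Graph* graph) {
  auto picked = plugin.partition(graph);
  for (Subgraph* s : picked) plugin.compile(s);
  return picked;
}

}  // namespace mock
```

To try the sketch, build a mock::Graph with add_op(), construct a mock::SoftmaxPlugin, and pass both to mock::run_plugin(); the returned set holds the merged subgraph with its device, runner, and instructions attributes set.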

Building the Plugin

Create an extern "C" get_plugin() function and build the implementation into a shared library.

extern "C" Plugin* get_plugin() { return new YOURPLUGIN(); }
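
As a rough illustration of what such a source file might look like, the sketch below uses a hypothetical stand-in for the Plugin interface (the real one comes from Plugin.hpp) and a hypothetical MyPlugin class with an invented name() method. The extern "C" linkage keeps the symbol name unmangled so that it can be looked up by the plain string get_plugin. The build command in the comment assumes a GCC toolchain and hypothetical file names.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for the interface declared in Plugin.hpp.
struct Plugin {
  virtual ~Plugin() = default;
  virtual std::string name() const = 0;  // invented for this sketch
};

// Hypothetical plugin implementation.
struct MyPlugin : Plugin {
  std::string name() const override { return "my_plugin"; }
};

// Factory with C linkage: exported under the unmangled name "get_plugin".
// The caller takes ownership of the returned object.
extern "C" Plugin* get_plugin() { return new MyPlugin(); }

// Possible build command (assumption, GCC, hypothetical file names):
//   g++ -std=c++17 -shared -fPIC my_plugin.cpp -o libplugin0.so
```

A caller that receives the raw pointer can wrap it in std::unique_ptr<Plugin> so that the plugin object is released when it is no longer needed.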

Using the Plugin

Use --options '{"plugin": "libplugin0.so,libplugin1.so"}' on the vai_c command line to pass your plugin libraries to the compiler. When it executes your plugin, the compiler opens the library and creates an instance of your plugin by calling the extern function named get_plugin. If more than one plugin is specified, the plugins are executed sequentially in the order given on the command line. Compilation for the DPU and CPU is performed after all plugins have been executed.
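
The loading step described above can be pictured with the following sketch. It is not the compiler's actual implementation: it only shows, under the assumption of a POSIX system, how a host program could split the comma-separated plugin list in order and obtain a plugin instance through dlopen() and dlsym(). The function names split_plugin_list and load_plugin are hypothetical, and Plugin is left as an opaque type.

```cpp
#include <dlfcn.h>  // dlopen, dlsym, dlclose (POSIX)

#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Split "libplugin0.so,libplugin1.so" into library names, preserving order.
std::vector<std::string> split_plugin_list(const std::string& csv) {
  std::vector<std::string> libs;
  std::stringstream ss(csv);
  std::string item;
  while (std::getline(ss, item, ',')) {
    if (!item.empty()) libs.push_back(item);
  }
  return libs;
}

struct Plugin;                      // opaque stand-in for the Plugin interface
using GetPluginFn = Plugin* (*)();  // signature of the exported factory

// Open one library and call its get_plugin(); returns nullptr on any failure.
Plugin* load_plugin(const std::string& lib_path) {
  void* handle = dlopen(lib_path.c_str(), RTLD_NOW | RTLD_LOCAL);
  if (handle == nullptr) return nullptr;  // dlerror() would give the reason
  void* sym = dlsym(handle, "get_plugin");
  if (sym == nullptr) {
    dlclose(handle);
    return nullptr;
  }
  auto factory = reinterpret_cast<GetPluginFn>(sym);
  // The handle is deliberately kept open: the plugin's code lives in the library.
  return factory();
}
```

A host would call split_plugin_list() on the option value and then load_plugin() on each entry in turn, which matches the sequential, command-line order described above. On older glibc versions, linking may additionally require -ldl.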