The XIR-based compiler works on a framework-independent XIR graph generated from deep learning frameworks. The parser removes the framework-specific attributes from the CNN models and transforms the models into XIR-based computing graphs. The compiler divides the computing graph into subgraphs, applies heterogeneous optimizations, and generates optimized machine code for each subgraph.
When the model contains ops that the DPU cannot support, subgraphs holding those ops are created and mapped to the CPU. Because the FPGA is programmable, you can instead create a specific IP to accelerate those ops and get better end-to-end performance. To enable customized accelerating IPs with the XIR-based toolchain, use a pipeline named plugin to extend the XIR and the compiler.
In Plugin.hpp, the interface class Plugin is declared. Plugins are executed sequentially before the compiler starts to compile the graph for the DPU. At the start, a child subgraph is created for each operator. The plugin picks the operators that it can accelerate, merges them into larger subgraphs, maps them to the customized IP, and attaches to the subgraphs the information that the runtime (VART::Runner) needs, such as the instructions.
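The flow described above can be sketched with plain stand-in types. This is a minimal sketch under stated assumptions, not the XIR API: `Op`, `ChildSubgraph`, `make_children`, and `pick_and_merge` are hypothetical names, and the merge step ignores graph connectivity, which a real plugin has to respect.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Stand-in types for illustration only; these are NOT the xir:: classes.
struct Op {
  std::string name;
  std::string type;
};

struct ChildSubgraph {
  std::vector<Op> ops;  // operators covered by this subgraph
  std::string device;   // empty until a plugin or the compiler claims it
};

// Step 1: start with one child subgraph per operator.
std::vector<ChildSubgraph> make_children(const std::vector<Op>& ops) {
  std::vector<ChildSubgraph> children;
  for (const Op& op : ops) children.push_back(ChildSubgraph{{op}, ""});
  return children;
}

// Steps 2 and 3: pick the unclaimed children whose operator type the custom
// IP can accelerate, merge them into one larger subgraph, and map that
// subgraph to `device`. Unpicked children are returned unchanged, in their
// original order, followed by the merged subgraph (if any).
std::vector<ChildSubgraph> pick_and_merge(
    const std::vector<ChildSubgraph>& children,
    const std::set<std::string>& supported, const std::string& device) {
  assert(!device.empty());
  ChildSubgraph merged{{}, device};
  std::vector<ChildSubgraph> result;
  for (const ChildSubgraph& child : children) {
    const bool picked = child.device.empty() && child.ops.size() == 1 &&
                        supported.count(child.ops.front().type) > 0;
    if (picked) {
      merged.ops.push_back(child.ops.front());
    } else {
      result.push_back(child);
    }
  }
  if (!merged.ops.empty()) result.push_back(merged);
  return result;
}
```

Children that no plugin picks keep an empty `device` in this model, which stands for "left to the later DPU and CPU compilation".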
Implementing a Plugin
- Implement Plugin::partition().
  In
  std::set<xir::Subgraph*> partition(xir::Graph* graph)
  pick the desired ops and merge them into device-level subgraphs. You can use the following helper functions:
  - xir::Subgraph* filter_by_name(xir::Graph* graph, const std::string& name)
    returns the subgraph with a specific name.
  - std::set<xir::Subgraph*> filter_by_type(xir::Graph* graph, const std::string& type)
    returns subgraphs with a specific type.
  - std::set<xir::Subgraph*> filter_by_template(xir::Graph* graph, xir::GraphTemplate* temp)
    returns subgraphs with a specific structure.
    Figure 2. Filter by Templates
  - std::set<xir::Subgraph*> filter(xir::Graph* graph, std::function<std::set<xir::Subgraph*>(std::set<xir::Subgraph*>)> func)
    allows you to filter the subgraphs with a customized function. This method helps you find all uncompiled subgraphs.
  To merge the child subgraphs that you get, use the helper function merge_subgraph(). This function can only merge subgraphs at the same level. If the subgraph list cannot be merged into one subgraph, the helper function merges them as far as possible.
- Specify the name, device, and runner for the subgraphs you picked in the Plugin::partition() function.
- Implement Plugin::compile(xir::Subgraph*). This function is called for all the subgraphs returned by the partition() function. You can perform any custom processing here and attach information to the subgraphs for the runtime.
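The steps above can be sketched as a small plugin written against stand-in types. Everything here is an assumption made so the example is self-contained: `Subgraph`, `Graph`, and this `Plugin` class only mimic the shapes named in the text, the `void` return type of `compile()` is a guess, the attribute keys and values (`"device"`, `"runner"`, `"instructions"`, `"MY_IP"`, `"libmy_runner.so"`) are hypothetical, and the merge step is left out.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <set>
#include <string>

// Stand-in types for illustration only; the real ones are the xir:: classes.
struct Subgraph {
  std::string name;
  std::string op_type;
  std::map<std::string, std::string> attrs;  // e.g. "device", "runner"
};

struct Graph {
  std::set<Subgraph*> children;  // one child subgraph per operator
};

// Same shape as the customized function accepted by the filter() helper.
using FilterFn = std::function<std::set<Subgraph*>(std::set<Subgraph*>)>;

std::set<Subgraph*> filter(Graph* graph, const FilterFn& func) {
  return func(graph->children);
}

// Assumed shape of the interface, reduced to the two methods named above.
class Plugin {
 public:
  virtual ~Plugin() = default;
  virtual std::set<Subgraph*> partition(Graph* graph) = 0;
  virtual void compile(Subgraph* subgraph) = 0;
};

class MyPlugin : public Plugin {
 public:
  std::set<Subgraph*> partition(Graph* graph) override {
    assert(graph != nullptr);
    // Pick the unclaimed subgraphs that hold the op this IP accelerates.
    std::set<Subgraph*> picked = filter(graph, [](std::set<Subgraph*> all) {
      std::set<Subgraph*> out;
      for (Subgraph* s : all) {
        if (s->attrs.count("device") == 0 && s->op_type == "my_op") {
          out.insert(s);
        }
      }
      return out;
    });
    // Specify the name, device, and runner for everything that was picked.
    for (Subgraph* s : picked) {
      s->name = "my_ip_" + s->name;
      s->attrs["device"] = "MY_IP";
      s->attrs["runner"] = "libmy_runner.so";
    }
    return picked;
  }

  void compile(Subgraph* subgraph) override {
    // Attach whatever the runtime needs; a placeholder string stands in
    // for a real instruction stream.
    subgraph->attrs["instructions"] = "placeholder instruction stream";
  }
};

// What the compiler does with one plugin, as described above:
// compile() is called for every subgraph returned by partition().
void run_plugin(Plugin& plugin, Graph* graph) {
  for (Subgraph* s : plugin.partition(graph)) plugin.compile(s);
}
```

Subgraphs that `partition()` does not return are never passed to `compile()`, so they stay unclaimed for later plugins or for the DPU and CPU compilation.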
Building the Plugin
Create an extern "C" get_plugin() function that returns an instance of your plugin, and build
the implementation into a shared library.
extern "C" Plugin* get_plugin() { return new YOURPLUGIN(); }
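The reason for the C linkage is that it keeps the symbol name unmangled, so a loader can find the factory by the literal name get_plugin. The following is a minimal sketch with a stand-in `Plugin` class; the `name()` method, the `GetPluginFn` alias, and the `instantiate()` helper are hypothetical, and the assumption that the caller owns the returned instance is not confirmed by the text above.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Stand-in for the interface class declared in Plugin.hpp, reduced to one
// hypothetical method so the example is self-contained.
class Plugin {
 public:
  virtual ~Plugin() = default;
  virtual std::string name() const = 0;
};

class YOURPLUGIN : public Plugin {
 public:
  std::string name() const override { return "YOURPLUGIN"; }
};

// C linkage keeps the symbol name unmangled, so it can be looked up by the
// literal string "get_plugin" after the shared library is opened.
extern "C" Plugin* get_plugin() { return new YOURPLUGIN(); }

// The signature a loader would cast the looked-up symbol to.
using GetPluginFn = Plugin* (*)();

// Calls the factory and takes ownership of the instance it returns
// (ownership by the caller is an assumption of this sketch).
std::unique_ptr<Plugin> instantiate(GetPluginFn factory) {
  assert(factory != nullptr);
  return std::unique_ptr<Plugin>(factory());
}
```

With GCC or Clang, a file like this is typically built into a shared library with the `-shared -fPIC` flags; a real plugin would additionally need the XIR headers and libraries on the include and link paths.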
Using the Plugin
To pass your plugin library to the compiler, use the vai_c command line option
--options '{"plugin": "libplugin0.so,libplugin1.so"}'
When executing your plugin, the compiler opens the library and creates an instance
of your plugin by loading your extern function named get_plugin. If more than one
plugin is specified, the plugins are executed sequentially in the order given by the
command line option. Compilation for the DPU and CPU is executed after all plugins
have been executed.
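The ordering rule can be illustrated with a short sketch that splits the comma-separated plugin value and visits the libraries in the order written. It only models the ordering; `split_plugin_list` and `run_in_order` are hypothetical helpers, not part of vai_c, and the callback stands in for opening the library and running the plugin.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Splits the comma-separated value of the "plugin" option into library
// names, preserving the order in which they were written. Empty entries
// (for example from a trailing comma) are skipped.
std::vector<std::string> split_plugin_list(const std::string& value) {
  std::vector<std::string> libs;
  std::string::size_type start = 0;
  while (start <= value.size()) {
    const std::string::size_type comma = value.find(',', start);
    const std::string::size_type end =
        (comma == std::string::npos) ? value.size() : comma;
    if (end > start) libs.push_back(value.substr(start, end - start));
    start = end + 1;
  }
  return libs;
}

// Invokes `run` once per library, strictly in list order.
void run_in_order(const std::vector<std::string>& libs,
                  const std::function<void(const std::string&)>& run) {
  assert(static_cast<bool>(run));
  for (const std::string& lib : libs) run(lib);
}
```

For the value shown above, the callback sees libplugin0.so first and libplugin1.so second.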