Partitioning Steps - 1.4 English

Vitis AI User Guide (UG1414)

Document ID: UG1414
Release Date: 2021-07-22
Version: 1.4 English

Loading the original graph
The partitioner accepts a frozen tf.Graph, a tf.GraphDef, or a path to the network file or folder. If a .pb file is provided, the graph it contains must be properly frozen. Other options include models stored using tf.train.Saver and tf.saved_model.
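As a rough illustration of the path-based inputs listed above, the following sketch guesses which storage format a path points to. The helper name classify_model_source and its return labels are hypothetical and are not part of the partitioner API; the sketch relies only on the standard TensorFlow 1.x file layouts (a saved_model.pb file at the top of a tf.saved_model export, a .meta file written by tf.train.Saver).

```python
import os


def classify_model_source(path):
    """Guess the storage format behind a model path (hypothetical helper)."""
    if os.path.isdir(path):
        names = os.listdir(path)
        # tf.saved_model exports place saved_model.pb (or .pbtxt) at the top level.
        if "saved_model.pb" in names or "saved_model.pbtxt" in names:
            return "saved_model"
        # tf.train.Saver writes the MetaGraph to a file ending in .meta.
        if any(name.endswith(".meta") for name in names):
            return "checkpoint"
        return "unknown"
    if path.endswith(".meta"):
        return "checkpoint"
    if path.endswith(".pb"):
        # A standalone .pb file is expected to hold a frozen GraphDef.
        return "frozen_graph"
    return "unknown"


print(classify_model_source("resnet50.pb"))  # frozen_graph
```

A check of this kind cannot tell whether a .pb file is actually frozen; that still has to be verified against the graph contents.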
Partitioning
In this step, the subgraph delimited by the startnode and finalnode sets is analyzed for FPGA acceleration. The analysis proceeds in the following phases:
1. All graph nodes are partitioned into (FPGA) supported and unsupported sets using one of two methods. The default method (compilerFunc='SPECULATIVE') uses a rough estimate of the hardware operation tree. The second method (compilerFunc='DEFINITIVE') uses the hardware compiler. The latter is more accurate and can handle complex optimization schemes based on the specified options, but it takes considerably more time to complete.
2. Adjacent supported nodes, and likewise adjacent unsupported nodes, are merged into (fine-grained) connected components.
3. Supported partitions get merged into maximally connected components, while maintaining the DAG property.
4. Each supported partition is (re)compiled using the hardware compiler to create runtime code, quantization info, and the relevant model parameters.
5. Each supported partition subgraph is stored for visualization and debug purposes.
6. Each supported subgraph is replaced by a tf.py_func node (named fpga_func_<partition_id>) that contains all the Python function calls needed to accelerate that subgraph on the FPGA.
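Phases 1 to 3 can be sketched on a toy graph as follows. This is a minimal illustration, not the partitioner's implementation: the SUPPORTED_OPS set, the function names, the greedy merge order, and the numbering of partition ids from 0 are all assumptions. An op-type lookup stands in for the SPECULATIVE estimate, and a merge is rejected whenever it would introduce a cycle between groups, which is one way to maintain the DAG property.

```python
from collections import defaultdict, deque

# Hypothetical set of op types the accelerator supports (illustrative only).
SUPPORTED_OPS = {"Conv2D", "Relu", "MaxPool", "Add"}


def is_acyclic(edges, group_of):
    """Return True if the graph condensed by group_of is a DAG (Kahn's algorithm)."""
    succ = defaultdict(set)
    indeg = defaultdict(int)
    for u, vs in edges.items():
        for v in vs:
            gu, gv = group_of[u], group_of[v]
            if gu != gv and gv not in succ[gu]:
                succ[gu].add(gv)
                indeg[gv] += 1
    groups = set(group_of.values())
    queue = deque(g for g in groups if indeg[g] == 0)
    visited = 0
    while queue:
        g = queue.popleft()
        visited += 1
        for h in succ[g]:
            indeg[h] -= 1
            if indeg[h] == 0:
                queue.append(h)
    return visited == len(groups)


def partition(ops, edges):
    """ops maps node -> op type; edges maps node -> list of successor nodes."""
    # Phase 1: split nodes into supported and unsupported sets.
    supported = {n for n, op in ops.items() if op in SUPPORTED_OPS}
    # Every node starts in its own group.
    group_of = {n: n for n in ops}
    # Phases 2-3: greedily merge adjacent supported nodes, rejecting any
    # merge that would create a cycle between groups.
    for u in ops:
        for v in edges.get(u, []):
            if u in supported and v in supported and group_of[u] != group_of[v]:
                old, new = group_of[v], group_of[u]
                trial = {n: (new if g == old else g) for n, g in group_of.items()}
                if is_acyclic(edges, trial):
                    group_of = trial
    # Collect supported groups under the fpga_func_<partition_id> naming convention.
    parts = defaultdict(list)
    for n in ops:
        if n in supported:
            parts[group_of[n]].append(n)
    return {f"fpga_func_{i}": nodes for i, nodes in enumerate(parts.values())}


ops = {"input": "Placeholder", "conv1": "Conv2D", "relu1": "Relu",
       "custom": "MyCustomOp", "add": "Add", "softmax": "Softmax"}
edges = {"input": ["conv1"], "conv1": ["relu1"], "relu1": ["custom", "add"],
         "custom": ["add"], "add": ["softmax"]}
print(partition(ops, edges))
# {'fpga_func_0': ['conv1', 'relu1'], 'fpga_func_1': ['add']}
```

In this example, add cannot join the conv1/relu1 partition: relu1 feeds the unsupported custom node, which in turn feeds add, so a merged partition would both produce and consume the output of custom and the partitioned graph would no longer be a DAG.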
Freezing the modified graph
The modified graph is frozen and stored with a “-fpga” suffix.
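For orientation, the naming of the stored file might look like the sketch below. The helper fpga_graph_path is hypothetical, and placing the suffix before the file extension is an assumption; the guide only states that a “-fpga” suffix is used.

```python
import os


def fpga_graph_path(original_path, suffix="-fpga"):
    """Insert the suffix before the file extension (assumed placement)."""
    root, ext = os.path.splitext(original_path)
    return f"{root}{suffix}{ext}"


print(fpga_graph_path("models/resnet50.pb"))  # models/resnet50-fpga.pb
```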
Run natively in TensorFlow
The modified graph can be loaded using the load_partitioned_graph method of the partitioner class. It replaces the default TensorFlow graph and can be used in the same way as the original graph.