Partitioning Steps

Vitis AI User Guide (UG1414), Release Date 2021-07-22, Version 1.4 English
  1. Loading the original graph

    The partitioner can handle a frozen tf.Graph, a tf.GraphDef, or a path to the network file or folder. If a .pb file is provided, the graph should be properly frozen. Models stored using tf.train.Saver and tf.saved_model are also supported.
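    Any of these forms eventually yields a frozen GraphDef. As a minimal sketch, assuming the model has already been frozen to a .pb file, the following standard TensorFlow code reads it into a tf.GraphDef that can be handed to the partitioner (the partitioner's own constructor and argument names are release-specific and not shown here):

    ```python
    import tensorflow as tf

    def load_frozen_graphdef(pb_path):
        """Read a properly frozen .pb file into a tf.GraphDef."""
        graph_def = tf.compat.v1.GraphDef()
        with tf.io.gfile.GFile(pb_path, "rb") as f:
            graph_def.ParseFromString(f.read())
        return graph_def
    ```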

  2. Partitioning

    In this step, the subgraph specified by the startnode and finalnode sets is analyzed for FPGA acceleration. This is done in multiple phases:

    1. All graph nodes are partitioned into (FPGA) supported and unsupported sets using one of two methods. The default method (compilerFunc='SPECULATIVE') uses a rough estimate of the hardware operation tree. The second method (compilerFunc='DEFINITIVE') utilizes the hardware compiler. The latter is more accurate and can handle complex optimization schemes based on the specified options; however, it takes considerably more time to complete.
    2. Adjacent supported and unsupported nodes are merged into (fine-grained) connected components.
    3. Supported partitions are merged into maximally connected components, while maintaining the DAG property.
    4. Each supported partition is (re)compiled using the hardware compiler to create runtime code, quantization info, and relevant model parameters.
    5. Each supported partition subgraph is stored for visualization and debug purposes.
    6. Each supported subgraph is replaced by a tf.py_func node (with naming convention fpga_func_<partition_id>) that contains all necessary Python function calls to accelerate that subgraph on the FPGA (see the sketch after this list).
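    As an illustration of the last phase, the sketch below scans a modified GraphDef for the inserted nodes using the fpga_func_<partition_id> naming convention. The graph_def argument is assumed to be a tf.GraphDef, such as the one returned by load_frozen_graphdef above:

    ```python
    def list_fpga_partitions(graph_def):
        """Return the names of the nodes that replaced FPGA-accelerated
        subgraphs, following the fpga_func_<partition_id> convention.

        graph_def is a tf.GraphDef; the tf.py_func replacement nodes
        typically appear with the op type "PyFunc".
        """
        return [node.name for node in graph_def.node
                if node.name.startswith("fpga_func_")]
    ```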
  3. Freezing the modified graph

    The modified graph is frozen and stored with a "-fpga" suffix.
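    The output location and naming are handled by the partitioner itself; the helper below is only a hypothetical sketch showing how the "-fpga" naming convention maps onto standard TensorFlow serialization calls:

    ```python
    import os
    import tensorflow as tf

    def write_graph_with_fpga_suffix(graph_def, original_pb_path):
        """Write an already-frozen GraphDef next to the original model,
        following the "-fpga" suffix convention described above."""
        base, ext = os.path.splitext(original_pb_path)
        out_dir, out_name = os.path.split(base + "-fpga" + ext)
        tf.io.write_graph(graph_def, out_dir or ".", out_name, as_text=False)
    ```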

  4. Running natively in TensorFlow

    The modified graph can be loaded using the load_partitioned_graph method of the partitioner class. The modified graph replaces the default TensorFlow graph and can be used similarly to the original graph.
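    A usage sketch, assuming partitioner is an already-constructed partitioner object (its constructor arguments are release-specific and omitted here) and that the tensor names "data:0" and "prob:0" are placeholders for the actual input and output names of your network:

    ```python
    import numpy as np
    import tensorflow as tf

    # Load the modified graph; per the description above, it replaces the
    # default TensorFlow graph.
    partitioner.load_partitioned_graph()
    graph = tf.compat.v1.get_default_graph()

    with tf.compat.v1.Session(graph=graph) as sess:
        inp = graph.get_tensor_by_name("data:0")   # placeholder input name
        out = graph.get_tensor_by_name("prob:0")   # placeholder output name
        dummy_batch = np.zeros((1, 224, 224, 3), dtype=np.float32)  # placeholder shape
        predictions = sess.run(out, feed_dict={inp: dummy_batch})
    ```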