Partitioning is the process of splitting the inference execution of a model
between the FPGA and the host. Partitioning is necessary to execute models that contain
layers unsupported by the FPGA. Partitioning can also be useful for debugging and
for exploring different ways of partitioning and executing the computation graph to
meet a target objective. The following example uses a ResNet-based SSD object
detection model. In the original graph, the parts highlighted in red are replaced by
a single fpga_func_0 node in the partitioned graph. The partitioned code is
complete and executes on both the CPU and the FPGA.
Note: This support is currently available only for Alveo™-based deep learning solutions.
Figure 1. Original Graph
Figure 2. Partitioned Graph
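The replacement of supported layers by an fpga_func_0 node can be illustrated with a minimal, tool-independent sketch. It is not the partitioner's actual implementation or API. The Node class, the partition function, the FPGA_SUPPORTED_OPS set, and the layer names are all hypothetical, and the model is assumed to be a simple linear chain of layers rather than a general graph.

```python
from dataclasses import dataclass

# Hypothetical set of layer types the accelerator supports (assumption).
FPGA_SUPPORTED_OPS = {"Conv2D", "Relu", "MaxPool", "Add"}


@dataclass
class Node:
    name: str
    op: str


def partition(nodes, supported=FPGA_SUPPORTED_OPS):
    """Group each run of consecutive supported layers into one fpga_func_N node.

    Returns a list of (node_name, device, original_layer_names) tuples.
    Unsupported layers stay on the host CPU as individual nodes.
    """
    plan = []
    run = []  # supported layers collected for the current FPGA subgraph
    fpga_count = 0

    def flush():
        # Emit the collected run as a single fused FPGA node, if any.
        nonlocal run, fpga_count
        if run:
            plan.append((f"fpga_func_{fpga_count}", "fpga", [n.name for n in run]))
            fpga_count += 1
            run = []

    for node in nodes:
        if node.op in supported:
            run.append(node)
        else:
            flush()
            plan.append((node.name, "cpu", [node.name]))
    flush()
    return plan


# Toy chain: a backbone the FPGA can run, followed by an unsupported layer.
graph = [
    Node("input", "Placeholder"),
    Node("conv1", "Conv2D"),
    Node("relu1", "Relu"),
    Node("pool1", "MaxPool"),
    Node("nms", "NonMaxSuppression"),
]

plan = partition(graph)
for name, device, members in plan:
    print(f"{name:12s} {device:4s} {members}")
```

In this toy chain, the three supported layers collapse into one fpga_func_0 node, while the input placeholder and the unsupported post-processing layer remain on the host. A real partitioner must also handle branching graphs and data movement between devices, which this sketch ignores.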