Operators Supported by Caffe - 1.4.1 English

Vitis AI User Guide (UG1414)

Document ID: UG1414
Release Date: 2021-12-13
Version: 1.4.1 English
Table 1. Operators Supported by Caffe
| Caffe OP name | Caffe attributes | XIR OP name | XIR attributes | DPU implementation |
| --- | --- | --- | --- | --- |
| input | shape | data | shape, data_type | Allocates memory for the input data. |
| convolution | kernel_size, stride, pad, dilation, bias_term, num_output, group | conv2d (group = 1) / depthwise-conv2d (group = input channel) | kernel, stride, pad, pad_mode (FLOOR), dilation | If group == input channel, the convolution is compiled to the Depthwise-Convolution Engine; if group == 1, it is mapped to the Convolution Engine. Otherwise, it is mapped to the CPU. |
| deconvolution | kernel_size, stride, pad, dilation, bias_term, num_output, group | transposed-conv2d (group = 1) / depthwise-transposed-conv2d (group = input channel) | kernel, stride, pad, pad_mode (FLOOR), dilation | If group == input channel, the deconvolution is compiled to the Depthwise-Convolution Engine; if group == 1, it is mapped to the Convolution Engine. Otherwise, it is mapped to the CPU. |
| innerproduct | bias_term, num_output | conv2d / matmul | transpose_a, transpose_b | The inner-product is transformed into a matmul, which is then transformed into a conv2d and compiled to the Convolution Engine. If the inner-product cannot be transformed, it is implemented on the CPU. |
| scale | bias_term | depthwise-conv2d / scale | | The scale is transformed into a depthwise-convolution where possible; otherwise, it is mapped to the CPU. |
| pooling | kernel_size, stride, global_pooling, pad, pool_method | maxpool2d (pool_method = 0) / avgpool2d (pool_method = 1) | kernel_size, stride, global, pad, pad_mode (CEIL), count_include_pad (true), count_include_invalid (false) | Pooling Engine. |
| eltwise | coeff = 1, operation = SUM | add | | Element-wise Add Engine. |
| concat | axis | concat | axis | Xilinx reduces the overhead of concat through special reading/writing strategies and careful allocation of on-chip memory. |
| relu | negative_slope | relu / leakyrelu | alpha | Activations are fused into adjacent operations such as convolution and add. |
| relu6 | | relu6 | | |
| fixneuron | bit_width, quantize_pos | fix | bit_width, fix_point, if_signed, round_mode | Divided into float2fix and fix2float during compilation; the float2fix and fix2float operations are then fused with adjacent operations into coarse-grained operations. |
| reshape | shape | reshape | shape | These shape-related operations (reshape, permute, flatten) are removed or transformed into reshape in most cases, which does not affect the on-chip data layout. Otherwise, they are compiled to the CPU. |
| permute | order | reshape / transpose | order | |
| flatten | axis, end_axis | reshape / flatten | start_axis, end_axis | |
| reorg | strides, reverse | reorg | strides, reverse | If the reorg meets the hardware requirements, it is mapped to DPU implementations. |
| deephiresize | scale, mode | resize | size, mode, align_corners=false, half_pixel_centers=false | If the mode of the resize is 'BILINEAR', it can be transformed to DPU implementations (pad + depthwise-transposed-conv2d) when align_corners = false, half_pixel_centers = false, and size = 2, 4, or 8, or when align_corners = false, half_pixel_centers = true, and size = 2 or 4. If the mode is 'NEAREST' and the size is an integer, the resize is mapped to DPU implementations. |
| gstiling | strides, reverse | gstiling | stride, reverse | If the strides of gstiling are integers, it may be mapped to special DPU read/write instructions. |
| slice | axis, slice_point | strided_slice | begin, end, strides | slice, priorbox, and softmax are only compiled into CPU implementations. |
| priorbox | min_sizes, max_sizes, aspect_ratio, flip, clip, variance, step, offset | priorbox | min_sizes, max_sizes, aspect_ratio, flip, clip, variance, step, offset | |
| softmax | axis | softmax | axis | |
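The inner-product lowering in the table (innerproduct → matmul → conv2d) rests on a simple numerical identity: a fully-connected layer over a flattened feature map equals a convolution whose kernel covers the entire input. The sketch below demonstrates this with plain NumPy and hypothetical shapes; it is not Vitis AI compiler code.

```python
import numpy as np

# Hypothetical shapes: batch 1, 8 input channels, 4x4 feature map, 10 outputs.
C, H, W, N_OUT = 8, 4, 4, 10
x = np.random.rand(1, C, H, W)
w = np.random.rand(N_OUT, C * H * W)  # InnerProduct weight matrix
b = np.random.rand(N_OUT)             # bias_term = true

# Caffe InnerProduct: flatten the input, then matmul (the XIR matmul form).
fc_out = x.reshape(1, -1) @ w.T + b

# Equivalent conv2d: one HxW kernel per output channel consuming the whole
# feature map, producing a single spatial position per output channel.
k = w.reshape(N_OUT, C, H, W)
conv_out = np.array([[(x[0] * k[o]).sum() + b[o] for o in range(N_OUT)]])

assert np.allclose(fc_out, conv_out)
```

Because the conv2d form has a full-size kernel, it only applies when the input spatial shape is fixed, which is one reason the transformation can fail and fall back to the CPU.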
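The scale-to-depthwise-convolution rewrite can be illustrated the same way (again a sketch with hypothetical shapes, not the compiler's actual pass): a per-channel scale with bias is exactly a 1x1 depthwise convolution where each channel's kernel is that channel's scale factor.

```python
import numpy as np

# Hypothetical shapes: 1 image, 3 channels, 5x5 feature map.
x = np.random.rand(1, 3, 5, 5)
gamma = np.random.rand(3)  # per-channel scale (Caffe "scale" weights)
beta = np.random.rand(3)   # per-channel bias (bias_term = true)

# Caffe scale layer: y[n, c, h, w] = gamma[c] * x[n, c, h, w] + beta[c]
scale_out = gamma[None, :, None, None] * x + beta[None, :, None, None]

# Depthwise 1x1 convolution: each channel convolved with its own 1x1 kernel.
dw_kernel = gamma.reshape(3, 1, 1)  # one 1x1 kernel per channel
dw_out = np.stack(
    [x[0, c] * dw_kernel[c, 0, 0] + beta[c] for c in range(3)]
)[None]

assert np.allclose(scale_out, dw_out)
```

This is why the table maps scale onto the same Depthwise-Convolution Engine used for group == input-channel convolutions.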