Currently Supported Operators - 2.0 English

Vitis AI User Guide (UG1414)

Document ID: UG1414
Release Date: 2022-01-20
Version: 2.0 English

Table 1. Currently Supported Operators
| Typical Operation Type in CNN | Parameters | DPUCZDX8G_ISA0_B4096_MAX_BG2 (ZCU102, ZCU104) | DPUCAHX8L_ISA0 (U50, U50LV, U280) | DPUCVDX8G_ISA1_C32B3 (VCK190) | DPUCAHX8H_ISA2¹ (U50, U55C, U50LV, U280) | DPUCADF8H_ISA0 (U200, U250) | DPUCVDX8H_ISA1² (VCK5000) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Intrinsic Parameter | | channel_parallel: 16; bank_depth: 2048 | channel_parallel: 32; bank_depth: 4096 | channel_parallel: 16; bank_depth: 16384 | channel_parallel: 16; bank_depth: 2048 | channel_parallel: 16; bank_depth: 8192 | channel_parallel: 64; bank_depth: 256 |
| conv2d | Kernel size | w, h: [1, 16] | w, h: [1, 16] | w, h: [1, 16]; w * h <= 64 | w, h: [1, 16] | w, h: [1, 16] | w, h: [1, 16] |
| | Strides | w, h: [1, 8] | w, h: [1, 4] | w, h: [1, 8] | w, h: [1, 4] | w, h: [1, 8] | w, h: [1, 4] |
| | Dilation | dilation * input_channel <= 256 * channel_parallel (all targets) | | | | | |
| | Paddings | pad_left, pad_right: [0, (kernel_w - 1) * dilation_w]; pad_top, pad_bottom: [0, (kernel_h - 1) * dilation_h] (all targets) | | | | | |
| | In Size | kernel_w * kernel_h * ceil(input_channel / channel_parallel) <= bank_depth (all targets) | | | | | |
| | Out Size | output_channel <= 256 * channel_parallel (all targets) | | | | | |
| | Activation | ReLU, LeakyReLU, ReLU6 | ReLU, ReLU6 | ReLU, LeakyReLU, ReLU6, Hard-Swish, Hard-Sigmoid | ReLU, LeakyReLU, ReLU6 | ReLU, LeakyReLU | ReLU, LeakyReLU, ReLU6, Hard-Swish, Hard-Sigmoid |
| | Group* (Caffe) | group==1 (all targets) | | | | | |
| depthwise-conv2d | Kernel size | w, h: [1, 16] | w, h: [3] | w, h: [1, 256] | w, h: {1, 3} | Not supported | w, h: {1, 3, 5, 7} |
| | Strides | w, h: [1, 8] | w, h: [1, 2] | w, h: [1, 8] | w, h: [1, 2] | Not supported | w, h: [1, 4] |
| | Dilation | dilation * input_channel <= 256 * channel_parallel | dilation * input_channel <= 256 * channel_parallel | dilation * input_channel <= 256 * channel_parallel | dilation * input_channel <= 256 * channel_parallel | Not supported | dilation * input_channel <= 256 * channel_parallel |
| | Paddings | pad_left, pad_right: [0, (kernel_w - 1) * dilation_w]; pad_top, pad_bottom: [0, (kernel_h - 1) * dilation_h] | pad_left, pad_right: [0, (kernel_w - 1) * dilation_w]; pad_top, pad_bottom: [0, (kernel_h - 1) * dilation_h] | pad_left, pad_right: [0, 15 * dilation_w]; pad_top, pad_bottom: [0, 15 * dilation_h] | pad_left, pad_right: [0, (kernel_w - 1) * dilation_w]; pad_top, pad_bottom: [0, (kernel_h - 1) * dilation_h] | Not supported | pad_left, pad_right: [0, (kernel_w - 1) * dilation_w]; pad_top, pad_bottom: [0, (kernel_h - 1) * dilation_h] |
| | In Size | kernel_w * kernel_h * ceil(input_channel / channel_parallel) <= bank_depth | kernel_w * kernel_h * ceil(input_channel / channel_parallel) <= bank_depth | kernel_w * kernel_h * ceil(input_channel / channel_parallel) <= bank_depth | kernel_w * kernel_h * ceil(input_channel / channel_parallel) <= bank_depth | Not supported | kernel_w * kernel_h * ceil(input_channel / channel_parallel) <= bank_depth |
| | Out Size | output_channel <= 256 * channel_parallel | output_channel <= 256 * channel_parallel | output_channel <= 256 * channel_parallel | output_channel <= 256 * channel_parallel | Not supported | output_channel <= 256 * channel_parallel |
| | Activation | ReLU, ReLU6 | ReLU, ReLU6 | ReLU, ReLU6 | ReLU, ReLU6 | Not supported | ReLU, ReLU6 |
| | Group* (Caffe) | group==input_channel | group==input_channel | group==input_channel | group==input_channel | Not supported | group==input_channel |
| transposed-conv2d | Kernel size | kernel_w/stride_w, kernel_h/stride_h: [1, 16] (all targets) | | | | | |
| | Strides | | | | | | |
| | Paddings | pad_left, pad_right: [1, kernel_w-1]; pad_top, pad_bottom: [1, kernel_h-1] (all targets) | | | | | |
| | Out Size | output_channel <= 256 * channel_parallel (all targets) | | | | | |
| | Activation | ReLU, LeakyReLU, ReLU6 | ReLU, ReLU6 | ReLU, LeakyReLU, ReLU6, Hard-Swish, Hard-Sigmoid | ReLU, LeakyReLU, ReLU6 | ReLU, LeakyReLU | ReLU, LeakyReLU, Hard-Swish, Hard-Sigmoid |
| depthwise-transposed-conv2d | Kernel size | kernel_w/stride_w, kernel_h/stride_h: [1, 16] | kernel_w/stride_w, kernel_h/stride_h: [3] | kernel_w/stride_w, kernel_h/stride_h: [1, 256] | kernel_w/stride_w, kernel_h/stride_h: {1, 3} | Not supported | kernel_w/stride_w, kernel_h/stride_h: {1, 3, 5, 7} |
| | Strides | | | | | | |
| | Paddings | pad_left, pad_right: [1, kernel_w-1]; pad_top, pad_bottom: [1, kernel_h-1] | pad_left, pad_right: [1, kernel_w-1]; pad_top, pad_bottom: [1, kernel_h-1] | pad_left, pad_right: [1, 15]; pad_top, pad_bottom: [1, 15] | pad_left, pad_right: [1, kernel_w-1]; pad_top, pad_bottom: [1, kernel_h-1] | Not supported | pad_left, pad_right: [1, kernel_w-1]; pad_top, pad_bottom: [1, kernel_h-1] |
| | Out Size | output_channel <= 256 * channel_parallel | output_channel <= 256 * channel_parallel | output_channel <= 256 * channel_parallel | output_channel <= 256 * channel_parallel | Not supported | output_channel <= 256 * channel_parallel |
| | Activation | ReLU, ReLU6 | ReLU, ReLU6 | ReLU, ReLU6 | ReLU, ReLU6 | Not supported | ReLU, ReLU6 |
| max-pooling | Kernel size | w, h: [2, 8] | w, h: {2, 3, 5, 7, 8} | w, h: [1, 256] | w, h: [1, 8] | w, h: [1, 16] | w, h: [1, 8]³ |
| | Strides | w, h: [1, 8] | w, h: [1, 8] | w, h: [1, 8] | w, h: [1, 8] | w, h: [1, 8] | w, h: [1, 8] |
| | Paddings | pad_left, pad_right: [1, kernel_w-1]; pad_top, pad_bottom: [1, kernel_h-1] | pad_left, pad_right: [1, kernel_w-1]; pad_top, pad_bottom: [1, kernel_h-1] | pad_left, pad_right: [1, 15]; pad_top, pad_bottom: [1, 15] | pad_left, pad_right: [1, kernel_w-1]; pad_top, pad_bottom: [1, kernel_h-1] | pad_left, pad_right: [1, kernel_w-1]; pad_top, pad_bottom: [1, kernel_h-1] | pad_left, pad_right: [1, kernel_w-1]; pad_top, pad_bottom: [1, kernel_h-1] |
| | Activation | ReLU | Not supported | ReLU, ReLU6 | Not supported | ReLU | Not supported |
| average-pooling | Kernel size | w, h: [2, 8]; w==h | w, h: {2, 3, 5, 7, 8}; w==h | w, h: [1, 256] | w, h: [1, 8]; w==h | w, h: [1, 16] | w, h: [1, 8]³; w==h |
| | Strides | w, h: [1, 8] | w, h: [1, 8] | w, h: [1, 8] | w, h: [1, 8] | w, h: [1, 8] | w, h: [1, 8] |
| | Paddings | pad_left, pad_right: [1, kernel_w-1]; pad_top, pad_bottom: [1, kernel_h-1] | pad_left, pad_right: [1, kernel_w-1]; pad_top, pad_bottom: [1, kernel_h-1] | pad_left, pad_right: [1, 15]; pad_top, pad_bottom: [1, 15] | pad_left, pad_right: [1, kernel_w-1]; pad_top, pad_bottom: [1, kernel_h-1] | pad_left, pad_right: [1, kernel_w-1]; pad_top, pad_bottom: [1, kernel_h-1] | pad_left, pad_right: [1, kernel_w-1]; pad_top, pad_bottom: [1, kernel_h-1] |
| | Activation | ReLU | Not supported | ReLU, ReLU6 | Not supported | ReLU | Not supported |
| eltwise | type | sum | sum | sum, prod | sum | sum | sum |
| | Input Channel | input_channel <= 256 * channel_parallel (all targets) | | | | | |
| | Activation | ReLU | ReLU | ReLU | ReLU | ReLU | ReLU |
| concat | | Network-specific limitation, which relates to the size of feature maps, quantization results, and compiler optimizations (all targets) | | | | | |
| reorg | Strides | reverse==false: stride^2 * input_channel <= 256 * channel_parallel; reverse==true: input_channel <= 256 * channel_parallel (all targets) | | | | | |
| pad | In Size | input_channel <= 256 * channel_parallel (all targets) | | | | | |
Mode "SYMMETRIC" ("CONSTANT" pad(value=0) would be fused into adjacent operators during compiler optimization process)
global pooling Global pooling will be processed as general pooling with kernel size euqal to input tensor size.
InnerProduct, Fully Connected, Matmul These ops will be transformed into conv2d op

  1. The 5PE configuration of the DPUCAHX8H does not have a depthwise-conv2d engine.
  2. The 8PE configuration of the DPUCVDX8H does not have a depthwise-conv2d engine.
  3. This range applies to the 6PE DPUCVDX8H configuration. For the 8PE configuration, the pooling kernel size supports {1, 2, 3, 7}.
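
To make the conv2d size rules in Table 1 concrete, the following minimal Python sketch evaluates the kernel-size, stride, dilation, In Size, and Out Size rules for two of the listed targets. The `TARGETS` dictionary and the `conv2d_violations` helper are hypothetical names introduced only for this illustration; they are not part of the Vitis AI tools or APIs, and the padding rules are omitted for brevity.

```python
# Illustrative sketch only: these helpers are NOT part of the Vitis AI tools.
# They restate the conv2d rules from Table 1 for two of the listed targets.
import math

TARGETS = {
    # channel_parallel / bank_depth come from the Intrinsic Parameter row,
    # the stride upper bound from the conv2d Strides row.
    "DPUCZDX8G_ISA0_B4096_MAX_BG2": {"channel_parallel": 16, "bank_depth": 2048, "max_stride": 8},
    "DPUCVDX8H_ISA1": {"channel_parallel": 64, "bank_depth": 256, "max_stride": 4},
}


def conv2d_violations(target, kernel_w, kernel_h, stride_w, stride_h,
                      input_channel, output_channel, dilation=1):
    """Return the Table 1 conv2d rules (paddings omitted) that the layer breaks."""
    cp = TARGETS[target]["channel_parallel"]
    bank_depth = TARGETS[target]["bank_depth"]
    max_stride = TARGETS[target]["max_stride"]
    broken = []
    if not (1 <= kernel_w <= 16 and 1 <= kernel_h <= 16):
        broken.append("Kernel size: w, h must be in [1, 16]")
    if not (1 <= stride_w <= max_stride and 1 <= stride_h <= max_stride):
        broken.append(f"Strides: w, h must be in [1, {max_stride}]")
    if dilation * input_channel > 256 * cp:
        broken.append("Dilation: dilation * input_channel <= 256 * channel_parallel")
    if kernel_w * kernel_h * math.ceil(input_channel / cp) > bank_depth:
        broken.append("In Size: kernel_w * kernel_h * ceil(input_channel / channel_parallel) <= bank_depth")
    if output_channel > 256 * cp:
        broken.append("Out Size: output_channel <= 256 * channel_parallel")
    return broken


# Worked example: a 3x3, stride-1 conv2d with 512 input and 1024 output channels
# on the ZCU102/ZCU104 target: 3 * 3 * ceil(512 / 16) = 288 <= 2048 and
# 1024 <= 256 * 16 = 4096, so every checked rule is satisfied.
print(conv2d_violations("DPUCZDX8G_ISA0_B4096_MAX_BG2", 3, 3, 1, 1, 512, 1024))  # []
```

Table 1 also notes that InnerProduct, Fully Connected, and Matmul are transformed into conv2d. The short NumPy sketch below (again purely illustrative, not Vitis AI code) shows why that lowering is possible: a fully connected layer over C_in inputs is the same computation as a 1x1 conv2d applied to a 1x1 spatial feature map with C_in channels.

```python
# Purely illustrative NumPy sketch: a fully connected layer with c_in inputs and
# c_out outputs equals a 1x1 conv2d over a 1x1 spatial map with c_in channels.
import numpy as np

rng = np.random.default_rng(0)
c_in, c_out = 8, 4
x = rng.standard_normal(c_in)             # input vector of the FC layer
w = rng.standard_normal((c_out, c_in))    # FC weight matrix

fc_out = w @ x                            # fully connected output, shape (c_out,)

fmap = x.reshape(c_in, 1, 1)              # same input viewed as a C x 1 x 1 feature map
kernel = w.reshape(c_out, c_in, 1, 1)     # same weights viewed as 1x1 conv kernels
conv_out = np.einsum("oihw,ihw->o", kernel, fmap)  # 1x1 convolution over the 1x1 map

assert np.allclose(fc_out, conv_out)      # identical results
```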