VCK5000 Versal Development Card for AI Inference - 2.0 English

Vitis AI Library User Guide (UG1354)

Document ID: UG1354
Release Date: 2022-01-20
Version: 2.0 English

The VCK5000 Versal development card is built on the Xilinx 7nm Versal ACAP architecture and targets designs that require high-throughput AI inference and signal-processing compute performance. In this release, the DPU cores with batch=6 and batch=8 are implemented using AI Engines.

VCK5000 Performance with 6PE@350 MHz DPUCVDX8H-DWC

The following table lists the throughput performance (in frames/sec or fps) for various neural network samples on the Versal ACAP VCK5000 Gen3x16 with DPUCVDX8H-DWC running at 6PE@350 MHz.

Table 1. VCK5000 Performance with 6PE@350 MHz DPUCVDX8H-DWC
No. Neural Network Input Size GOPS DPU Frequency (MHz) Performance (fps, multiple threads)
1 densebox_320_320 320x320 0.49 350 4581.78
2 densebox_640_360 360x640 1.1 350 2339.43
3 drunet_pt 528x608 2.59 350 153.403
4 efficientNet-edgetpu-L_tf 300x300 19.36 350 427.568
5 efficientNet-edgetpu-M_tf 240x240 7.34 350 1100.45
6 efficientNet-edgetpu-S_tf 224x224 4.72 350 1858.11
7 ENet_cityscapes_pt 512x1024 8.6 350 141.943
8 face_landmark 96x72 0.14 350 22446.8
9 face-quality 80x60 0.06 350 31512.7
10 face-quality_pt 80x60 0.06 350 32034.8
11 facerec_resnet20 112x96 3.5 350 4498.68
12 facerec-resnet20_mixed_pt 112x96 3.5 350 4496.6
13 facerec_resnet64 112x96 11 350 2280.95
14 facereid-large_pt 96x96 0.5 350 21110
15 facereid-small_pt 80x80 0.09 350 33214.2
16 FairMot_pt 640x480 36 350 427.839
17 fpn 256x512 8.9 350 933.435
18 FPN_Res18_Medical_segmentation 320x320 45.3 350 426.927
19 FPN-resnet18_covid19-seg_pt 352x352 22.7 350 960.401
20 inception_resnet_v2_tf 299x299 26.4 350 492.177
21 inception_v1 224x224 3.2 350 3293.69
22 inception_v1_tf 224x224 3 350 3532.76
23 inception_v2 224x224 4 350 2613.47
24 inception_v2_tf 224x224 3.88 350 483.27
25 inception_v3 299x299 11.4 350 912.40
26 inception_v3_pt 299x299 5.7 350 910.16
27 inception_v3_tf 299x299 11.5 350 913.67
28 inception_v3_tf2 299x299 11.5 350 962.69
29 inception_v4 299x299 24.5 350 506.92
30 inception_v4_2016_09_09_tf 299x299 24.6 350 506.77
31 medical_seg_cell_tf2 128x128 5.3 350 1358.26
32 MLPerf_resnet50_v1.5_tf 224x224 8.19 350 3406.26
33 mlperf_ssd_resnet34_tf 1200x1200 433 350 72.2052
34 mobilenet_1_0_224_tf2 224x224 1.1 350 5222.53
35 mobilenet_edge_0_75_tf 224x224 0.62 350 4813.85
36 mobilenet_edge_1_0_tf 224x224 0.99 350 4298.57
37 mobilenet_v1_0_25_128_tf 128x128 0.027 350 22781.30
38 mobilenet_v1_0_5_160_tf 160x160 0.15 350 11945.00
39 mobilenet_v1_1_0_224_tf 224x224 1.1 350 5224.42
40 mobilenet_v2 224x224 0.6 350 3752.71
41 mobilenet_v2_1_0_224_tf 224x224 0.6 350 3638.07
42 mobilenet_v2_1_4_224_tf 224x224 1.2 350 2842.98
43 multi_task 288x512 14.8 350 660.98
44 ofa_depthwise_res50_pt 160x160 0.9 350 3378.06
45 ofa_resnet50_0_9B_pt 176x176 1.246 350 4050.59
46 openpose_pruned_0_3 368x368 49.9 350 132.81
47 personreid-res18_pt 176x80 1.1 350 8172.30
48 personreid-res50_pt 256x128 5.4 350 3750.77
49 person-orientation_pruned_558m_pt 176x80 0.558 350 9541.32
50 plate_detection 320x320 0.49 350 6612.75
51 plate_num 96x288 1.75 350 2759.17
52 pmg_pt 224x224 2.28 350 3477.35
53 refinedet_baseline 480x360 123 350 234.10
54 RefineDet-Medical_EDD_tf 320x320 9.8 350 1000.00
55 refinedet_pruned_0_8 360x480 25 350 513.16
56 refinedet_pruned_0_92 360x480 10.1 350 649.21
57 refinedet_pruned_0_96 360x480 5.1 350 687.19
58 refinedet_VOC_tf 320x320 81.9 350 307.83
59 reid 80x160 0.95 350 8422.35
60 resnet18 224x224 3.7 350 5185.26
61 resnet50 224x224 7.7 350 3738.35
62 resnet50_pt 224x224 4.1 350 3429.55
63 resnet50_tf2 224x224 7.7 350 3737.98
64 resnet_v1_101_tf 224x224 14.4 350 2244.88
65 resnet_v1_152_tf 224x224 21.8 350 1596.44
66 resnet_v1_50_tf 224x224 7 350 3739.72
67 retinaface 360x640 1.11 350 1627.01
68 salsanext_pt 64x2048 20.4 350 154.52
69 salsanext_v2_pt 64x2048 32 350 90.88
70 SemanticFPN_cityscapes_pt 256x512 10 350 1094.60
71 SemanticFPN_Mobilenetv2_pt 512x1024 5.4 350 216.55
72 semantic_seg_citys_tf2 512x1024 54 350 115.01
73 SESR_S_pt 360x640 7.48 350 185.58
74 sp_net 128x224 0.55 350 6753.90
75 squeezenet 227x227 0.76 350 7672.00
76 squeezenet_pt 224x224 0.82 350 7132.38
77 ssd_adas_pruned_0_95 360x480 6.3 350 714.24
78 ssd_inception_v2_coco_tf 300x300 9.6 350 240.00
79 ssdlite_mobilenet_v2_coco_tf 300x300 1.5 350 1482.77
80 ssd_mobilenet_v1_coco_tf 300x300 2.5 350 2338.31
81 ssd_mobilenet_v2 360x480 6.6 350 590.97
82 ssd_mobilenet_v2_coco_tf 300x300 3.8 350 1287.66
83 ssd_pedestrian_pruned_0_97 360x360 5.9 350 605.79
84 ssd_resnet_50_fpn_coco_tf 640x640 178.4 350 108.19
85 ssd_traffic_pruned_0_9 360x480 11.6 350 697.63
86 tiny_yolov3_vmss 416x416 5.46 350 1971.40
87 tsd_yolox_pt 640x640 73 350 263.49
88 ultrafast_pt 288x800 8.4 350 1061.32
89 unet_chaos-CT_pt 512x512 23.3 350 201.87
90 vgg_16_tf 224x224 31 350 505.57
91 vgg_19_tf 224x224 39.3 350 450.82
92 vpgnet_pruned_0_99 480x640 2.5 350 443.39
93 yolov2_voc 448x448 34 350 825.49
94 yolov2_voc_pruned_0_66 448x448 11.6 350 1208.89
95 yolov2_voc_pruned_0_71 448x448 9.9 350 1351.28
96 yolov2_voc_pruned_0_77 448x448 7.8 350 1327.28
97 yolov3_adas_pruned_0_9 256x512 5.5 350 1120.72
98 yolov3_bdd 288x512 53.7 350 320.06
99 yolov3_voc 416x416 65.4 350 391.59
100 yolov3_voc_tf 416x416 65.6 350 392.26
101 yolov4_leaky_spp_m 416x416 60.1 350 326.98
102 yolov4_leaky_spp_m_pruned_0_36 416x416 38.2 350 313.13
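
The fps figures in Table 1 can be turned into an approximate effective compute rate by multiplying each model's workload by its throughput. The following is a minimal Python sketch, not part of the Vitis AI Library. It assumes that the GOPS column gives the per-frame workload in giga-operations, and the helper names parse_row and effective_tops are hypothetical. The sample rows are copied from the table.

```python
# Sketch: estimate effective compute throughput from rows of Table 1.
# Assumption: the GOPS column is the per-frame workload in giga-operations.

def parse_row(line):
    """Split one flattened table row into named fields (hypothetical helper)."""
    no, name, size, gops, freq_mhz, fps = line.split()
    return {
        "no": int(no),
        "name": name,
        "input_size": size,
        "gops_per_frame": float(gops),
        "freq_mhz": float(freq_mhz),
        "fps": float(fps),
    }

def effective_tops(row):
    """Giga-operations per frame times frames per second, scaled to tera-ops/s."""
    return row["gops_per_frame"] * row["fps"] / 1000.0

# Rows copied verbatim from Table 1.
rows = [
    "61 resnet50 224x224 7.7 350 3738.35",
    "90 vgg_16_tf 224x224 31 350 505.57",
    "37 mobilenet_v1_0_25_128_tf 128x128 0.027 350 22781.30",
]

for line in rows:
    row = parse_row(line)
    print(f"{row['name']}: {effective_tops(row):.2f} TOPS")
```

Under that assumption, resnet50 works out to roughly 28.8 tera-operations per second, while the very small mobilenet_v1_0_25_128_tf stays below 1 despite its much higher frame rate. This suggests that small models are limited by something other than raw compute.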

VCK5000 Performance with 8PE@350 MHz DPUCVDX8H

The following table lists the throughput performance (in frames/sec or fps) for various neural network samples on the Versal ACAP VCK5000 Gen3x16 with DPUCVDX8H running at 8PE@350 MHz.

Table 2. VCK5000 Performance with 8PE@350 MHz DPUCVDX8H
No. Neural Network Input Size GOPS DPU Frequency (MHz) Performance (fps, multiple threads)
1 densebox_320_320 320x320 0.49 350 5972.1
2 densebox_640_360 360x640 1.1 350 3054.66
3 drunet_pt 528x608 2.59 350 204.256
4 ENet_cityscapes_pt 512x1024 8.6 350 147.573
5 face_landmark 96x72 0.14 350 20320.8
6 face-quality 80x60 0.06 350 31443.7
7 face-quality_pt 80x60 0.06 350 31639.6
8 FairMot_pt 640x480 36 350 500.229
9 fpn 256x512 8.9 350 1034.06
10 FPN_Res18_Medical_segmentation 320x320 45.3 350 554.919
11 FPN-resnet18_covid19-seg_pt 352x352 22.7 350 1176.98
12 inception_v1 224x224 3.2 350 3967.01
13 inception_v1_tf 224x224 3 350 4204.56
14 medical_seg_cell_tf2 128x128 5.3 350 1511.86
15 MLPerf_resnet50_v1.5_tf 224x224 8.19 350 4505.04
16 mlperf_ssd_resnet34_tf 1200x1200 433 350 75.9417
17 multi_task 288x512 14.8 350 715.768
18 openpose_pruned_0_3 368x368 49.9 350 170.05
19 plate_detection 320x320 0.49 350 8169.45
20 plate_num 96x288 1.75 350 2995.55
21 refinedet_baseline 480x360 123 350 285.417
22 RefineDet-Medical_EDD_tf 320x320 9.8 350 1272.02
23 refinedet_pruned_0_8 360x480 25 350 667.239
24 refinedet_pruned_0_92 360x480 10.1 350 854.083
25 refinedet_pruned_0_96 360x480 5.1 350 880.724
26 refinedet_VOC_tf 320x320 81.9 350 398.164
27 reid 80x160 0.95 350 10771.9
28 resnet18 224x224 3.7 350 6621.65
29 resnet50 224x224 7.7 350 4941.68
30 resnet50_pt 224x224 4.1 350 4529.6
31 resnet50_tf2 224x224 7.7 350 4941.54
32 resnet_v1_101_tf 224x224 14.4 350 2975
33 resnet_v1_152_tf 224x224 21.8 350 2120.41
34 resnet_v1_50_tf 224x224 7 350 4939.12
35 salsanext_pt 64x2048 20.4 350 158.533
36 salsanext_v2_pt 64x2048 32 350 87.6567
37 SemanticFPN_cityscapes_pt 256x512 10 350 1089.61
38 semantic_seg_citys_tf2 512x1024 54 350 118.018
39 SESR_S_pt 360x640 7.48 350 232.841
40 sp_net 128x224 0.55 350 8444.81
41 squeezenet 227x227 0.76 350 8768.06
42 squeezenet_pt 224x224 0.82 350 7579.4
43 ssd_adas_pruned_0_95 360x480 6.3 350 925.384
44 ssd_pedestrian_pruned_0_97 360x360 5.9 350 751.791
45 ssd_resnet_50_fpn_coco_tf 640x640 178.4 350 114.255
46 ssd_traffic_pruned_0_9 360x480 11.6 350 823.661
47 tiny_yolov3_vmss 416x416 5.46 350 2590.73
48 ultrafast_pt 288x800 8.4 350 1373.24
49 unet_chaos-CT_pt 512x512 23.3 350 246.539
50 vpgnet_pruned_0_99 480x640 2.5 350 619.569
51 yolov2_voc 448x448 34 350 961.635
52 yolov2_voc_pruned_0_66 448x448 11.6 350 1531.11
53 yolov2_voc_pruned_0_71 448x448 9.9 350 1764.26
54 yolov2_voc_pruned_0_77 448x448 7.8 350 1535.86
55 yolov3_adas_pruned_0_9 256x512 5.5 350 1441.31
56 yolov3_bdd 288x512 53.7 350 399.008
57 yolov3_voc 416x416 65.4 350 475.684
58 yolov3_voc_tf 416x416 65.6 350 477.083
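
For models that appear in both Table 1 and Table 2, the two configurations can be compared by dividing the 8PE throughput by the 6PE throughput. The following is a minimal Python sketch with fps values copied from the tables. The 8/6 "ideal" ratio assumes throughput scales linearly with PE count, which is an illustrative assumption only: the two tables also use different DPU variants (DPUCVDX8H-DWC and DPUCVDX8H), so the ratio does not isolate the effect of PE count.

```python
# Sketch: compare 8PE and 6PE throughput for models listed in both tables.
# fps values are copied from Table 1 (6PE) and Table 2 (8PE).
fps_6pe = {
    "resnet50": 3738.35,
    "resnet18": 5185.26,
    "yolov3_voc": 391.59,
    "face_landmark": 22446.8,
    "salsanext_v2_pt": 90.88,
}
fps_8pe = {
    "resnet50": 4941.68,
    "resnet18": 6621.65,
    "yolov3_voc": 475.684,
    "face_landmark": 20320.8,
    "salsanext_v2_pt": 87.6567,
}

# Assumption: linear scaling with PE count would give 8/6 (about 1.333x).
IDEAL = 8 / 6

def speedup(model):
    """Ratio of 8PE fps to 6PE fps for one model."""
    return fps_8pe[model] / fps_6pe[model]

for model in fps_6pe:
    s = speedup(model)
    print(f"{model}: {s:.3f}x (ideal {IDEAL:.3f}x, {s / IDEAL:.0%} of ideal)")
```

With these values, resnet50 comes close to the assumed linear ratio, while face_landmark and salsanext_v2_pt are slightly slower in the 8PE table. The ratio is therefore best checked per model rather than assumed.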