ZCU102 Evaluation Kit - 3.0 English

Vitis AI Library User Guide (UG1354)

Document ID: UG1354
Release Date: 2023-01-12
Version: 3.0 English

The ZCU102 evaluation kit uses the mid-range ZU9 UltraScale+™ device. There are two hardware versions of the ZCU102 evaluation kit: one whose serial number begins with 0432055-04 and one whose serial number begins with 0432055-05. The performance of the Vitis AI Library differs between the two versions because their DDR memory performance differs. Because the 0432055-04 version of the ZCU102 has been discontinued, the following table shows performance only for the ZCU102 (0432055-05) evaluation kit. On the ZCU102 evaluation kit, three B4096 DPU cores are implemented in the programmable logic and deliver 3.45 TOPS of peak INT8 performance for deep learning inference acceleration.

The following table lists the throughput (in frames per second, fps) of various neural network samples on the ZCU102 (0432055-05) with the DPU running at 281 MHz.
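
The 3.45 TOPS peak figure can be cross-checked against the 281 MHz DPU clock. The sketch below assumes that each B4096 core performs 4096 INT8 operations per clock cycle, which is the usual reading of the B4096 name but is not stated in this section; the function name `peak_tops` is illustrative only.

```python
def peak_tops(num_cores: int, ops_per_cycle: int, clock_hz: float) -> float:
    """Return the theoretical peak throughput in tera-operations per second."""
    return num_cores * ops_per_cycle * clock_hz / 1e12


# Three B4096 cores at 281 MHz.
# Assumption: 4096 INT8 operations per cycle per core.
zcu102_peak = peak_tops(num_cores=3, ops_per_cycle=4096, clock_hz=281e6)
print(f"{zcu102_peak:.2f} TOPS")  # prints "3.45 TOPS"
```

This is a theoretical upper bound; the measured throughput of each network in the table depends on the model and on memory bandwidth.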

Note: The DPU on the ZCU102 has a hardware softmax acceleration module. Because of limitations in the hardware softmax module, the software softmax is faster when the number of categories reaches 1000. To enable the software softmax (softmax_c), set XLNX_ENABLE_C_SOFTMAX=1. The default value of XLNX_ENABLE_C_SOFTMAX is 0, in which case the softmax method is selected in the following order of priority:
  1. Neon Acceleration
  2. Hardware Softmax
  3. Software Softmax_c
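
The selection order in the note can be sketched as a small decision function. This is an illustration only, not the library's implementation: the function name and the two availability flags are hypothetical, and the conditions under which the library treats Neon or the hardware module as usable are not described here.

```python
import os


def select_softmax(neon_available: bool, hw_softmax_available: bool, env=None) -> str:
    """Pick a softmax backend following the priority order in the note.

    The availability flags are hypothetical inputs; only the environment
    variable name and the priority order come from the text.
    """
    env = os.environ if env is None else env
    # XLNX_ENABLE_C_SOFTMAX=1 selects the software softmax (softmax_c).
    if env.get("XLNX_ENABLE_C_SOFTMAX", "0") == "1":
        return "softmax_c"
    # Default (0): Neon first, then hardware softmax, then softmax_c.
    if neon_available:
        return "neon"
    if hw_softmax_available:
        return "hardware"
    return "softmax_c"


print(select_softmax(True, True, env={"XLNX_ENABLE_C_SOFTMAX": "1"}))
```

Passing `env` explicitly keeps the sketch easy to exercise without changing the process environment.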

On the ZCU102, use the following command to test classification performance:

env XLNX_ENABLE_C_SOFTMAX=1 ./test_performance_classification resnet50 test_performance_classification.list -t 8 -s 60

Table 1. ZCU102 (0432055-05) Performance

| No. | Neural Network | Input Size | GOPS | Performance (fps), single thread | Performance (fps), multiple threads |
|---|---|---|---|---|---|
| 1 | bcc_pt | 800x1000 | 268.9 | 3.3 | 10.9 |
| 2 | bevdet | 256x704 | 407.6 | 1.6 | 5.6 |
| 3 | c2d2_lite | 512x512 | 6.86 | 2.9 | 5.7 |
| 4 | centerpoint | 2560x40x4 | 54 | 16 | 48.2 |
| 5 | cflownet_pt | 128x128 | 5.21 | 62.5 | 187.7 |
| 6 | chen_color_resnet18_pt | 224x224 | 3.627 | 209.2 | 561.8 |
| 7 | clocs | 12000x100x4 | 41 | 2.9 | 10.5 |
| 8 | drunet_pt | 528x608 | 2.59 | 60.7 | 190.6 |
| 9 | efficientdet_d2_tf | 768x768 | 11.06 | 3.5 | 6.8 |
| 10 | efficientnet_lite_tf2 | 224x224 | 0.77 | 202.3 | 579.1 |
| 11 | efficientnet-b0_tf2 | 224x224 | 0.36 | 77.9 | 156.5 |
| 12 | efficientNet-edgetpu-L_tf | 300x300 | 19.36 | 35 | 91.5 |
| 13 | efficientNet-edgetpu-M_tf | 240x240 | 7.34 | 80 | 210.7 |
| 14 | efficientNet-edgetpu-S_tf | 224x224 | 4.72 | 115.1 | 319 |
| 15 | ENet_cityscapes_pt | 512x1024 | 8.6 | 10.1 | 37.7 |
| 16 | face_mask_detection_pt | 512x512 | 0.593 | 115.3 | 399 |
| 17 | face-quality_pt | 80x60 | 0.06 | 2886 | 8970.2 |
| 18 | facerec-resnet20_mixed_pt | 112x96 | 3.5 | 168.8 | 342.9 |
| 19 | facereid-large_pt | 96x96 | 0.5 | 1076.1 | 2873.8 |
| 20 | facereid-small_pt | 80x80 | 0.09 | 2383.5 | 7317.3 |
| 21 | fadnet | 576x960 | 441 | 1.2 | 1.7 |
| 22 | fadnet_pruned | 576x960 | 154 | 1.8 | 2.7 |
| 23 | fadnet_v2_pt | 576x960 | 412 | 1.5 | 2.4 |
| 24 | fadnet_v2_pruned_pt | 576x960 | 201 | 2.3 | 4.5 |
| 25 | FairMot_pt | 640x480 | 36 | 22.7 | 67.1 |
| 26 | FPN-resnet18_covid19-seg_pt | 352x352 | 22.7 | 37.2 | 108.6 |
| 27 | HardNet_MSeg_pt | 352x352 | 22.78 | 24.7 | 57.8 |
| 28 | hfnet_tf | 960x960 | 20.09 | 3.6 | 16.4 |
| 29 | HRNet_pt | 1024x2048 | 1511.9 | 0.6 | 0.6 |
| 30 | inception_resnet_v2_tf | 299x299 | 26.4 | 24 | 54.4 |
| 31 | inception_v1_tf | 224x224 | 3 | 192.5 | 503.6 |
| 32 | inception_v2_tf | 224x224 | 3.88 | 93.5 | 247.6 |
| 33 | inception_v3_pt | 299x299 | 5.7 | 60.5 | 146.8 |
| 34 | inception_v3_tf | 299x299 | 11.5 | 60.6 | 147 |
| 35 | inception_v3_tf2 | 299x299 | 11.5 | 59.9 | 147.5 |
| 36 | inception_v4_2016_09_09_tf | 299x299 | 24.6 | 29 | 71.3 |
| 37 | medical_seg_cell_tf2 | 128x128 | 5.3 | 156.7 | 437.1 |
| 38 | MLPerf_resnet50_v1.5_tf | 224x224 | 8.19 | 80.5 | 190.3 |
| 39 | mlperf_ssd_resnet34_tf | 1200x1200 | 433 | 1.9 | 7.1 |
| 40 | mobilenet_1_0_224_tf2 | 224x224 | 1.1 | 325.1 | 1020.7 |
| 41 | mobilenet_edge_0_75_tf | 224x224 | 0.62 | 262.7 | 735.4 |
| 42 | mobilenet_edge_1_0_tf | 224x224 | 0.99 | 217 | 597.7 |
| 43 | mobilenet_v1_0_25_128_tf | 128x128 | 0.027 | 1299.3 | 4788.9 |
| 44 | mobilenet_v1_0_5_160_tf | 160x160 | 0.15 | 903 | 3439.4 |
| 45 | mobilenet_v1_1_0_224_tf | 224x224 | 1.1 | 330.4 | 1038.1 |
| 46 | mobilenet_v2_1_0_224_tf | 224x224 | 0.6 | 270 | 764.5 |
| 47 | mobilenet_v2_1_4_224_tf | 224x224 | 1.2 | 191.4 | 504.1 |
| 48 | mobilenet_v2_cityscapes_tf | 1024x2048 | 132.74 | 1.7 | 5.3 |
| 49 | mobilenet_v3_small_1_0_tf2 | 224x224 | 0.132 | 343.9 | 1074.9 |
| 50 | monodepth2_pt | 192x640 | 257.21 | 43.5 | 121.9 |
| 51 | movenet_ntd_pt | 192x192 | 0.5 | 94.1 | 391.9 |
| 52 | MT-resnet18_mixed_pt | 512x320 | 13.65 | 32.9 | 109.2 |
| 53 | multi_task_v3_pt | 320x512 | 25.44 | 17.1 | 64.5 |
| 54 | ocr_pt | 960x960 | 875.7 | 1.1 | 3.4 |
| 55 | ofa_depthwise_res50_pt | 176x176 | 1.25 | 106.2 | 379.4 |
| 56 | ofa_rcan_latency_pt | 360x640 | 45.7 | 17 | 28.1 |
| 57 | ofa_resnet50_0_9B_pt | 160x160 | 1.8 | 185.4 | 370.8 |
| 58 | ofa_yolo_pruned_0_30_pt | 640x640 | 34.71 | 21.5 | 56 |
| 59 | ofa_yolo_pruned_0_50_pt | 640x640 | 24.62 | 27.7 | 72.6 |
| 60 | ofa_yolo_pt | 640x640 | 48.88 | 16.9 | 44 |
| 61 | person-orientation_pruned_558m_pt | 224x112 | 0.558 | 776.5 | 1625.2 |
| 62 | personreid-res18_pt | 176x80 | 1.1 | 429.7 | 850.3 |
| 63 | personreid-res50_pt | 256x128 | 5.3 | 99.1 | 264.2 |
| 64 | pmg_pt | 224x224 | 2.28 | 155.9 | 401.3 |
| 65 | pointpainting | 40000x64x16 | 112 | 1.3 | 4.3 |
| 66 | pointpillars_kitti_12000_pt | 12000x100x4 | 10.8 | 19.7 | 51.7 |
| 67 | pointpillars_nuscenes | 40000x64x5 | 108 | 2.2 | 9.7 |
| 68 | rcan_pruned_tf | 360x640 | 86.95 | 9.3 | 19.3 |
| 69 | refinedet_VOC_tf | 320x320 | 81.9 | 11.3 | 35.4 |
| 70 | RefineDet-Medical_EDD_tf | 320x320 | 9.8 | 67.5 | 229.9 |
| 71 | resnet_v1_101_tf | 224x224 | 14.4 | 47.1 | 122.6 |
| 72 | resnet_v1_152_tf | 224x224 | 21.8 | 32 | 84.7 |
| 73 | resnet_v1_50_tf | 224x224 | 7 | 90.3 | 214.1 |
| 74 | resnet_v2_101_tf | 299x299 | 26.78 | 23.6 | 57.1 |
| 75 | resnet_v2_152_tf | 299x299 | 40.47 | 16.1 | 38.9 |
| 76 | resnet_v2_50_tf | 299x299 | 13.1 | 45.1 | 102.5 |
| 77 | resnet50_pt | 224x224 | 4.1 | 79.9 | 189.8 |
| 78 | resnet50_tf2 | 224x224 | 7.7 | 89.2 | 213.6 |
| 79 | SA_gate_base_pt | 360x360 | 178 | 3.3 | 9.7 |
| 80 | salsanext_pt | 64x2048 | 20.4 | 9.5 | 42.5 |
| 81 | salsanext_v2_pt | 64x2048 | 32 | 6 | 14.4 |
| 82 | semantic_seg_citys_tf2 | 512x1024 | 54 | 7.4 | 24.5 |
| 83 | SemanticFPN_cityscapes_pt | 256x512 | 10 | 35.5 | 172.9 |
| 84 | SemanticFPN_Mobilenetv2_pt | 512x1024 | 5.4 | 10.5 | 54.6 |
| 85 | SESR_S_pt | 360x640 | 7.48 | 88 | 142.2 |
| 86 | solo_pt | 640x640 | 107 | 1.4 | 5 |
| 87 | squeezenet_pt | 224x224 | 0.82 | 572.1 | 1527.7 |
| 88 | ssd_inception_v2_coco_tf | 300x300 | 9.6 | 39.8 | 108.5 |
| 89 | ssd_mobilenet_v1_coco_tf | 300x300 | 2.5 | 111.9 | 365.1 |
| 90 | ssd_mobilenet_v2_coco_tf | 300x300 | 3.8 | 82.1 | 224 |
| 91 | ssd_resnet_50_fpn_coco_tf | 640x640 | 178.4 | 2.9 | 5.3 |
| 92 | ssdlite_mobilenet_v2_coco_tf | 300x300 | 1.5 | 105.9 | 328.3 |
| 93 | ssr_pt | 256x256 | 39.72 | 6 | 15.1 |
| 94 | superpoint_tf | 480x640 | 52.4 | 12.5 | 53.3 |
| 95 | textmountain_pt | 960x960 | 575.2 | 1.7 | 4.7 |
| 96 | tsd_yolox_pt | 640x640 | 73 | 13.1 | 34.4 |
| 97 | ultrafast_pt | 288x800 | 8.4 | 36.2 | 102.5 |
| 98 | unet_chaos-CT_pt | 512x512 | 23.3 | 22.8 | 70.2 |
| 99 | vehicle_make_resnet18_pt | 224x224 | 3.627 | 211.7 | 560.9 |
| 100 | vehicle_type_resnet18_pt | 224x224 | 3.627 | 212.2 | 561 |
| 101 | vgg_16_tf | 224x224 | 31 | 20.5 | 43.4 |
| 102 | vgg_19_tf | 224x224 | 39.3 | 17.6 | 38.9 |
| 103 | xilinxSR_pt | 360x640x3 | 182.44 | 2.3 | 2.3 |
| 104 | yolov3_coco_416_tf2 | 416x416 | 65.9 | 13.3 | 37.5 |
| 105 | yolov3_voc_tf | 416x416 | 65.6 | 13.6 | 37.8 |
| 106 | yolov4_csp_pt | 640x640 | 121 | 7.4 | 20.2 |
| 107 | yolov4_leaky_416_tf | 416x416 | 60.3 | 13.5 | 35.8 |
| 108 | yolov4_leaky_512_tf | 512x512 | 91.2 | 10.2 | 26.4 |
| 109 | yolov5_large_pt | 640x640 | 109.6 | 8.7 | 24 |
| 110 | yolov5_nano_pt | 640x640 | 4.6 | 73.2 | 201.6 |
| 111 | yolov5s6_pt | 640x640 | 17 | 10.7 | 26 |
| 112 | yolov6m_pt | 640x640 | 82.2 | 6.2 | 27.4 |
| 113 | yolox_nano_pt | 416x416x3 | 1 | 185.9 | 539.6 |
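
The ratio between the multi-thread and single-thread columns varies widely from model to model. As a minimal sketch, the snippet below computes that ratio for a few rows copied from Table 1; the helper name `scaling` is illustrative, and the selection of rows is arbitrary.

```python
# (network, single-thread fps, multi-thread fps), copied from Table 1.
rows = [
    ("resnet50_pt", 79.9, 189.8),
    ("HRNet_pt", 0.6, 0.6),
    ("face-quality_pt", 2886.0, 8970.2),
    ("yolov5_nano_pt", 73.2, 201.6),
]


def scaling(single_fps: float, multi_fps: float) -> float:
    """Return how many times faster the multi-thread run is than the single-thread run."""
    return multi_fps / single_fps


for name, single, multi in rows:
    print(f"{name:18s} {scaling(single, multi):.2f}x")
```

A ratio close to 1, as for HRNet_pt, means that adding threads brought no measurable gain for that model in this benchmark.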