VCK5000 Versal Development Card for AI Inference - 2.5 Simplified Chinese

Vitis AI Library User Guide (UG1354)

Document ID: UG1354
Release Date: 2022-06-15
Version: 2.5 Simplified Chinese

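The VCK5000 development card is built on the Xilinx 7 nm Versal ACAP architecture for designs that require high-throughput AI inference and high-performance signal-processing compute. This release uses the AI Engines to implement DPU cores for batch=4, batch=6, and batch=8. In theory, each processing engine delivers 10 TOPS of peak INT8 performance for deep learning inference; after the efficiency loss of the AI Engine compiler, it delivers about 7.5 TOPS of peak INT8 performance.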
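The per-engine figures above can be turned into a rough aggregate estimate. The sketch below is illustrative only: it assumes that each batch lane corresponds to one processing engine (as the 4PE, 6PE, and 8PE table titles suggest) and that peak performance adds up linearly across engines, which the text does not state. The function name `aggregate_tops` is hypothetical.

```python
# Per-engine figures quoted in the text above.
PEAK_TOPS_PER_PE = 10.0       # theoretical INT8 peak per processing engine
EFFECTIVE_TOPS_PER_PE = 7.5   # approximate peak after AI Engine compiler efficiency loss


def aggregate_tops(num_pe, tops_per_pe):
    """Aggregate peak for num_pe engines, ASSUMING linear scaling across engines."""
    return num_pe * tops_per_pe


# Fraction of the theoretical peak that remains after the compiler efficiency loss.
compiler_efficiency = EFFECTIVE_TOPS_PER_PE / PEAK_TOPS_PER_PE

for num_pe in (4, 6, 8):  # batch=4, batch=6, batch=8 configurations
    theoretical = aggregate_tops(num_pe, PEAK_TOPS_PER_PE)
    effective = aggregate_tops(num_pe, EFFECTIVE_TOPS_PER_PE)
    print(f"{num_pe}PE: {theoretical:.1f} TOPS theoretical, "
          f"about {effective:.1f} TOPS after compiler loss "
          f"({compiler_efficiency:.0%} of theoretical)")
```

These are upper bounds under the stated assumptions; the measured throughput in the tables that follow depends on the network and is typically well below them.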

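VCK5000 Performance with 4PE 350 MHz DPUCVDX8H

The following table lists the throughput performance (in frames per second, or fps) of various neural network samples on the Versal ACAP VCK5000 Gen3x16 with DPUCVDX8H running at 4PE@350 MHz.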

Table 1. VCK5000 Performance with 4PE 350 MHz DPUCVDX8H
No. Neural Network Input Size GOPS DPU Frequency (MHz) Performance (fps) (Multithreaded)
1 bcc_pt 800x1000 268.9 350 49.0175
2 centerpoint_0_pt, centerpoint_1_pt 2560x40x4 54 350 146.98
3 densebox_320_320 320x320 0.49 350 3083.38
4 densebox_640_360 360x640 1.1 350 1580.07
5 efficientnet-b0_tf2 224x224 0.36 350 426.838
6 efficientNet-edgetpu-L_tf 300x300 19.36 350 393.635
7 efficientNet-edgetpu-M_tf 240x240 7.34 350 974.597
8 efficientNet-edgetpu-S_tf 224x224 4.72 350 1663.81
9 ENet_cityscapes_pt 512x1024 8.6 350 67.3528
10 face_landmark 96x72 0.14 350 8685.99
11 face-quality 80x60 0.06 350 16646.7
12 face-quality_pt 80x60 0.06 350 16649.7
13 facerec_resnet20 112x96 3.5 350 2956.24
14 facerec_resnet64 112x96 11 350 1509.6
15 facerec-resnet20_mixed_pt 112x96 3.5 350 2954.23
16 facereid-large_pt 96x96 0.5 350 11815.2
17 facereid-small_pt 80x80 0.09 350 17826.5
18 fpn 256x512 8.9 350 393.148
19 FPN_Res18_Medical_segmentation 320x320 45.3 350 153.265
20 FPN-resnet18_covid19-seg_pt 352x352 22.7 350 612.502
21 FPN-resnet18_Endov 240x320 13.75 350 272.666
22 hourglass-pe_mpii 256x256 10.2 350 518.868
23 inception_resnet_v2_tf 299x299 26.4 350 307.795
24 inception_v1 224x224 3.2 350 1507.29
25 inception_v1_tf 224x224 3 350 1585.16
26 inception_v2 224x224 4 350 1254.56
27 inception_v2_tf 224x224 3.88 350 210.043
28 inception_v3 299x299 11.4 350 499.167
29 inception_v3_pt 299x299 5.7 350 494.729
30 inception_v3_tf 299x299 11.5 350 496.163
31 inception_v3_tf2 299x299 11.5 350 490.31
32 inception_v4 299x299 24.5 350 283.49
33 inception_v4_2016_09_09_tf 299x299 24.6 350 280.194
34 medical_seg_cell_tf2 128x128 5.3 350 914.61
35 MLPerf_resnet50_v1.5_tf 224x224 8.19 350 1638.84
36 mlperf_ssd_resnet34_tf 1200x1200 433 350 49.4214
37 mobilenet_1_0_224_tf2 224x224 1.1 350 5864.58
38 mobilenet_edge_0_75_tf 224x224 0.62 350 3344.23
39 mobilenet_edge_1_0_tf 224x224 0.99 350 3101.21
40 mobilenet_v1_0_25_128_tf 128x128 0.027 350 19353.8
41 mobilenet_v1_0_5_160_tf 160x160 0.15 350 13349.2
42 mobilenet_v1_1_0_224_tf 224x224 1.1 350 6561.03
43 mobilenet_v2 224x224 0.6 350 4302.19
44 mobilenet_v2_1_0_224_tf 224x224 0.6 350 4058.7
45 mobilenet_v2_1_4_224_tf 224x224 1.2 350 3137.54
46 mobilenet_v2_cityscapes_tf 1024x2048 132.74 350 16.2875
47 MT-resnet18_mixed_pt 512x320 13.65 350 279.933
48 multi_task 288x512 14.8 350 451.49
49 multi_task_v3_pt 320x512 25.44 350 136.757
50 openpose_pruned_0_3 368x368 49.9 350 87.143
51 personreid-res18_pt 176x80 1.1 350 4782.06
52 personreid-res50_pt 256x128 5.4 350 2012.23
53 plate_detection 320x320 0.49 350 4547.23
54 plate_num 96x288 1.75 350 1541.73
55 pmg_pt 224x224 2.28 350 2000.55
56 pointpainting_nuscenes_pt 40000x64x16 112 350 20.378
57 pointpillars_kitti_pt 12000x100x4 10.8 350 3.12967
58 pointpillars_nuscenes_pt 40000x64x5 108 350 39.0829
59 rcan_pruned_tf 360x640 86.95 350 35.2752
60 refinedet_baseline 480x360 123 350 153.091
61 refinedet_pruned_0_8 360x480 25 350 298.829
62 refinedet_pruned_0_92 360x480 10.1 350 361.42
63 refinedet_pruned_0_96 360x480 5.1 350 376.573
64 refinedet_VOC_tf 320x320 81.9 350 200.903
65 RefineDet-Medical_EDD_tf 320x320 9.8 350 555.168
66 reid 80x160 0.95 350 5045.43
67 resnet_v1_101_tf 224x224 14.4 350 1146.44
68 resnet_v1_152_tf 224x224 21.8 350 825.95
69 resnet_v1_50_tf 224x224 7 350 1817.21
70 resnet_v2_101_tf 299x299 26.78 350 245.955
71 resnet_v2_152_tf 299x299 40.47 350 172.526
72 resnet_v2_50_tf 299x299 13.1 350 427.177
73 resnet18 224x224 3.7 350 2727.74
74 resnet50 224x224 7.7 350 1817.78
75 resnet50_pt 224x224 4.1 350 1675.05
76 resnet50_tf2 224x224 7.7 350 1533.14
77 retinaface 360x640 1.11 350 1621.49
78 SA_gate_base_pt 360x360 187 350 11.2804
79 salsanext_pt 64x2048 20.4 350 108.731
80 salsanext_v2_pt 64x2048 32 350 60.4578
81 semantic_seg_citys_tf2 512x1024 54 350 63.7
82 SemanticFPN_cityscapes_pt 256x512 10 350 730.999
83 SemanticFPN_Mobilenetv2_pt 512x1024 5.4 350 207.713
84 sp_net 128x224 0.55 350 3115.87
85 squeezenet 227x227 0.76 350 3457.99
86 squeezenet_pt 224x224 0.82 350 3375.85
87 ssd_adas_pruned_0_95 360x480 6.3 350 386.074
88 ssd_inception_v2_coco_tf 300x300 9.6 350 105.8
89 ssd_mobilenet_v1_coco_tf 300x300 2.5 350 2286.02
90 ssd_mobilenet_v2 360x480 6.6 350 638.548
91 ssd_mobilenet_v2_coco_tf 300x300 3.8 350 1291.5
92 ssd_pedestrian_pruned_0_97 360x640 5.9 350 317.014
93 ssd_resnet_50_fpn_coco_tf 640x640 178.4 350 80.7027
94 ssd_traffic_pruned_0_9 360x480 11.6 350 379.248
95 ssdlite_mobilenet_v2_coco_tf 300x300 1.5 350 1875.26
96 tiny_yolov3_vmss 416x416 5.46 350 557.548
97 unet_chaos-CT_pt 512x512 23.3 350 128.55
98 vgg_16_tf 224x224 31 350 328.507
99 vgg_19_tf 224x224 39.3 350 293.448
100 vpgnet_pruned_0_99 480x640 2.5 350 232.013
101 yolov2_voc 448x448 34 350 332.582
102 yolov2_voc_pruned_0_66 448x448 11.6 350 411.454
103 yolov2_voc_pruned_0_71 448x448 9.9 350 431.613
104 yolov2_voc_pruned_0_77 448x448 7.8 350 437.692
105 yolov3_adas_pruned_0_9 256x512 5.5 350 723.208
106 yolov3_bdd 288x512 53.7 350 228.916
107 yolov3_voc 416x416 65.4 350 279.206
108 yolov3_voc_tf 416x416 65.6 350 279.332
109 yolov4_leaky_spp_m 416x416 60.1 350 242.952
110 yolov4_leaky_spp_m_pruned_0_36 416x416 38.2 350 234.6
111 ultrafast_pt 288x800 8.4 350 592.002
112 ocr_pt 960x960 875.7 350 24.2049
113 HardNet_MSeg_pt 352x352 22.78 350 158.085
114 drunet_pt 528x608 2.59 350 94.1445
115 person-orientation_pruned_558m_pt 224x112 0.558 350 5444.55
116 ofa_resnet50_0_9B_pt 160x160 0.9 350 2264.41
117 SESR_S_pt 360x640 7.48 350 113.793
118 c2d2_lite 512x512 6.86 350 22.4765
119 ofa_depthwise_res50_pt 176x176 1.25 350 2948.67
120 FairMot_pt 640x480 36 350 277.874
121 mobilenet_v3_small_1_0_tf2 224x224 0.132 350 1963.18
122 clocs 12000x100x4 41 350 11.7131
123 tsd_yolox_pt 640x640 73 350 192.576
124 fadnet_pruned 576x960 154 350 10.0451
125 ssr_pt 256x256 39.72 350 70.7821
126 fadnet 576x960 441 350 9.60453
127 solo_pt 640x640 107 350 24.7739
128 chen_color_resnet18_pt 224x224 3.627 350 2839.65
129 face_mask_detection_pt 512x512 0.593 350 770.836
130 ofa_rcan_latency_pt 360x640 45.7 350 33.3203
131 textmountain_pt 960x960 575.2 350 30.8099
132 vehicle_make_resnet18_pt 224x224 3.627 350 2830.61
133 vehicle_type_resnet18_pt 224x224 3.627 350 2840.23
134 ofa_yolo_pt 640x640 48.88 350 220.236
135 ofa_yolo_pruned_0_30_pt 640x640 34.71 350 276.05
136 ofa_yolo_pruned_0_50_pt 640x640 24.62 350 318.732
137 yolov3-coco_tf2 416x416 65.9 350 276.699
138 movenet_ntd_pt 192x192 0.5 350 2750.94
139 yolov4_416_tf 416x416 60.3 350 240.098
140 yolov4_512_tf 512x512 91.2 350 137.829
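
One way to read Table 1 is to convert each row into a sustained compute rate. The sketch below assumes that the GOPS column gives the workload of a single frame in giga-operations (the table does not define it), so that multiplying it by the measured fps yields operations per second. The sample rows are copied from Table 1; the function name `effective_tops` is hypothetical.

```python
# Rows copied from Table 1 (4PE @ 350 MHz): (model, GOPS column, fps).
TABLE1_SAMPLE = [
    ("resnet50", 7.7, 1817.78),
    ("vgg_16_tf", 31.0, 328.507),
    ("mobilenet_v2", 0.6, 4302.19),
]


def effective_tops(gops_per_frame, fps):
    """Sustained tera-operations per second implied by one table row.

    ASSUMPTION: the GOPS column is the per-frame workload in giga-operations.
    """
    return gops_per_frame * fps / 1000.0


for model, gops, fps in TABLE1_SAMPLE:
    print(f"{model:14s} {effective_tops(gops, fps):6.2f} TOPS sustained")
```

Under that assumption, compute-heavy networks such as resnet50 sustain a much higher rate than lightweight ones such as mobilenet_v2, even though the latter reports far more frames per second.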

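VCK5000 Performance with 6PE 350 MHz DPUCVDX8H-aieDWC

The following table lists the throughput performance (in frames per second, or fps) of various neural network samples on the Versal ACAP VCK5000 Gen3x16 with DPUCVDX8H-aieDWC running at 6PE@350 MHz.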

Table 2. VCK5000 Performance with 6PE 350 MHz DPUCVDX8H-aieDWC
No. Neural Network Input Size GOPS DPU Frequency (MHz) Performance (fps) (Multithreaded)
1 bcc_pt 800x1000 268.9 350 79.0952
2 c2d2_lite 512x512 6.86 350 36.7298
3 densebox_320_320 320x320 0.49 350 4588.44
4 densebox_640_360 360x640 1.1 350 2346.24
5 efficientNet-edgetpu-M_tf 240x240 7.34 350 1377.58
6 efficientNet-edgetpu-S_tf 224x224 4.72 350 2488.34
7 ENet_cityscapes_pt 512x1024 8.6 350 141.03
8 face_landmark 96x72 0.14 350 22440.6
9 face-quality 80x60 0.06 350 31413
10 face-quality_pt 80x60 0.06 350 30992.5
11 facerec_resnet20 112x96 3.5 350 4409.6
12 facerec_resnet64 112x96 11 350 2261.3
13 facerec-resnet20_mixed_pt 112x96 3.5 350 4411.13
14 facereid-large_pt 96x96 0.5 350 20918.3
15 facereid-small_pt 80x80 0.09 350 32349.6
16 fpn 256x512 8.9 350 949.618
17 FPN_Res18_Medical_segmentation 320x320 45.3 350 426.442
18 FPN-resnet18_covid19-seg_pt 352x352 22.7 350 958.147
19 FPN-resnet18_Endov 240x320 13.75 350 399.36
20 hourglass-pe_mpii 256x256 10.2 350 727.146
21 inception_resnet_v2_tf 299x299 26.4 350 491.966
22 inception_v1 224x224 3.2 350 3264.81
23 inception_v1_tf 224x224 3 350 3500.3
24 inception_v2 224x224 4 350 2598.33
25 inception_v2_tf 224x224 3.88 350 330.521
26 inception_v3 299x299 11.4 350 911.529
27 inception_v3_pt 299x299 5.7 350 909.003
28 inception_v3_tf 299x299 11.5 350 914.349
29 inception_v3_tf2 299x299 11.5 350 963.511
30 inception_v4 299x299 24.5 350 504.202
31 inception_v4_2016_09_09_tf 299x299 24.6 350 503.536
32 medical_seg_cell_tf2 128x128 5.3 350 1352.53
33 MLPerf_resnet50_v1.5_tf 224x224 8.19 350 3403.95
34 mlperf_ssd_resnet34_tf 1200x1200 433 350 73.2746
35 mobilenet_1_0_224_tf2 224x224 1.1 350 7820.25
36 mobilenet_edge_0_75_tf 224x224 0.62 350 5689.7
37 mobilenet_edge_1_0_tf 224x224 0.99 350 5176.39
38 mobilenet_v1_0_25_128_tf 128x128 0.027 350 20510
39 mobilenet_v1_0_5_160_tf 160x160 0.15 350 14825.3
40 mobilenet_v1_1_0_224_tf 224x224 1.1 350 8058.3
41 mobilenet_v2 224x224 0.6 350 6928.44
42 mobilenet_v2_1_0_224_tf 224x224 0.6 350 6504.78
43 mobilenet_v2_1_4_224_tf 224x224 1.2 350 4971.92
44 mobilenet_v2_cityscapes_tf 1024x2048 132.7 350 18.4175
45 MT-resnet18_mixed_pt 512x320 13.65 350 421.09
46 multi_task 288x512 14.8 350 659.953
47 multi_task_v3_pt 320x512 25.44 350 203.766
48 ofa_depthwise_res50_pt 176x176 1.25 350 3237.38
49 ofa_yolo_pruned_0_30_pt 640x640 34.71 350 376.219
50 ofa_yolo_pruned_0_50_pt 640x640 24.62 350 441.708
51 openpose_pruned_0_3 368x368 49.9 350 132.641
52 personreid-res18_pt 176x80 1.1 350 8047.31
53 personreid-res50_pt 256x128 5.4 350 3750.97
54 plate_detection 320x320 0.49 350 6612.51
55 plate_num 96x288 1.75 350 2669.07
56 pmg_pt 224x224 2.28 350 3423.98
57 pointpainting_nuscenes_pt 40000x64x16 112 350 18.0945
58 pointpillars_nuscenes_pt 40000x64x5 108 350 37.102
59 rcan_pruned_tf 360x640 86.95 350 53.4131
60 refinedet_baseline 480x360 123 350 233.959
61 refinedet_pruned_0_8 360x480 25 350 513.399
62 refinedet_pruned_0_92 360x480 10.1 350 648.891
63 refinedet_pruned_0_96 360x480 5.1 350 686.944
64 refinedet_VOC_tf 320x320 81.9 350 307.637
65 RefineDet-Medical_EDD_tf 320x320 9.8 350 998.943
66 reid 80x160 0.95 350 8310.57
67 resnet_v1_101_tf 224x224 14.4 350 2244.46
68 resnet_v1_152_tf 224x224 21.8 350 1598.02
69 resnet_v1_50_tf 224x224 7 350 3736.48
70 resnet18 224x224 3.7 350 5134.89
71 resnet50 224x224 7.7 350 3738.38
72 resnet50_pt 224x224 4.1 350 3432.18
73 resnet50_tf2 224x224 7.7 350 3152.91
74 retinaface 360x640 1.11 350 1929.07
75 salsanext_pt 64x2048 20.4 350 158.088
76 salsanext_v2_pt 64x2048 32 350 91.2235
77 semantic_seg_citys_tf2 512x1024 54 350 114.931
78 SemanticFPN_cityscapes_pt 256x512 10 350 1049.65
79 SemanticFPN_Mobilenetv2_pt 512x1024 5.4 350 234.421
80 SESR_S_pt 360x640 7.48 350 185.874
81 sp_net 128x224 0.55 350 6726.33
82 squeezenet 227x227 0.76 350 7476.08
83 squeezenet_pt 224x224 0.82 350 7062.16
84 ssd_adas_pruned_0_95 360x480 6.3 350 706.973
85 ssd_inception_v2_coco_tf 300x300 9.6 350 166.214
86 ssd_mobilenet_v1_coco_tf 300x300 2.5 350 2471.66
87 ssd_mobilenet_v2 360x480 6.6 350 803.248
88 ssd_mobilenet_v2_coco_tf 300x300 3.8 350 1879.99
89 ssd_pedestrian_pruned_0_97 360x640 5.9 350 606.687
90 ssd_resnet_50_fpn_coco_tf 640x640 178.4 350 108.15
91 ssd_traffic_pruned_0_9 360x480 11.6 350 691.374
92 ssdlite_mobilenet_v2_coco_tf 300x300 1.5 350 2552.13
93 tiny_yolov3_vmss 416x416 5.46 350 1968.06
94 unet_chaos-CT_pt 512x512 23.3 350 201.477
95 vgg_16_tf 224x224 31 350 500.503
96 vgg_19_tf 224x224 39.3 350 446.076
97 vpgnet_pruned_0_99 480x640 2.5 350 443.372
98 yolov2_voc 448x448 34 350 821.262
99 yolov2_voc_pruned_0_66 448x448 11.6 350 1205.87
100 yolov2_voc_pruned_0_71 448x448 9.9 350 1343.89
101 yolov2_voc_pruned_0_77 448x448 7.8 350 1323.1
102 yolov3_adas_pruned_0_9 256x512 5.5 350 1118.53
103 yolov3_bdd 288x512 53.7 350 319.134
104 yolov3_voc 416x416 65.4 350 389.429
105 yolov3_voc_tf 416x416 65.6 350 389.708
106 yolov4_leaky_spp_m 416x416 60.1 350 324.065
107 yolov4_leaky_spp_m_pruned_0_36 416x416 38.2 350 312.385
108 ultrafast_pt 288x800 8.4 350 1039.88
109 ocr_pt 960x960 875.7 350 11.7934
110 drunet_pt 528x608 2.59 350 153.309
111 person-orientation_pruned_558m_pt 224x112 0.558 350 9448
112 ofa_resnet50_0_9B_pt 160x160 0.9 350 4013.12
113 FairMot_pt 640x480 36 350 426.194
114 tsd_yolox_pt 640x640 73 350 261.897
115 fadnet_pruned 576x960 154 350 10.346
116 ssr_pt 256x256 39.72 350 90.0609
117 fadnet 576x960 441 350 9.23075
118 chen_color_resnet18_pt 224x224 3.627 350 5301.04
119 face_mask_detection_pt 512x512 0.593 350 1335.01
120 ofa_rcan_latency_pt 360x640 45.7 350 50.141
121 textmountain_pt 960x960 575.2 350 41.828
122 vehicle_make_resnet18_pt 224x224 3.627 350 5284.14
123 vehicle_type_resnet18_pt 224x224 3.627 350 5300.12
124 ofa_yolo_pt 640x640 48.88 350 285.311
125 movenet_ntd_pt 192x192 0.5 350 2267.16
126 yolov3-coco_tf2 416x416 65.9 350 385.253

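VCK5000 Performance with 6PE 350 MHz DPUCVDX8H-aieMISC

The following table lists the throughput performance (in frames per second, or fps) of various neural network samples on the Versal ACAP VCK5000 Gen3x16 with DPUCVDX8H-aieMISC running at 6PE@350 MHz.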

Table 3. VCK5000 Performance with 6PE 350 MHz DPUCVDX8H-aieMISC
No. Neural Network Input Size GOPS DPU Frequency (MHz) Performance (fps) (Multithreaded)
1 centerpoint_0_pt, centerpoint_1_pt 2560x40x4 54 350 10.832
2 densebox_320_320 320x320 0.49 350 4595.14
3 densebox_640_360 360x640 1.1 350 2343.7
4 ENet_cityscapes_pt 512x1024 8.6 350 91.5134
5 face_landmark 96x72 0.14 350 12890.3
6 face-quality 80x60 0.06 350 24403.4
7 face-quality_pt 80x60 0.06 350 24426.3
8 facerec_resnet20 112x96 3.5 350 4398.58
9 facerec_resnet64 112x96 11 350 2250.76
10 facerec-resnet20_mixed_pt 112x96 3.5 350 4397.97
11 facereid-large_pt 96x96 0.5 350 17183.6
12 facereid-small_pt 80x80 0.09 350 25900.5
13 fpn 256x512 8.9 350 558.541
14 FPN_Res18_Medical_segmentation 320x320 45.3 350 225.73
15 FPN-resnet18_covid19-seg_pt 352x352 22.7 350 880.958
16 inception_resnet_v2_tf 299x299 26.4 350 448.058
17 inception_v1 224x224 3.2 350 2122.92
18 inception_v1_tf 224x224 3 350 2224.44
19 inception_v2 224x224 4 350 1761.02
20 inception_v3 299x299 11.4 350 713.317
21 inception_v3_pt 299x299 5.7 350 710.52
22 inception_v3_tf 299x299 11.5 350 711.533
23 inception_v3_tf2 299x299 11.5 350 719.181
24 inception_v4 299x299 24.5 350 398.407
25 inception_v4_2016_09_09_tf 299x299 24.6 350 397.711
26 medical_seg_cell_tf2 128x128 5.3 350 1294.82
27 MLPerf_resnet50_v1.5_tf 224x224 8.19 350 2454.78
28 mlperf_ssd_resnet34_tf 1200x1200 433 350 67.6523
29 multi_task 288x512 14.8 350 629.198
30 openpose_pruned_0_3 368x368 49.9 350 128.641
31 personreid-res18_pt 176x80 1.1 350 7050.56
32 personreid-res50_pt 256x128 5.4 350 3012.43
33 plate_detection 320x320 0.49 350 6612.83
34 plate_num 96x288 1.75 350 2018.92
35 pmg_pt 224x224 2.28 350 2887.85
36 pointpainting_nuscenes_pt 40000x64x16 112 350 15.5491
37 pointpillars_kitti_pt 12000x100x4 10.8 350 3.09882
38 pointpillars_nuscenes_pt 40000x64x5 108 350 34.4647
39 rcan_pruned_tf 360x640 87 350 52.8033
40 refinedet_baseline 480x360 123 350 220.563
41 refinedet_pruned_0_8 360x480 25 350 438.858
42 refinedet_pruned_0_92 360x480 10.1 350 532.618
43 refinedet_pruned_0_96 360x480 5.1 350 559.013
44 refinedet_VOC_tf 320x320 81.9 350 292.194
45 RefineDet-Medical_EDD_tf 320x320 9.8 350 828.599
46 reid 80x160 0.95 350 7429.52
47 resnet_v1_101_tf 224x224 14.4 350 1715.8
48 resnet_v1_152_tf 224x224 21.8 350 1237.54
49 resnet_v1_50_tf 224x224 7 350 2718.97
50 resnet18 224x224 3.7 350 4013.81
51 resnet50 224x224 7.7 350 2722.39
52 resnet50_pt 224x224 4.1 350 2507.43
53 resnet50_tf2 224x224 7.7 350 2306.45
54 salsanext_pt 64x2048 20.4 350 152.583
55 salsanext_v2_pt 64x2048 32 350 83.7139
56 semantic_seg_citys_tf2 512x1024 54 350 80.988
57 SemanticFPN_cityscapes_pt 256x512 10 350 992.311
58 sp_net 128x224 0.55 350 4502.57
59 squeezenet 227x227 0.76 350 4752.42
60 squeezenet_pt 224x224 0.82 350 4617.91
61 ssd_adas_pruned_0_95 360x480 6.3 350 572.152
62 ssd_pedestrian_pruned_0_97 360x640 5.9 350 468.949
63 ssd_resnet_50_fpn_coco_tf 640x640 178 350 105.051
64 ssd_traffic_pruned_0_9 360x480 11.6 350 564.056
65 tiny_yolov3_vmss 416x416 5.46 350 835.482
66 unet_chaos-CT_pt 512x512 23.3 350 185.083
67 vgg_16_tf 224x224 31 350 478.678
68 vgg_19_tf 224x224 39.3 350 428.678
69 vpgnet_pruned_0_99 480x640 2.5 350 347.45
70 yolov2_voc 448x448 34 350 491.411
71 yolov2_voc_pruned_0_66 448x448 11.6 350 608.783
72 yolov2_voc_pruned_0_71 448x448 9.9 350 641.532
73 yolov2_voc_pruned_0_77 448x448 7.8 350 635.995
74 yolov3_adas_pruned_0_9 256x512 5.5 350 1048.86
75 yolov3_bdd 288x512 53.7 350 316.456
76 yolov3_voc 416x416 65.4 350 386.959
77 yolov3_voc_tf 416x416 65.6 350 386.939
78 yolov4_leaky_spp_m 416x416 60.1 350 318.688
79 yolov4_leaky_spp_m_pruned_0_36 416x416 38.2 350 308.202
80 ultrafast_pt 288x800 8.4 350 870.674
81 ocr_pt 960x960 876 350 33.8452
82 drunet_pt 528x608 2.59 350 141.072
83 person-orientation_pruned_558m_pt 224x112 0.56 350 8060.23
84 ofa_resnet50_0_9B_pt 160x160 0.9 350 3305.77
85 SESR_S_pt 360x640 7.48 350 170.614
86 FairMot_pt 640x480 36 350 381.481
87 clocs 12000x100x4 41 350 17.0951
88 tsd_yolox_pt 640x640 73 350 257.268
89 fadnet 576x960 441 350 8.90566
90 solo_pt 640x640 107 350 24.1319
91 chen_color_resnet18_pt 224x224 3.63 350 4166.42
92 ofa_rcan_latency_pt 360x640 45.7 350 49.919
93 textmountain_pt 960x960 575 350 40.6086
94 vehicle_make_resnet18_pt 224x224 3.63 350 4161
95 vehicle_type_resnet18_pt 224x224 3.63 350 4168.73
96 ofa_yolo_pt 640x640 48.9 350 279.768
97 ofa_yolo_pruned_0_30_pt 640x640 34.7 350 366.232
98 ofa_yolo_pruned_0_50_pt 640x640 24.6 350 424.637
99 yolov3-coco_tf2 416x416 65.9 350 382.299
100 yolov4_416_tf 416x416 60.3 350 315.308
101 yolov4_512_tf 512x512 91.2 350 171.517

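VCK5000 Performance with 8PE 350 MHz DPUCVDX8H

The following table lists the throughput performance (in frames per second, or fps) of various neural network samples on the Versal ACAP VCK5000 Gen3x16 with DPUCVDX8H running at 8PE@350 MHz.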

Table 4. VCK5000 Performance with 8PE 350 MHz DPUCVDX8H
No. Neural Network Input Size GOPS DPU Frequency (MHz) Performance (fps) (Multithreaded)
1 densebox_320_320 320x320 0.49 350 5973.3
2 densebox_640_360 360x640 1.1 350 3056.58
3 ENet_cityscapes_pt 512x1024 8.6 350 148.317
4 face_landmark 96x72 0.14 350 20304.4
5 face-quality 80x60 0.06 350 31247.3
6 face-quality_pt 80x60 0.06 350 31080.9
7 fpn 256x512 8.9 350 1064.08
8 FPN_Res18_Medical_segmentation 320x320 45.3 350 557.245
9 FPN-resnet18_covid19-seg_pt 352x352 22.7 350 1177
10 inception_v1 224x224 3.2 350 3980.48
11 inception_v1_tf 224x224 3 350 4224.61
12 medical_seg_cell_tf2 128x128 5.3 350 1504.8
13 MLPerf_resnet50_v1.5_tf 224x224 8.19 350 4519.23
14 mlperf_ssd_resnet34_tf 1200x1200 433 350 76.7817
15 multi_task 288x512 14.8 350 721.458
16 openpose_pruned_0_3 368x368 49.9 350 170.355
17 plate_detection 320x320 0.49 350 8171.92
18 plate_num 96x288 1.75 350 3189.72
19 rcan_pruned_tf 360x640 86.95 350 71.193
20 refinedet_baseline 480x360 123 350 285.822
21 refinedet_pruned_0_8 360x480 25 350 669.851
22 refinedet_pruned_0_92 360x480 10.1 350 858.237
23 refinedet_pruned_0_96 360x480 5.1 350 879.11
24 refinedet_VOC_tf 320x320 81.9 350 399.879
25 RefineDet-Medical_EDD_tf 320x320 9.8 350 1265.4
26 reid 80x160 0.95 350 10933.3
27 resnet_v1_101_tf 224x224 14.4 350 2979.15
28 resnet_v1_152_tf 224x224 21.8 350 2122.98
29 resnet_v1_50_tf 224x224 7 350 4944.3
30 resnet18 224x224 3.7 350 6707.05
31 resnet50 224x224 7.7 350 4946.5
32 resnet50_pt 224x224 4.1 350 4544.06
33 resnet50_tf2 224x224 7.7 350 4167.8
34 salsanext_pt 64x2048 20.4 350 159.507
35 salsanext_v2_pt 64x2048 32 350 89.4142
36 semantic_seg_citys_tf2 512x1024 54 350 117.975
37 SemanticFPN_cityscapes_pt 256x512 10 350 1084.85
38 sp_net 128x224 0.55 350 8459.27
39 squeezenet 227x227 0.76 350 8545.25
40 squeezenet_pt 224x224 0.82 350 7536.39
41 ssd_adas_pruned_0_95 360x480 6.3 350 915.896
42 ssd_pedestrian_pruned_0_97 360x640 5.9 350 752.78
43 ssd_resnet_50_fpn_coco_tf 640x640 178.4 350 114.534
44 ssd_traffic_pruned_0_9 360x480 11.6 350 815.91
45 tiny_yolov3_vmss 416x416 5.46 350 2601.98
46 unet_chaos-CT_pt 512x512 23.3 350 247.929
47 vpgnet_pruned_0_99 480x640 2.5 350 619.718
48 yolov2_voc 448x448 34 350 969.973
49 yolov2_voc_pruned_0_66 448x448 11.6 350 1539.63
50 yolov2_voc_pruned_0_71 448x448 9.9 350 1766.91
51 yolov2_voc_pruned_0_77 448x448 7.8 350 1541.77
52 yolov3_adas_pruned_0_9 256x512 5.5 350 1450.82
53 yolov3_bdd 288x512 53.7 350 400.306
54 yolov3_voc 416x416 65.4 350 476.152
55 yolov3_voc_tf 416x416 65.6 350 476.97
56 ultrafast_pt 288x800 8.4 350 1359.92
57 ocr_pt 960x960 875.7 350 12.6949
58 drunet_pt 528x608 2.59 350 204.593
59 SESR_S_pt 360x640 7.48 350 233.238
60 FairMot_pt 640x480 36 350 506.422
61 fadnet 576x960 441 350 9.64873
62 chen_color_resnet18_pt 224x224 3.627 350 6526.42
63 ofa_rcan_latency_pt 360x640 45.7 350 66.8058
64 textmountain_pt 960x960 575.2 350 49.4783
65 vehicle_make_resnet18_pt 224x224 3.627 350 6910.37
66 vehicle_type_resnet18_pt 224x224 3.627 350 6945.23
67 yolov3-coco_tf2 416x416 65.9 350 469.791
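
To compare the four configurations, the fps of a network that appears in every table can be expressed as a speedup over the 4PE result. The sketch below does this for two rows copied from Tables 1 to 4. The configuration labels and the function name `speedup` are hypothetical, and the choice of 4PE as the baseline is only a convenience for illustration.

```python
# fps values copied from Tables 1-4 for two networks present in all four tables.
FPS = {
    "resnet50": {"4PE": 1817.78, "6PE-aieDWC": 3738.38,
                 "6PE-aieMISC": 2722.39, "8PE": 4946.5},
    "ocr_pt": {"4PE": 24.2049, "6PE-aieDWC": 11.7934,
               "6PE-aieMISC": 33.8452, "8PE": 12.6949},
}


def speedup(fps, baseline_fps):
    """Throughput ratio relative to a baseline configuration."""
    return fps / baseline_fps


for model, by_config in FPS.items():
    baseline = by_config["4PE"]  # 4PE chosen as the reference point
    for config, fps in by_config.items():
        print(f"{model:9s} {config:12s} {fps:10.4f} fps  "
              f"{speedup(fps, baseline):5.2f}x vs 4PE")
```

The two rows behave differently: resnet50 is faster in every larger configuration, while ocr_pt reports lower fps at 8PE than at 4PE, so throughput for a given network should be read from the table for the target configuration instead of being extrapolated from the number of processing engines.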