The Xilinx® Alveo U280 Data Center accelerator cards are Peripheral Component Interconnect Express (PCIe®) Gen3x16 compliant and Gen4x8 compatible cards featuring the Xilinx 16 nm UltraScale+ technology. In this release, the DPU is implemented in programmable logic for deep learning inference acceleration.
Note: Some models cannot run at the highest DPU frequency and need the DPU frequency to be reduced. See For Edge for the DPU frequency reduction operation.
U280 Performance with 14E@300 MHz DPU
Refer to the following table for the throughput performance (in frames/sec or fps) for various neural network samples on U280 Gen3x16 with DPUCAHX8H running at 14E@300 MHz.
No | Neural Network | Input Size | GOPS | DPU Frequency (MHz) | Performance (fps) (Multiple thread) |
---|---|---|---|---|---|
1 | densebox_320_320 | 320x320 | 0.49 | 300x0.5 | 2999 |
2 | densebox_640_360 | 360x640 | 1.1 | 300x0.5 | 1338.8 |
3 | ENet_cityscapes_pt | 512x1024 | 8.6 | 300x0.5 | 89.3 |
4 | face_landmark | 96x72 | 0.14 | 300x0.5 | 12384.7 |
5 | face-quality | 80x60 | 0.06 | 300x0.5 | 23735.6 |
6 | face-quality_pt | 80x60 | 0.06 | 300x0.5 | 23559.9 |
7 | facerec_resnet20 | 112x96 | 3.5 | 300x0.5 | 1602.1 |
8 | facerec-resnet20_mixed_pt | 112x96 | 3.5 | 300x0.5 | 1602.2 |
9 | facerec_resnet64 | 112x96 | 11 | 300x0.5 | 580.5 |
10 | facereid-large_pt | 96x96 | 0.5 | 300x0.5 | 9374 |
11 | facereid-small_pt | 80x80 | 0.09 | 300x0.5 | 20726 |
12 | fpn | 256x512 | 8.9 | 300x0.5 | 416.8 |
13 | FPN_Res18_Medical_segmentation | 320x320 | 45.3 | 300x0.5 | 116.3 |
14 | FPN-resnet18_covid19-seg_pt | 352x352 | 22.7 | 300x0.5 | 263.2 |
15 | inception_resnet_v2_tf | 299x299 | 26.4 | 300x0.5 | 191 |
16 | inception_v1 | 224x224 | 3.2 | 300x0.5 | 1367.4 |
17 | inception_v1_tf | 224x224 | 3 | 300x0.5 | 1415.4 |
18 | inception_v2 | 224x224 | 4 | 300x0.5 | 1096.3 |
19 | inception_v3 | 299x299 | 11.4 | 300x0.5 | 442.2 |
20 | inception_v3_pt | 299x299 | 5.7 | 300x0.5 | 442.4 |
21 | inception_v3_tf | 299x299 | 11.5 | 300x0.5 | 442.2 |
22 | inception_v3_tf2 | 299x299 | 11.5 | 300x0.5 | 451.7 |
23 | inception_v4 | 299x299 | 24.5 | 300x0.5 | 209.2 |
24 | inception_v4_2016_09_09_tf | 299x299 | 24.6 | 300x0.5 | 209.3 |
25 | medical_seg_cell_tf2 | 128x128 | 5.3 | 300x0.5 | 1275.5 |
26 | MLPerf_resnet50_v1.5_tf | 224x224 | 8.19 | 300x0.5 | 647.3 |
27 | mlperf_ssd_resnet34_tf | 1200x1200 | 433 | 300x0.5 | 13.9 |
28 | multi_task | 288x512 | 14.8 | 300x0.5 | 344.2 |
29 | openpose_pruned_0_3 | 368x368 | 49.9 | 300x0.5 | 37.3 |
30 | personreid-res18_pt | 176x80 | 1.1 | 300x0.5 | 4529.3 |
31 | personreid-res50_pt | 256x128 | 5.4 | 300x0.5 | 1044.6 |
32 | plate_detection | 320x320 | 0.49 | 300x0.5 | 4146.4 |
33 | plate_num | 96x288 | 1.75 | 300x0.5 | 1201.5 |
34 | refinedet_baseline | 480x360 | 123 | 300x0.5 | 55.8 |
35 | RefineDet-Medical_EDD_tf | 320x320 | 9.8 | 300x0.5 | 512.9 |
36 | refinedet_pruned_0_8 | 360x480 | 25 | 300x0.5 | 183.9 |
37 | refinedet_pruned_0_92 | 360x480 | 10.1 | 300x0.5 | 455 |
38 | refinedet_pruned_0_96 | 360x480 | 5.1 | 300x0.5 | 637.5 |
39 | refinedet_VOC_tf | 320x320 | 81.9 | 300x0.5 | 87.8 |
40 | reid | 80x160 | 0.95 | 300x0.5 | 4769.6 |
41 | resnet18 | 224x224 | 3.7 | 300x0.5 | 1660.3 |
42 | resnet50 | 224x224 | 7.7 | 300x0.6 | 801.1 |
43 | resnet50_pt | 224x224 | 4.1 | 300x0.6 | 776 |
44 | resnet50_tf2 | 224x224 | 7.7 | 300x0.5 | 668 |
45 | resnet_v1_101_tf | 224x224 | 14.4 | 300x0.5 | 390.1 |
46 | resnet_v1_152_tf | 224x224 | 21.8 | 300x0.5 | 260 |
47 | resnet_v1_50_tf | 224x224 | 7 | 300x0.5 | 751.9 |
48 | salsanext_pt | 64x2048 | 20.4 | 300x0.9 | 108.5 |
49 | SemanticFPN_cityscapes_pt | 256x512 | 10 | 300x0.5 | 429.2 |
50 | semantic_seg_citys_tf2 | 512x1024 | 54 | 300x0.5 | 65.5 |
51 | sp_net | 128x224 | 0.55 | 300x0.5 | 2939.3 |
52 | squeezenet | 227x227 | 0.76 | 300x0.5 | 3264.5 |
53 | squeezenet_pt | 224x224 | 0.82 | 300x0.5 | 2140.1 |
54 | ssd_adas_pruned_0_95 | 360x480 | 6.3 | 300x0.5 | 641.6 |
55 | ssd_pedestrian_pruned_0_97 | 360x360 | 5.9 | 300x0.5 | 544.6 |
56 | ssd_resnet_50_fpn_coco_tf | 640x640 | 178.4 | 300x0.5 | 36.9 |
57 | ssd_traffic_pruned_0_9 | 360x480 | 11.6 | 300x0.5 | 432.8 |
58 | tiny_yolov3_vmss | 416x416 | 5.46 | 300x0.5 | 962.9 |
59 | unet_chaos-CT_pt | 512x512 | 23.3 | 300x0.5 | 129.8 |
60 | vgg_16_tf | 224x224 | 31 | 300x0.5 | 188.6 |
61 | vgg_19_tf | 224x224 | 39.3 | 300x0.5 | 157.2 |
62 | vpgnet_pruned_0_99 | 480x640 | 2.5 | 300x0.5 | 634.9 |
63 | yolov2_voc | 448x448 | 34 | 300x0.5 | 202.4 |
64 | yolov2_voc_pruned_0_66 | 448x448 | 11.6 | 300x0.5 | 499.3 |
65 | yolov2_voc_pruned_0_71 | 448x448 | 9.9 | 300x0.5 | 583.4 |
66 | yolov2_voc_pruned_0_77 | 448x448 | 7.8 | 300x0.5 | 695.3 |
67 | yolov3_adas_pruned_0_9 | 256x512 | 5.5 | 300x0.5 | 810.9 |
68 | yolov3_bdd | 288x512 | 53.7 | 300x0.5 | 93.7 |
69 | yolov3_voc | 416x416 | 65.4 | 300x0.5 | 96.9 |
70 | yolov3_voc_tf | 416x416 | 65.6 | 300x0.5 | 97 |
71 | yolov4_leaky_spp_m | 416x416 | 60.1 | 300x0.5 | 99.4 |
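
The DPU Frequency and GOPS columns in the table above can be combined with the fps figures for quick comparisons between models. The following is a minimal sketch, not part of any Xilinx tooling. It assumes that an entry such as `300x0.5` reads as a base frequency in MHz multiplied by a scaling factor, and that the GOPS column is the workload of a single inference in giga-operations. The helper names `effective_frequency_mhz` and `delivered_gops_per_second` are hypothetical.

```python
def effective_frequency_mhz(spec: str) -> float:
    """Parse a 'BASExFACTOR' table entry such as '300x0.5'.

    Assumption: the entry means base frequency (MHz) times a scaling factor.
    An entry without 'x' is treated as an unscaled frequency.
    """
    base, _, factor = spec.partition("x")
    return float(base) * (float(factor) if factor else 1.0)


def delivered_gops_per_second(gops_per_frame: float, fps: float) -> float:
    """Operations per frame times frames per second.

    Assumption: the GOPS column is the per-inference workload in giga-ops.
    """
    return gops_per_frame * fps


# A few rows copied from the table: (model, GOPS, DPU frequency, fps)
rows = [
    ("densebox_320_320", 0.49, "300x0.5", 2999.0),
    ("resnet50", 7.7, "300x0.6", 801.1),
    ("salsanext_pt", 20.4, "300x0.9", 108.5),
]

for name, gops, freq, fps in rows:
    mhz = effective_frequency_mhz(freq)
    rate = delivered_gops_per_second(gops, fps)
    print(f"{name}: {mhz:.0f} MHz, {rate:.1f} giga-ops/s delivered")
```

Comparing the delivered giga-ops per second across rows gives a rough view of how well each model uses the DPU, independent of how large the model is.
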
Model End-to-End Performance on U280 with Dual-Core DPUCAHX8L at 250 MHz
Refer to the following table for the throughput performance (in frames/sec or fps) for various neural network samples on U280 Gen3x16 with dual DPUCAHX8L running at 300 MHz.
No | Neural Network | Input Size | GOPS | Performance (fps) (Single thread) | Performance (fps) (Multiple thread) |
---|---|---|---|---|---|
1 | densebox_320_320 | 320x320 | 0.49 | 190.6 | 464.1 |
2 | densebox_640_360 | 360x640 | 1.1 | 92.4 | 209.5 |
3 | ENet_cityscapes_pt | 512x1024 | 8.6 | 4.5 | 8.6 |
4 | face_landmark | 96x72 | 0.14 | 1848.4 | 6682.2 |
5 | face-quality | 80x60 | 0.06 | 2131.1 | 9672.5 |
6 | face-quality_pt | 80x60 | 0.06 | 2311.7 | 9949.8 |
7 | facerec_resnet20 | 112x96 | 3.5 | 255.8 | 430.6 |
8 | facerec-resnet20_mixed_pt | 112x96 | 3.5 | 255.8 | 446.8 |
9 | facerec_resnet64 | 112x96 | 11 | 127.0 | 247.9 |
10 | facereid-small_pt | 80x80 | 0.09 | 1756.7 | 6408.1 |
11 | fpn | 256x512 | 8.9 | 24.5 | 55.3 |
12 | FPN_Res18_Medical_segmentation | 320x320 | 45.3 | 12.9 | 20.4 |
13 | FPN-resnet18_covid19-seg_pt | 352x352 | 22.7 | 69.6 | 143.8 |
14 | inception_resnet_v2_tf | 299x299 | 26.4 | 29.8 | 60.5 |
15 | inception_v1 | 224x224 | 3.2 | 214.9 | 504.8 |
16 | inception_v1_tf | 224x224 | 3 | 201.0 | 491.4 |
17 | inception_v2 | 224x224 | 3.88 | 148.0 | 315.7 |
18 | inception_v3 | 299x299 | 11.4 | 85.5 | 183.0 |
19 | inception_v3_pt | 299x299 | 5.7 | 80.9 | 180.4 |
20 | inception_v3_tf | 299x299 | 11.5 | 80.8 | 179.4 |
21 | inception_v3_tf2 | 299x299 | 11.5 | 77.6 | 171.1 |
22 | inception_v4 | 299x299 | 24.5 | 44.4 | 91.2 |
23 | inception_v4_2016_09_09_tf | 299x299 | 24.6 | 43.2 | 88.5 |
24 | medical_seg_cell_tf2 | 128x128 | 5.3 | 72.8 | 102.1 |
25 | MLPerf_resnet50_v1.5_tf | 224x224 | 8.19 | 74.1 | 153.5 |
26 | mlperf_ssd_resnet34_tf | 1200x1200 | 433 | 4.1 | 10.7 |
27 | mobilenet_1_0_224_tf2 | 224x224 | 1.1 | 558.0 | 2328.6 |
28 | mobilenet_v1_0_5_160_tf | 160x160 | 0.15 | 1067.3 | 6310.9 |
29 | mobilenet_v1_1_0_224_tf | 224x224 | 1.1 | 558.7 | 2520.8 |
30 | mobilenet_v2 | 224x224 | 0.6 | 480.1 | 1466.8 |
31 | mobilenet_v2_1_0_224_tf | 224x224 | 0.6 | 436.4 | 1500.6 |
32 | mobilenet_v2_1_4_224_tf | 224x224 | 1.2 | 370.4 | 1141.7 |
33 | multi_task | 288x512 | 14.8 | 13.4 | 31.0 |
34 | openpose_pruned_0_3 | 368x368 | 49.9 | 9.6 | 25.4 |
35 | personreid-res50_pt | 256x128 | 5.4 | 92.4 | 187.4 |
36 | plate_detection | 320x320 | 0.49 | 390.2 | 1527.8 |
37 | refinedet_baseline | 480x360 | 123 | 25.1 | 51.2 |
38 | RefineDet-Medical_EDD_tf | 320x320 | 9.8 | 81.3 | 216.8 |
39 | refinedet_pruned_0_8 | 360x480 | 25 | 49.3 | 108.8 |
40 | refinedet_pruned_0_92 | 360x480 | 10.1 | 53.5 | 112.7 |
41 | refinedet_pruned_0_96 | 360x480 | 5.1 | 61.7 | 133.8 |
42 | refinedet_VOC_tf | 320x320 | 81.9 | 24.8 | 69.0 |
43 | reid | 80x160 | 0.95 | 407.4 | 796.8 |
44 | resnet18 | 224x224 | 3.7 | 203.2 | 396.2 |
45 | resnet50 | 224x224 | 7.7 | 65.7 | 126.3 |
46 | resnet50_pt | 224x224 | 4.1 | 64.4 | 131.7 |
47 | resnet50_tf2 | 224x224 | 7.7 | 76.0 | 152.9 |
48 | resnet_v1_101_tf | 224x224 | 14.4 | 49.8 | 99.5 |
49 | resnet_v1_152_tf | 224x224 | 21.8 | 34.9 | 63.5 |
50 | resnet_v1_50_tf | 224x224 | 7 | 84.0 | 168.5 |
51 | retinaface | 360x640 | 1.11 | 91.8 | 207.0 |
52 | salsanext_pt | 64x2048 | 20.4 | 7.0 | 16.1 |
53 | SemanticFPN_cityscapes_pt | 256x512 | 10 | 24.5 | 44.6 |
54 | semantic_seg_citys_tf2 | 512x1024 | 54 | 4.6 | 9.2 |
55 | sp_net | 128x224 | 0.55 | 330.6 | 7747.6 |
56 | squeezenet | 227x227 | 0.76 | 361.8 | 1043.1 |
57 | squeezenet_pt | 224x224 | 0.82 | 266.5 | 688.4 |
58 | ssd_adas_pruned_0_95 | 360x480 | 6.3 | 63.4 | 160.3 |
59 | ssdlite_mobilenet_v2_coco_tf | 300x300 | 1.5 | 209.8 | 714.3 |
60 | ssd_mobilenet_v1_coco_tf | 300x300 | 2.5 | 268.1 | 1095.0 |
61 | ssd_mobilenet_v2 | 360x480 | 6.6 | 38.7 | 169.3 |
62 | ssd_mobilenet_v2_coco_tf | 300x300 | 3.8 | 128.7 | 303.7 |
63 | ssd_pedestrian_pruned_0_97 | 360x360 | 5.9 | 23.0 | 35.1 |
64 | ssd_traffic_pruned_0_9 | 360x480 | 11.6 | 51.6 | 158.0 |
65 | vgg_16_tf | 224x224 | 31 | 50.3 | 99.0 |
66 | vgg_19_tf | 224x224 | 39.3 | 45.4 | 89.6 |
67 | vpgnet_pruned_0_99 | 480x640 | 2.5 | 18.0 | 20.5 |
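
Because the table above reports both single-thread and multi-thread throughput, the ratio of the two shows how much a model gains from running several threads. The following is a minimal sketch of that arithmetic on a few rows copied from the table. The helper name `multithread_scaling` is hypothetical and not part of any Xilinx API.

```python
def multithread_scaling(single_fps: float, multi_fps: float) -> float:
    """Return the multi-thread fps divided by the single-thread fps."""
    if single_fps <= 0:
        raise ValueError("single-thread fps must be positive")
    return multi_fps / single_fps


# model -> (single-thread fps, multi-thread fps), copied from the table
rows = {
    "densebox_320_320": (190.6, 464.1),
    "resnet50": (65.7, 126.3),
    "sp_net": (330.6, 7747.6),
    "vpgnet_pruned_0_99": (18.0, 20.5),
}

# Rank models from the largest to the smallest multi-thread gain.
ranked = sorted(
    rows.items(),
    key=lambda item: multithread_scaling(*item[1]),
    reverse=True,
)

for name, (single, multi) in ranked:
    print(f"{name}: {multithread_scaling(single, multi):.2f}x")
```

A ratio close to 1 suggests that the model is limited by something other than DPU availability, such as pre- or post-processing on the host, although the table alone does not confirm the cause.
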