The xclbin can be built at 333 MHz. The hardware resource utilization and benchmark results are shown in the two tables below.

Table 1 Hardware resources

Name | LUT | LUTAsMem | REG | BRAM | URAM | DSP |
---|---|---|---|---|---|---|
krnl_gemv | 121535 [ 16.26%] | 11002 [ 2.85%] | 215897 [ 13.72%] | 72 [ 6.19%] | 0 [ 0.00%] | 966 [ 16.27%] |
streamTimer | 195 [ 0.03%] | 0 [ 0.00%] | 291 [ 0.02%] | 0 [ 0.00%] | 0 [ 0.00%] | 0 [ 0.00%] |

Table 2 Benchmark results

M | N | HW execution time (s) | cold API execution time (s) | hot API execution time (s) | execution clock cycles | efficiency |
---|---|---|---|---|---|---|
512 | 256 | 1.4481e-05 | 0.000241345 | 0.00014245 | 4827 | 42.428% |
512 | 512 | 2.0853e-05 | 0.000428344 | 0.000136975 | 6951 | 58.9268% |
1024 | 1024 | 6.6462e-05 | 0.000439357 | 0.00017869 | 22154 | 73.955% |
2048 | 2048 | 0.000248076 | 0.000637851 | 0.000367888 | 82692 | 79.2531% |
4096 | 4096 | 0.000898929 | 0.00156095 | 0.00101729 | 299643 | 87.4854% |
8192 | 8192 | 0.00332855 | 0.00478017 | 0.00365307 | 1109516 | 94.5075% |
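
The columns of Table 2 can be cross-checked against each other. The sketch below recomputes the clock frequency from the hardware execution time and cycle count of each row, and it reproduces the efficiency column under one assumption that the source does not state: that the kernel consumes 64 matrix entries per clock cycle. That value was chosen only because it matches the reported percentages. The names `ROWS`, `ASSUMED_ENTRIES_PER_CYCLE`, `clock_mhz`, and `efficiency_percent` are hypothetical and not part of any library API.

```python
# Rows copied from Table 2:
# (M, N, hw execution time in s, execution clock cycles, reported efficiency in %)
ROWS = [
    (512, 256, 1.4481e-05, 4827, 42.428),
    (512, 512, 2.0853e-05, 6951, 58.9268),
    (1024, 1024, 6.6462e-05, 22154, 73.955),
    (2048, 2048, 0.000248076, 82692, 79.2531),
    (4096, 4096, 0.000898929, 299643, 87.4854),
    (8192, 8192, 0.00332855, 1109516, 94.5075),
]

# ASSUMPTION (not stated in the source): 64 matrix entries processed per cycle.
# It is inferred only from the fact that it reproduces the efficiency column.
ASSUMED_ENTRIES_PER_CYCLE = 64


def clock_mhz(hw_time_s, cycles):
    """Clock frequency in MHz implied by a cycle count and an execution time."""
    return cycles / hw_time_s / 1e6


def efficiency_percent(m, n, cycles, entries_per_cycle=ASSUMED_ENTRIES_PER_CYCLE):
    """Ratio of the ideal cycle count, M*N / entries_per_cycle, to measured cycles."""
    return 100.0 * m * n / (entries_per_cycle * cycles)


for m, n, hw_time_s, cycles, reported in ROWS:
    print(
        f"{m:5d} x {n:5d}: "
        f"{clock_mhz(hw_time_s, cycles):8.3f} MHz, "
        f"efficiency {efficiency_percent(m, n, cycles):7.3f}% "
        f"(reported {reported}%)"
    )
```

Under this assumption, every row implies a clock of roughly 333.33 MHz, which agrees with the build frequency given above, and the recomputed efficiency agrees with the reported column after rounding.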