Theoretical throughput - 2023.2 English

Vitis Libraries

Release Date
2023-12-20
Version
2023.2 English

The BSM solver demonstration is configured to build one kernel (consisting of some number of cfBSMEngine engines) that connected to a single DDR bank. This particular design is bandwidth constrained, so the number of usable engines can be determined by considering the data requirements as follows: one single cfBSMEngine instance requires 5 input parameters and returns 6 values (price and Greeks) for a total of 11 float values (44 bytes) transferred every clock cycle. At 200MHz for example, it requires a data bandwidth of 200e6 * 44 = 8.8GB/s. The theoretical bandwidth of a single DDR bank as used in the U200 is 19.2 GB/s so that two cfBSMEngine engines in parallel could achieve 17.6GB/s at perfect utilization. Three engines would exceed the capabilities of one DDR, as would two engines and a higher FMAX, so two engine is the configuration selected for the demonstration design. In practice of course, it will not be possible to achieve the theoretical maximum due to limitations in the AXI interconnect, DDR controller, access patterns and so on.