Profiling Port Throughput - 2021.2 English

Versal ACAP AI Engine Programming Environment User Guide (UG1076)


Port throughput can be measured by counting the number of samples sent in a given time interval. Xilinx provides the event::io_stream_running_event_count enumeration to count the running event, which corresponds to the number of samples sent.

After the graph is running and data transfer to or from the port is stable, the following code can be inserted in the host code to measure the port throughput.

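// usleep requires <unistd.h>
int wait_time_us=2000000; // measurement window in microseconds
event::handle handle = event::start_profiling(*plio_port, event::io_stream_running_event_count);
if(handle==event::invalid_handle){
    printf("ERROR:Invalid handle. Only two performance counters are available in an AIE-PL interface tile\n");
    return 1;
}
long long count0 = event::read_profiling(handle); // counter value at start of window
usleep(wait_time_us);
long long count1 = event::read_profiling(handle); // counter value at end of window
event::stop_profiling(handle); // release the performance counter
long long samples = count1 - count0;
std::cout << "num running samples: " << samples << std::endl;
// samples per microsecond equals MSPS; divide in floating point to avoid truncation
std::cout << " Throughput: " << static_cast<double>(samples) / wait_time_us << " MSPS " << std::endl;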
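
The arithmetic behind the throughput figure can be sketched on its own, without the ADF API. The helper name throughput_msps is hypothetical and is not part of any Xilinx library; it only assumes two counter readings taken wait_time_us microseconds apart. Because samples per microsecond is the same as millions of samples per second, the quotient is already in MSPS. The division is done in floating point, since integer division would truncate any rate below 1 MSPS to 0.

```cpp
#include <cassert>

// Hypothetical helper (not part of the ADF API): converts two counter
// readings taken wait_time_us microseconds apart into MSPS.
double throughput_msps(long long count0, long long count1, long long wait_time_us) {
    assert(wait_time_us > 0);  // a zero or negative window is meaningless
    // Samples per microsecond == millions of samples per second (MSPS).
    return static_cast<double>(count1 - count0) / static_cast<double>(wait_time_us);
}
```

For example, readings of 1000 and 1501000 taken 2000000 microseconds apart give 0.75 MSPS, which integer division would report as 0.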

This method can be used for a graph that runs indefinitely, or just to count how many samples are sent or received before the graph stalls (for whatever reason).

To reduce measurement variance, run the measurement for several seconds in hardware. The accuracy of this method can vary in hardware emulation.

This profiling method also applies to the AI Engine simulator. Replace the sleep call (usleep) in the preceding code with the SystemC wait function, and use a much smaller wait time, because simulation runs much slower than hardware.
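
One way to keep the same measurement logic for hardware and simulation is to pass the delay in as a parameter. The following is a minimal sketch under stated assumptions, not the guide's own method: measure_msps is a hypothetical helper, read_counter stands in for event::read_profiling(handle), and delay stands in for whichever wait mechanism the target uses.

```cpp
#include <cassert>
#include <functional>

// Hypothetical helper, not part of the ADF API.
// read_counter: stand-in for event::read_profiling(handle).
// delay:        stand-in for usleep on hardware, or a SystemC wait in simulation.
double measure_msps(const std::function<long long()>& read_counter,
                    const std::function<void(long long)>& delay,
                    long long wait_time_us) {
    assert(wait_time_us > 0);
    long long count0 = read_counter();  // counter at start of window
    delay(wait_time_us);                // let the port run for the window
    long long count1 = read_counter();  // counter at end of window
    // Samples per microsecond == MSPS.
    return static_cast<double>(count1 - count0) / static_cast<double>(wait_time_us);
}
```

On hardware, delay could wrap usleep. In the AI Engine simulator, it could wrap a SystemC wait with a microsecond time unit and a much smaller wait_time_us. Both wrappers are assumptions here and are left to the host code.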