Run-Time Parameter Specification

Run-time parameters (RTPs) are another way to pass data to kernels. Two execution models are supported for run-time parameters.

Asynchronous parameters can be changed at any time by either a controlling processor such as the Arm®, or by another AI Engine. They are read each time a kernel is invoked. This means that parameter updates take effect between executions of the kernel, but the updates do not have to follow a specific pattern. For example, this type of parameter suits filter coefficients that change infrequently.
Synchronous parameters (triggering parameters) block a kernel from execution until these parameters are written by a controlling processor such as the Arm, or by another AI Engine. Upon a write, the kernel reads the new updated value and executes once. After completion, it is blocked from executing until the parameter is updated again. This allows a different type of execution model from the normal streaming model, which can be useful for certain updating operations where blocking synchronization is important.
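The difference between the two execution models can be illustrated with a small host-side model in plain C++. This is only a sketch: AsyncParam, SyncParam, and try_run_kernel are hypothetical names invented for illustration and are not part of the AI Engine tools or API. The asynchronous model also assumes that an initial value has already been written.

```cpp
// Hypothetical model of the two RTP execution models (not the AI Engine API).

// Asynchronous parameter: can be written at any time and never blocks the kernel.
struct AsyncParam {
    int value = 0;                      // assumed to hold a valid initial value
    void write(int v) { value = v; }
    bool ready() const { return true; }
    int read() { return value; }        // kernel samples the latest value per invocation
};

// Synchronous (triggering) parameter: each write enables exactly one kernel run.
struct SyncParam {
    int value = 0;
    bool pending = false;               // true when a written value is not yet consumed
    void write(int v) { value = v; pending = true; }
    bool ready() const { return pending; }
    int read() { pending = false; return value; }  // consuming blocks the next run
};

// Tries to run one kernel iteration. Returns true if the kernel ran,
// in which case last_seen holds the parameter value it used.
template <typename Param>
bool try_run_kernel(Param& p, int& last_seen) {
    if (!p.ready()) return false;       // blocked until the parameter is written
    last_seen = p.read();
    return true;
}
```

In this model, a kernel with an asynchronous parameter runs repeatedly and reuses the last written value, whereas a kernel with a synchronous parameter runs once per write and is blocked otherwise.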

The following figure shows how the AI Engine RTP is realized in hardware. The source of an RTP can be a port written by the controlling processor, or an RTP output of an AI Engine kernel. The destination of an RTP can be an output read by the controlling processor, or an RTP input of an AI Engine kernel. The source and destination use ping-pong buffers to transfer the RTP data. Both read a selector value to determine whether the ping or the pong buffer must be written or read. Before writing or reading, the source and destination try to lock the buffers; for a kernel, this happens before kernel execution starts.
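A ping-pong transfer guarded by a lock and a selector can be sketched in plain C++ as follows. The PingPongRtp type and the exact ordering of its steps are assumptions made for illustration. The actual hardware protocol is not specified here and may differ.

```cpp
#include <array>

// Hypothetical software model of an RTP ping-pong buffer (not the hardware protocol).
struct PingPongRtp {
    std::array<int, 2> buf{{0, 0}};  // buf[0] is "ping", buf[1] is "pong"
    int selector = 0;                // index of the buffer holding the latest value
    bool locked = false;

    bool try_lock() {
        if (locked) return false;
        locked = true;
        return true;
    }
    void unlock() { locked = false; }

    // Source side: write into the buffer that is NOT currently selected,
    // then flip the selector so the destination sees the new value.
    bool write(int v) {
        if (!try_lock()) return false;
        int next = 1 - selector;
        buf[next] = v;
        selector = next;
        unlock();
        return true;
    }

    // Destination side: read the buffer the selector points to.
    bool read(int& out) {
        if (!try_lock()) return false;
        out = buf[selector];
        unlock();
        return true;
    }
};
```

Writing to the unselected buffer means that the previously published value stays intact until the selector flips.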

Figure 1. AI Engine RTP

Cascading of RTPs is supported, as shown in the following figure. Only synchronous-to-synchronous and asynchronous-to-asynchronous RTP connections are allowed. Asynchronous-to-synchronous and synchronous-to-asynchronous connections are not allowed. An RTP port cannot broadcast to multiple destinations.
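These connection rules can be expressed as a small check. The following plain C++ sketch uses hypothetical types (RtpMode, RtpConnection) and integer port identifiers chosen for illustration. It is not part of the AI Engine compiler, which performs its own validation.

```cpp
#include <set>
#include <vector>

enum class RtpMode { Sync, Async };

// Hypothetical description of one RTP connection between two ports.
struct RtpConnection {
    int src_port;
    RtpMode src_mode;
    int dst_port;
    RtpMode dst_mode;
};

// Returns true only if every connection joins ports of the same mode
// and no source port drives more than one destination (no broadcast).
bool connections_valid(const std::vector<RtpConnection>& conns) {
    std::set<int> used_sources;
    for (const RtpConnection& c : conns) {
        if (c.src_mode != c.dst_mode) return false;                 // mixed sync/async
        if (!used_sources.insert(c.src_port).second) return false;  // broadcast
    }
    return true;
}
```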

Figure 2. RTP Cascade

It is very important to understand that the RTP interaction between AI Engine kernels happens only at kernel execution boundaries. This means that the RTP output of the source kernel can be read by the destination kernel only after the source kernel has completed its current iteration. If the source and destination depend on each other between those boundaries, that is, before one finishes or the other starts its execution, a deadlock can occur. For example, if the source and destination are connected by a cascade stream in addition to the RTP connection, the cascade stream stalls the source AI Engine once the stream is full. However, because the source kernel has not finished its execution, it does not release the RTP data. Thus, the destination AI Engine never gets the RTP data and never starts.
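The deadlock condition can be reproduced with a simplified simulation in plain C++. The function destination_starts and its parameters (words written per source iteration, stream capacity in words) are hypothetical, and the model ignores timing. It only captures the dependency described above: the destination cannot drain the stream until it has received the RTP, and the source releases the RTP only after it finishes its iteration.

```cpp
// Hypothetical model: returns true if the destination kernel eventually starts,
// false if the source stalls on a full cascade stream (deadlock).
bool destination_starts(int words_per_iteration, int stream_capacity) {
    int in_stream = 0;          // words currently held in the cascade stream
    int produced = 0;           // words written by the source in this iteration
    bool rtp_released = false;  // set only when the source iteration completes

    while (true) {
        bool progress = false;
        if (produced < words_per_iteration && in_stream < stream_capacity) {
            ++in_stream;        // source pushes one word; nobody drains yet
            ++produced;
            progress = true;
        } else if (produced == words_per_iteration) {
            rtp_released = true;  // iteration done: RTP becomes visible
            progress = true;
        }
        if (rtp_released) return true;  // destination gets the RTP and can start
        if (!progress) return false;    // source stalled, RTP never released
    }
}
```

In this model, the source completes only when one iteration's worth of cascade data fits in the stream without the destination draining it. Otherwise the design has to avoid coupling the two connections in this way.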

Figure 3. RTP Cascade Deadlock Example

Note: A lock must be acquired on the RTP ports of an AI Engine kernel before kernel execution and released after it. This adds a small overhead to each kernel iteration. When partitioning the data into frames, take this overhead into account against the system-level performance requirements.
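The effect of frame size on this overhead can be estimated with simple arithmetic. The following plain C++ sketch assumes a fixed lock overhead per iteration. The function name and any cycle counts passed to it are placeholders for illustration, not measured values.

```cpp
// Fraction of each kernel iteration spent on useful computation, assuming a
// fixed (hypothetical) lock acquire/release cost per iteration.
double useful_fraction(double compute_cycles, double lock_overhead_cycles) {
    return compute_cycles / (compute_cycles + lock_overhead_cycles);
}
```

For the same assumed overhead, a larger frame (more compute cycles per iteration) gives a higher useful fraction, at the cost of higher latency per frame.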

For more information about run-time parameter usage, refer to the Versal ACAP AI Engine Programming Environment User Guide (UG1076).

Run-Time Parameter Specification - 2021.2 English

AI Engine Kernel Coding Best Practices Guide (UG1079)