Four NBody() Kernels Packet Switched - 2023.2 English

Vitis Tutorials: AI Engine (XD100)

Document ID: XD100
Release Date: 2024-03-05
Version: 2023.2 English

Next, review the src/nbody_subsystem.h graph. This graph creates four N-Body kernels, a packet splitter kernel, and a packet merger kernel. For background on packet switching in the AI Engine, see the packet switching feature tutorial: 04-packet-switching.

The nbody_subsystem graph has two inputs: input_i and input_j. The input_i port is a packet stream that connects to the packet splitter, which routes each packet of data to the w_input_i port of one of the nbody() kernels. Each input_i packet contains a packet header followed by 224 32-bit data values, with TLAST asserted on the final data value (m31). The input_j port is a data stream that is broadcast to all the nbody() kernels, so every nbody() kernel receives the same input_j data. The nbody() kernels perform their computations and generate the new w_output_i data, which the packet merger combines into a single stream of packets. That stream is output_i, the output of the nbody_subsystem graph.
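The routing described above can be modeled in plain C++ without the AI Engine tools. The following is a minimal sketch, not the tutorial's graph code: the `Packet` and `KernelInputs` types and the `route` function are hypothetical names introduced for illustration, and the packet header is reduced to a bare destination ID, whereas the hardware header is a packed 32-bit word. Only the kernel count (four) and the payload size (224 values) come from the text.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// From the text: four nbody() kernels, 224 32-bit values per input_i packet.
constexpr std::size_t kNumKernels = 4;
constexpr std::size_t kWordsPerPacket = 224;

// Hypothetical, simplified packet. Only a destination ID is modeled here;
// the real header is a packed 32-bit word (assumption: the rest is ignored).
struct Packet {
    std::uint32_t dest_id;                             // target nbody() kernel
    std::array<std::uint32_t, kWordsPerPacket> data;   // TLAST accompanies data.back()
};

// What one nbody() kernel receives after routing.
struct KernelInputs {
    std::vector<std::uint32_t> w_input_i;  // only the packets addressed to this kernel
    std::vector<std::uint32_t> w_input_j;  // the same data for every kernel
};

// Models the packet splitter (per-ID routing of input_i) and the
// broadcast of input_j to all kernels.
inline std::array<KernelInputs, kNumKernels> route(
        const std::vector<Packet>& input_i,
        const std::vector<std::uint32_t>& input_j) {
    std::array<KernelInputs, kNumKernels> kernels{};
    for (const Packet& p : input_i) {
        assert(p.dest_id < kNumKernels);
        std::vector<std::uint32_t>& dst = kernels[p.dest_id].w_input_i;
        dst.insert(dst.end(), p.data.begin(), p.data.end());
    }
    for (KernelInputs& k : kernels) {
        k.w_input_j = input_j;  // broadcast: every kernel gets an identical copy
    }
    return kernels;
}
```

Calling `route` with one packet per destination ID leaves each kernel with 224 values on `w_input_i` and an identical copy of `input_j`, regardless of the order in which the packets arrive.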

Name      | Number of 32-bit Data Values | Window Size (bytes)
input_i   | 224 * 4 = 896                | 896 * 4 = 3584
input_j   | 128                          | 128 * 4 = 512
output_i  | 224 * 4 = 896                | 896 * 4 = 3584
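The table's arithmetic can be checked at compile time. This is a small sketch with hypothetical constant and function names. It reads the first factor of 4 as the number of nbody() kernels and the second as the size in bytes of one 32-bit value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kNumKernels      = 4;    // nbody() kernels in the subsystem
constexpr std::size_t kWordsPerPacketI = 224;  // 32-bit values per input_i / output_i packet
constexpr std::size_t kWordsJ          = 128;  // 32-bit values in input_j
constexpr std::size_t kBytesPerWord    = sizeof(std::uint32_t);  // 4 bytes

// Total 32-bit values across all four kernels for input_i / output_i.
constexpr std::size_t words_i() { return kWordsPerPacketI * kNumKernels; }
// Window sizes in bytes.
constexpr std::size_t bytes_i() { return words_i() * kBytesPerWord; }
constexpr std::size_t bytes_j() { return kWordsJ * kBytesPerWord; }

static_assert(words_i() == 896,  "input_i/output_i: 224 * 4 values");
static_assert(bytes_i() == 3584, "input_i/output_i: 896 * 4 bytes");
static_assert(bytes_j() == 512,  "input_j: 128 * 4 bytes");
```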

A single instance of the nbody_subsystem graph can simulate 128 particles using four AI Engine tiles.
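As a rough consistency check, and assuming the 128 particles are divided evenly among the four kernels (the text does not state this explicitly), each kernel would handle 32 particles, which matches m31 being the last value in a packet. Under the same assumption, a 224-value packet carries 7 values per particle. The names below are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kParticles      = 128;  // per nbody_subsystem instance (from the text)
constexpr std::size_t kKernels        = 4;    // one nbody() kernel per AI Engine tile
constexpr std::size_t kWordsPerPacket = 224;  // 32-bit values per input_i packet

// Assumption: particles are split evenly across the kernels.
constexpr std::size_t particles_per_kernel() { return kParticles / kKernels; }
// 32-bit values available per particle in one input_i packet.
constexpr std::size_t words_per_particle() { return kWordsPerPacket / particles_per_kernel(); }

static_assert(kParticles % kKernels == 0, "even split across kernels");
static_assert(particles_per_kernel() == 32, "128 / 4 particles per kernel");
static_assert(words_per_particle() == 7, "224 / 32 values per particle");
```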