Next, review the `src/nbody_subsystem.h` graph. This graph creates four N-Body kernels, a packet splitter kernel, and a packet merger kernel. To learn more about the packet switching feature in the AI Engine, review the packet switching feature tutorial: 04-packet-switching.
The `nbody_subsystem` graph has two inputs: `input_i` and `input_j`. The `input_i` port is a packet stream that connects to the packet splitter. The packet splitter routes each packet to the `w_input_i` port of one of the `nbody()` kernels. Each `input_i` packet contains a packet header followed by 224 32-bit data values, with TLAST asserted on the final data value, `m31`. The `input_j` port is a data stream that is broadcast to all the `nbody()` kernels (i.e., every `nbody()` kernel receives the same `input_j` data). The `nbody()` kernels perform their computations and generate the new `w_output_i` data, which the packet merger combines into a single stream of packets. This merged stream is the `output_i` output of the `nbody_subsystem` graph.
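The dataflow described above can be sketched as a small host-side C++ model. This is only an illustration of the split, broadcast, and merge pattern, not AI Engine graph code: `Packet`, `nbody_stub`, and `run_subsystem` are hypothetical names, the packet header is reduced to a plain destination ID, and the kernel body is a placeholder rather than the real N-Body computation.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Host-side model only; none of these names come from the AI Engine API.
constexpr std::size_t kNumKernels = 4;         // four nbody() kernels in the graph
constexpr std::size_t kValuesPerPacket = 224;  // 32-bit data values per input_i packet

struct Packet {
    std::uint32_t id;         // stands in for the destination ID in the packet header
    std::vector<float> data;  // payload; on hardware TLAST accompanies the last value
};

// Placeholder for nbody(): adds the sum of the shared j data to every i value,
// just so the routing is observable. The real kernel does something else.
std::vector<float> nbody_stub(const std::vector<float>& w_input_i,
                              const std::vector<float>& input_j) {
    float j_sum = 0.0f;
    for (float v : input_j) j_sum += v;
    std::vector<float> w_output_i(w_input_i.size());
    for (std::size_t n = 0; n < w_input_i.size(); ++n) {
        w_output_i[n] = w_input_i[n] + j_sum;
    }
    return w_output_i;
}

std::vector<Packet> run_subsystem(const std::vector<Packet>& input_i,
                                  const std::vector<float>& input_j) {
    // Splitter: route each packet to the kernel selected by its ID.
    std::array<std::vector<float>, kNumKernels> w_input_i;
    for (const Packet& p : input_i) {
        assert(p.id < kNumKernels);
        assert(p.data.size() == kValuesPerPacket);
        w_input_i[p.id] = p.data;
    }

    // Kernels: every kernel receives the same input_j data (broadcast).
    // Merger: results rejoin a single packet stream, tagged by kernel.
    // This model emits them in kernel order; hardware ordering may differ.
    std::vector<Packet> output_i;
    for (std::uint32_t k = 0; k < kNumKernels; ++k) {
        output_i.push_back({k, nbody_stub(w_input_i[k], input_j)});
    }
    return output_i;
}
```

Calling `run_subsystem` with four packets (IDs 0 to 3) and one `input_j` vector returns four output packets, one per kernel, each computed from that kernel's own packet and the shared `input_j` data.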
| Name | Number of 32-bit Data Values | Window Size (bytes) |
|---|---|---|
| input_i | 224 * 4 = 896 | 896 * 4 = 3584 |
| input_j | 128 | 128 * 4 = 512 |
| output_i | 224 * 4 = 896 | 896 * 4 = 3584 |
A single instance of the `nbody_subsystem` graph can simulate 128 particles using four AI Engine tiles.
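The sizes in the table can be checked with a few compile-time constants. The sketch below is only a sanity check of the arithmetic; the constant names and the `nbody_sizes` namespace are made up for illustration. The last two values are derived rather than stated in the text: they assume the 128 particles are divided evenly among the four kernels, which is consistent with the final data value of a packet being named `m31`.

```cpp
#include <cstddef>
#include <cstdint>

namespace nbody_sizes {  // hypothetical namespace, for illustration only

constexpr std::size_t kBytesPerValue = sizeof(std::uint32_t);  // 32-bit values
constexpr std::size_t kNumKernels = 4;         // nbody() kernels per subsystem
constexpr std::size_t kValuesPerPacket = 224;  // data values per input_i packet
constexpr std::size_t kInputJValues = 128;     // data values on input_j
constexpr std::size_t kParticles = 128;        // particles per subsystem instance

// Window size in bytes for a given number of 32-bit values.
constexpr std::size_t window_bytes(std::size_t num_values) {
    return num_values * kBytesPerValue;
}

// input_i and output_i carry one 224-value packet per kernel.
constexpr std::size_t kInputIValues = kValuesPerPacket * kNumKernels;
static_assert(kInputIValues == 896, "224 * 4 values");
static_assert(window_bytes(kInputIValues) == 3584, "896 * 4 bytes");
static_assert(window_bytes(kInputJValues) == 512, "128 * 4 bytes");

// Derived under the even-split assumption (not stated in the text).
constexpr std::size_t kParticlesPerKernel = kParticles / kNumKernels;
constexpr std::size_t kValuesPerParticle = kValuesPerPacket / kParticlesPerKernel;
static_assert(kParticlesPerKernel == 32, "128 particles over 4 kernels");
static_assert(kValuesPerParticle == 7, "224 values over 32 particles");

}  // namespace nbody_sizes
```

Under that assumption, each `nbody()` kernel handles 32 particles with seven 32-bit values per particle in its `input_i` packet.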