Why Packet Switching?

Vitis Tutorials: AI Engine (XD100)

Document ID
XD100
Release Date
2024-03-05
Version
2023.2 English

You might wonder why the design uses the 1:4/4:1 packet switching scheme. It works around an AI Engine architecture limit on the number of simultaneous input and output AXI-Streams allowed per AI Engine column. The AI Engine array has 50 columns, and each column contains 8 AI Engine tiles. Each column is allowed a maximum of six 32-bit AXI-Stream inputs and four 32-bit AXI-Stream outputs.

In the design, each nbody() kernel is mapped to one AI Engine tile. Without packet switching, each column of 8 AI Engine tiles would therefore need 9 input streams and 8 output streams, which violates these constraints:

  • 8 w_input_i input streams

  • 1 w_input_j input stream

  • 8 w_output_i output streams
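
The stream count above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the tutorial sources: the constant and function names are hypothetical, and the limits are simply the figures quoted in this section.

```python
# Figures quoted in the text above (names are illustrative only).
TILES_PER_COLUMN = 8
MAX_INPUTS_PER_COLUMN = 6
MAX_OUTPUTS_PER_COLUMN = 4


def dedicated_stream_counts(tiles=TILES_PER_COLUMN):
    """Return (inputs, outputs) for one column with no packet switching."""
    inputs = tiles + 1  # one w_input_i per tile, plus the shared w_input_j
    outputs = tiles     # one w_output_i per tile
    return inputs, outputs


inputs, outputs = dedicated_stream_counts()
print(f"inputs:  {inputs} needed, {MAX_INPUTS_PER_COLUMN} allowed")
print(f"outputs: {outputs} needed, {MAX_OUTPUTS_PER_COLUMN} allowed")
print("fits:", inputs <= MAX_INPUTS_PER_COLUMN and outputs <= MAX_OUTPUTS_PER_COLUMN)
```

Both counts exceed the per-column limits, so the final line reports that the dedicated-stream mapping does not fit.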

With the 1:4/4:1 packet switching scheme, you can combine 4 streams into 1. Because packet switching is applied on the w_input_i ports, the number of input streams into a single AI Engine column is reduced to three:

  • 1 input_i stream that goes to tiles 0-3 in a column

  • 1 input_i stream that goes to tiles 4-7 in a column

  • 1 input_j stream that is broadcast to all the columns

On the output side, the number of output streams is reduced to two:

  • 1 output_i stream coming from tiles 0-3 in a column

  • 1 output_i stream coming from tiles 4-7 in a column
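
The reduced counts follow from grouping the 8 tiles of a column into sets of 4. The sketch below recomputes them and adds a toy model of a 1:4 split. It is an illustration under stated assumptions, not the tutorial's implementation: all names are hypothetical, and the model assumes each packet carries an ID in the range 0-3 that selects its destination tile within a group.

```python
# Figures quoted in the text above (names are illustrative only).
TILES_PER_COLUMN = 8
MAX_INPUTS_PER_COLUMN = 6
MAX_OUTPUTS_PER_COLUMN = 4
FAN = 4  # the 1:4 / 4:1 ratio of the packet switching scheme


def switched_stream_counts(tiles=TILES_PER_COLUMN, fan=FAN):
    """Return (inputs, outputs) for one column with packet switching."""
    groups = -(-tiles // fan)  # ceiling division: 8 tiles / 4 = 2 groups
    inputs = groups + 1        # one input_i stream per group, plus input_j
    outputs = groups           # one output_i stream per group
    return inputs, outputs


def split_by_packet_id(packets, fan=FAN):
    """Toy 1:4 split: route (packet_id, payload) pairs to one of `fan` lanes.

    Assumption: the packet ID alone picks the destination lane.
    """
    lanes = [[] for _ in range(fan)]
    for packet_id, payload in packets:
        lanes[packet_id].append(payload)
    return lanes


inputs, outputs = switched_stream_counts()
print("fits:", inputs <= MAX_INPUTS_PER_COLUMN and outputs <= MAX_OUTPUTS_PER_COLUMN)
print(split_by_packet_id([(0, "a"), (3, "b"), (0, "c")]))
```

With 3 inputs and 2 outputs per column, both counts fall within the limits of 6 and 4. The 4:1 merge on the output side is the reverse operation: four tiles take turns sending tagged packets on one shared stream.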