Synchronous Window Connection for Multi-Rate Processing

AI Engine Kernel and Graph Programming Guide (UG1079)

Document ID: UG1079
Release Date: 2022-10-19
Version: 2022.2 English
In some designs, the output window of a kernel differs in size from the input window of the next kernel. In that case, the connection declaration specifies both the output window size and the input window size. For example:
connect< window<128> , window<192> > net0 ( kernel0.out[0] , kernel1.in[0] );
In this example, kernel0 writes 128 bytes per run and kernel1 expects 192 bytes per run. If each kernel ran once per graph iteration, kernel1 would read memory that had not been completely written, which would produce an erroneous result. To prevent this, the aiecompiler performs an automatic multi-rate analysis. Here it determines that kernel0 must run three times and kernel1 twice per graph iteration, so that both sides transfer 384 bytes (3 × 128 = 2 × 192).
This automatic multi-rate analysis is enabled by default in the aiecompiler. You can also set the repetition count for these kernels in the graph; user-specified values take precedence over the values that the aiecompiler infers automatically.
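The arithmetic behind these repetition counts can be sketched in plain C++. This is only an illustration of the balancing rule described above, not the aiecompiler's actual implementation; the `RepetitionCounts` type and the `balance` function are hypothetical names introduced for this example and are not part of the ADF API.

```cpp
#include <cassert>
#include <numeric>  // std::lcm (C++17)

// Hypothetical helper type for this sketch; not part of the ADF API.
struct RepetitionCounts {
    unsigned producer;  // runs of the kernel that writes the window
    unsigned consumer;  // runs of the kernel that reads the window
};

// Returns the smallest repetition counts for which the producer writes
// exactly as many bytes as the consumer reads in one graph iteration.
RepetitionCounts balance(unsigned out_bytes, unsigned in_bytes) {
    assert(out_bytes > 0 && in_bytes > 0);
    // Smallest byte count that is a whole number of windows on both sides.
    const unsigned total = std::lcm(out_bytes, in_bytes);
    return RepetitionCounts{total / out_bytes, total / in_bytes};
}
```

Calling `balance(128, 192)` gives a producer count of 3 and a consumer count of 2, which matches the values stated for kernel0 and kernel1.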
repetition_count(kernel0) = 3;
repetition_count(kernel1) = 2;

The --disable-multirate=true aiecompiler option disables the automatic multi-rate analysis. If this option is set, the entire design must be a single-rate design: the aiecompiler issues an error if the output window of a kernel differs in size from the input window of the following kernel.

Just as you can broadcast an output window to multiple input windows in the graph (automatic DMA insertion mechanism), you can also perform multi-rate processing on such a broadcast connection:
connect< window<128> , window<64> > net0 ( kernel0.out[0] , kernel1.in[0] );
connect< window<128> , window<192> > net1 ( kernel0.out[0] , kernel2.in[0] );
In this example, the AI Engine compiler automatically detects that, for one graph iteration (graph.run(1)), kernel0 should run three times, kernel1 six times, and kernel2 twice. These repetition counts can also be specified manually in the graph.
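The same balancing rule extends to a broadcast from one producer to several consumers. The following sketch assumes that the compiler picks the smallest counts that make every connection consistent; the `BroadcastCounts` type and the `balance_broadcast` function are hypothetical names for illustration and do not exist in the ADF API.

```cpp
#include <cassert>
#include <numeric>  // std::lcm (C++17)
#include <vector>

// Hypothetical result type for this sketch; not part of the ADF API.
struct BroadcastCounts {
    unsigned producer;                // runs of the broadcasting kernel
    std::vector<unsigned> consumers;  // runs of each receiving kernel, in order
};

// Computes the smallest repetition counts such that the producer's total
// output equals the total input consumed by every receiving kernel.
BroadcastCounts balance_broadcast(unsigned out_bytes,
                                  const std::vector<unsigned>& in_bytes) {
    assert(out_bytes > 0);
    // Smallest byte count divisible by every window size involved.
    unsigned total = out_bytes;
    for (unsigned n : in_bytes) {
        assert(n > 0);
        total = std::lcm(total, n);
    }
    BroadcastCounts counts{total / out_bytes, {}};
    for (unsigned n : in_bytes) {
        counts.consumers.push_back(total / n);
    }
    return counts;
}
```

For an output window of 128 bytes broadcast to input windows of 64 and 192 bytes, `balance_broadcast(128, {64, 192})` balances on 384 bytes and gives 3 runs for kernel0, 6 for kernel1, and 2 for kernel2.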
Note: Each kernel has a user-defined runtime<ratio> in the graph. If the repetition_count of the kernel is greater than one, whether determined by the compiler or defined by the user, the real run-time ratio that is taken into account for kernel placement is runtime<ratio> × repetition_count. This can change the placement of the kernels drastically. The run-time ratio as determined by the aiecompiler is reported in the aiecompiler log file when the options --log-level=5 and --verbose=true are set.
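As a worked instance of the formula in the note, the sketch below multiplies a run-time ratio by a repetition count. The function name `effective_runtime_ratio` and the input values are hypothetical and chosen only for illustration; the values the compiler actually uses should be read from the aiecompiler log.

```cpp
#include <cassert>

// Effective run-time ratio used for placement, following the note above:
// runtime<ratio> multiplied by repetition_count.
// Hypothetical helper for illustration; not part of the ADF API.
double effective_runtime_ratio(double runtime_ratio, unsigned repetition_count) {
    assert(runtime_ratio > 0.0);
    assert(repetition_count >= 1);
    return runtime_ratio * repetition_count;
}
```

With an illustrative runtime<ratio> of 0.25 and a repetition count of 3, the effective ratio is 0.75, three times the value written in the graph, which is why a multi-rate kernel can end up placed differently than its declared ratio alone would suggest.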