In some designs, the output window of a kernel can differ in size from the
input window of the next kernel. In that case, the connection declaration
specifies both the output window size and the input window size. For example:
connect< window<128> , window<192> > net0 ( kernel0.out[0] , kernel1.in[0] );
In this example, kernel0 writes 128 bytes per run, while kernel1 expects 192
bytes to have been written to the memory. If each kernel ran only once per
iteration, this mismatch would cause an erroneous result. To prevent this, the
aiecompiler performs an automatic multi-rate analysis. In this case, the
aiecompiler determines that kernel0 should run three times and kernel1 twice,
so that 384 bytes are both produced and consumed. This automatic multi-rate
analysis is enabled by default in the aiecompiler. You can also set the
repetition counts for these kernels in the graph; they take precedence over the
values automatically inferred by the aiecompiler:
repetition_count(kernel0) = 3;
repetition_count(kernel1) = 2;
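The counts 3 and 2 follow from balancing the bytes produced and consumed. The sketch below illustrates that arithmetic only. It assumes the smallest balanced schedule is wanted; it is not the aiecompiler's actual implementation. `RepetitionCounts` and `balance` are hypothetical names that do not exist in the ADF API.

```cpp
#include <cassert>
#include <numeric>  // std::lcm (C++17)

// Hypothetical helper type for illustration; not part of the ADF API.
struct RepetitionCounts {
    unsigned producer;  // runs of the kernel writing the output window
    unsigned consumer;  // runs of the kernel reading the input window
};

// Smallest run counts for which the bytes written by the producer equal
// the bytes read by the consumer (assumed balancing rule).
inline RepetitionCounts balance(unsigned out_bytes, unsigned in_bytes) {
    assert(out_bytes > 0 && in_bytes > 0);
    const unsigned total = std::lcm(out_bytes, in_bytes);
    return {total / out_bytes, total / in_bytes};
}
```

For `balance(128, 192)`, the least common multiple is 384 bytes, which gives 3 runs of kernel0 and 2 runs of kernel1, matching the repetition counts above.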
The --disable-multirate=true aiecompiler option disables the automatic
multi-rate analysis. If this option is set, the entire design must be a
single-rate design: the aiecompiler issues an error if the output window of a
kernel differs in size from the input window of the following kernel.
Just as you can broadcast an output window to multiple input windows in the
graph (automatic DMA insertion mechanism), you can also perform multi-rate
processing in this specific use case:
connect< window<128> , window<64> > net0 ( kernel0.out[0] , kernel1.in[0] );
connect< window<128> , window<192> > net1 ( kernel0.out[0] , kernel2.in[0] );
In this example, the AI Engine compiler automatically detects that kernel0
should run 3 times, kernel1 6 times, and kernel2 twice for one graph iteration,
graph.run(1). These repetition counts can also be specified manually in the
graph.
Note: Each kernel has a user-defined runtime<ratio> in the graph. If the
repetition_count of the kernel is greater than one, as determined by the
compiler or defined by the user, the effective run-time ratio taken into
account for kernel placement is runtime<ratio> x repetition_count. This can
change the placement of the kernels drastically. The run-time ratio as
determined by the aiecompiler is reported in the aiecompiler log file when the
options --log-level=5 and --verbose=true are set.
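The broadcast repetition counts and the effective run-time ratio from the note can be worked through with a short sketch. It assumes the counts come from the least common multiple of all window sizes on the net, which reproduces the values quoted above but is not confirmed as the compiler's method. `BroadcastCounts`, `balance_broadcast`, and `effective_ratio` are hypothetical names, not ADF API calls.

```cpp
#include <cassert>
#include <numeric>  // std::lcm (C++17)
#include <vector>

// Hypothetical result type for illustration; not part of the ADF API.
struct BroadcastCounts {
    unsigned producer;                // runs of the broadcasting kernel
    std::vector<unsigned> consumers;  // runs of each receiving kernel, in order
};

// Balance one output window against several input windows (assumed rule:
// every kernel processes the same total number of bytes per graph iteration).
inline BroadcastCounts balance_broadcast(unsigned out_bytes,
                                         const std::vector<unsigned>& in_bytes) {
    assert(out_bytes > 0);
    unsigned total = out_bytes;
    for (unsigned b : in_bytes) {
        assert(b > 0);
        total = std::lcm(total, b);
    }
    BroadcastCounts result{total / out_bytes, {}};
    for (unsigned b : in_bytes) {
        result.consumers.push_back(total / b);
    }
    return result;
}

// Effective ratio used for placement, per the note:
// runtime<ratio> x repetition_count.
inline double effective_ratio(double runtime_ratio, unsigned repetition_count) {
    return runtime_ratio * repetition_count;
}
```

For `balance_broadcast(128, {64, 192})` the common total is 384 bytes, giving 3 runs for kernel0, 6 for kernel1, and 2 for kernel2. With a hypothetical runtime<ratio> of 0.25 on kernel0, the ratio considered for placement would be 0.25 x 3 = 0.75.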