After being compiled by the Vitis AI compiler, the neural network model is transformed into an equivalent DPU assembly file, which is then assembled into an ELF object file by the Deep Neural Network Assembler (DNNAS). This DPU ELF object file is regarded as a DPU kernel, which becomes an execution unit from the perspective of the N2Cube runtime once the API dpuLoadKernel() is invoked. N2Cube loads the DPU kernel, including its DPU instructions and network parameters, into the DPU dedicated memory space and allocates the required hardware resources. After that, each DPU kernel can be instantiated into several DPU tasks by calling dpuCreateTask(), enabling multithreaded programming.
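The lifecycle above can be sketched with the legacy DNNDK N2Cube C API. This is a minimal, hedged example: the kernel name "resnet50" is an assumption for illustration and must match the name assigned when the model was compiled, and the code requires a board with the DPU runtime installed to actually run.

```c
/* Sketch of the DPU kernel/task lifecycle described above,
 * assuming the DNNDK N2Cube API and a compiled kernel named
 * "resnet50" (illustrative name, set at compile time). */
#include <dnndk/dnndk.h>

int main(void) {
    dpuOpen();                                /* attach to the DPU device     */

    /* Load the DPU kernel (DPU instructions + network parameters)
     * into the DPU dedicated memory space. */
    DPUKernel *kernel = dpuLoadKernel("resnet50");

    /* Instantiate the kernel as a task; several tasks can be
     * created from one kernel, e.g. one per worker thread. */
    DPUTask *task = dpuCreateTask(kernel, T_MODE_NORMAL);

    dpuRunTask(task);                         /* run inference on the DPU     */

    dpuDestroyTask(task);                     /* release the task             */
    dpuDestroyKernel(kernel);                 /* unload kernel, free DPU mem  */
    dpuClose();
    return 0;
}
```

For multithreaded use, each thread would typically call dpuCreateTask() on the shared kernel and run its own task independently.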