In the AI Engine kernel code (aie/kernels/hb24.cc), the interface is declared as:
void fir24_sym(input_window<cint16> * iwin, output_window<cint16> * owin, const int32(&coeffs)[12], int32(&coeffs_readback)[12]);
For the RTP array input, the array reference is declared const. In the graph, an RTP port can only be input or inout. An inout port in the graph can only be read by the PS program; it cannot be written by the PS program. Therefore, a second port, coeffs_readback, is defined to read back the coefficients.
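The reference semantics of the two RTP arguments can be sketched in plain C++, without the AI Engine headers. This is only an illustration under stated assumptions: fir24_sym_rtp_only and demo_readback_echo are hypothetical names, the window arguments are omitted, std::int32_t stands in for int32, the coefficient values are made up, and the body assumes the kernel simply copies the coefficients it received into coeffs_readback.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kTaps = 12;  // matches the [12] in the kernel signature

// Hypothetical stand-in for the RTP half of fir24_sym (windows omitted).
// const reference     -> read-only inside the kernel (RTP input).
// non-const reference -> written by the kernel (RTP inout, read back by the PS).
void fir24_sym_rtp_only(const std::int32_t (&coeffs)[kTaps],
                        std::int32_t (&coeffs_readback)[kTaps]) {
    for (std::size_t i = 0; i < kTaps; ++i) {
        coeffs_readback[i] = coeffs[i];  // assumed behavior: echo the coefficients in use
    }
}

// Usage example: the caller sees the echoed values in its own array.
void demo_readback_echo() {
    const std::int32_t narrow[kTaps] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12};  // made-up values
    std::int32_t readback[kTaps] = {};
    fir24_sym_rtp_only(narrow, readback);
    for (std::size_t i = 0; i < kTaps; ++i) {
        assert(readback[i] == narrow[i]);
    }
}
```

Because coeffs is a const reference, an accidental write to it inside the function fails to compile, which mirrors the read-only role of the input RTP.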
In the graph definition (aie/graph.h), the RTP declaration and connection are added as follows:
port<direction::in> coefficients;
port<direction::inout> coefficients_readback;
connect< parameter >(coefficients, async(fir24.in[1]));
connect< parameter >(async(fir24.inout[0]),coefficients_readback);
In aie/graph.cpp (for AI Engine simulator), the RTP update and read commands are:
gr.update(gr.coefficients, narrow_filter, 12);
gr.run(16); // start PL kernel & AIE kernel
gr.read(gr.coefficients_readback,coeffs_readback,12);
std::cout<<"Coefficients read back are:";
for(int i=0;i<12;i++){
std::cout<<coeffs_readback[i]<<",\t";
}
std::cout<<std::endl;
gr.wait(); // wait PL kernel & AIE kernel to complete
gr.read(gr.coefficients_readback,coeffs_readback,12);
std::cout<<"Coefficients read back are:";
for(int i=0;i<12;i++){
std::cout<<coeffs_readback[i]<<",\t";
}
std::cout<<std::endl;
gr.update(gr.coefficients, wide_filter, 12);
gr.run(16); // start PL kernel & AIE kernel
gr.read(gr.coefficients_readback,coeffs_readback,12);
std::cout<<"Coefficients read back are:";
for(int i=0;i<12;i++){
std::cout<<coeffs_readback[i]<<",\t";
}
std::cout<<std::endl;
Because the RTP read is asynchronous, there is no guarantee that a read issued while the kernels are running returns the updated coefficients. Only after the graph wait() call returns (a synchronization point) are the coefficients guaranteed to be updated and available for readback.
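The timing issue can be sketched with a small single-threaded model. MockGraph below is a hypothetical class, not the ADF graph API: its update, run, wait, and read methods only imitate the call sequence used in graph.cpp. It models the worst case, in which the kernel has not yet published its readback value when the early read happens; on real hardware or in the simulator the early read may or may not see the new values.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kNumCoeffs = 12;
using Coeffs = std::array<std::int32_t, kNumCoeffs>;

// Hypothetical single-threaded mock; it is NOT the ADF graph API.
class MockGraph {
public:
    // Stage new coefficients for the next run (stands in for an RTP update).
    void update(const Coeffs& c) { staged_ = c; }

    // Launch the kernel and return immediately. Completion is deferred
    // to wait() to model the worst-case timing.
    void run() { running_ = true; }

    // Synchronization point: the kernel has finished, so the readback
    // value is published.
    void wait() {
        if (running_) {
            readback_ = staged_;
            running_ = false;
        }
    }

    // Asynchronous read: returns whatever has been published so far.
    Coeffs read() const { return readback_; }

private:
    Coeffs staged_{};    // coefficients handed to the kernel
    Coeffs readback_{};  // value visible through the readback port
    bool running_ = false;
};

// Usage example mirroring the update / run / read / wait / read sequence.
void demo_async_read() {
    Coeffs narrow{};
    narrow.fill(3);  // made-up coefficient values
    MockGraph gr;
    gr.update(narrow);
    gr.run();
    assert(gr.read() != narrow);  // early read: stale values in this model
    gr.wait();
    assert(gr.read() == narrow);  // after the sync point: updated values
}
```

In this model only the read placed after wait() is guaranteed to match the coefficients passed to update(), which is the behavior the text describes for the read that follows gr.wait().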