In Versal™ ACAPs with AI Engines, the processing system (PS) can be used to dynamically load, monitor, and control the graphs that are executing on the AI Engine array. Even if the AI Engine graph is loaded once as a single bitstream image, the PS program can be used to monitor the state of the execution and modify the run-time parameters of the graph.
The graph base class provides a number of API methods to control the initialization and execution of the graph that can be used in the main program. See Adaptive Data Flow Graph Specification Reference for more details.
Basic Iterative Graph Execution
The following example shows how to use the graph control APIs to initialize, run, wait for, and terminate a graph for a specific number of iterations. A graph object mygraph is declared using a predefined graph class called simpleGraph. Then, in the main application, this graph object is initialized and run. The init() method loads the graph onto the AI Engine array at prespecified AI Engine tiles. This includes loading the ELF binaries for each AI Engine, configuring the stream switches for routing, and configuring the DMAs for I/O. It leaves the processors in a disabled state. The run() method starts the graph execution by enabling the processors. Supplying a positive integer argument to run() at run time runs the graph for that number of iterations. This form is useful for debugging your graph execution.
#include "project.h"

simpleGraph mygraph;

int main(void) {
    mygraph.init();
    mygraph.run(3);  // run 3 iterations
    mygraph.wait();  // wait for 3 iterations to finish
    mygraph.run(10); // run 10 iterations
    mygraph.end();   // wait for 10 iterations to finish
    return 0;
}
wait() is used to wait for the first run to finish before starting the second run. wait() has the same blocking effect as end() except that it allows the graph to be run again without having to re-initialize it. Calling run() back-to-back without an intervening wait() for the first run to finish can have an unpredictable effect because the run API modifies the loop bounds of the active processors of the graph.
Finite Execution of Graph
For finite graph execution, the graph state is maintained across successive graph.run(n) calls. The AI Engine is not reinitialized and memory contents are not cleared after graph.run(n). In the following code example, after the first run of three iterations, the core-main wrapper code is left in a state where the kernel will start with the pong buffer in the next run (of ten iterations). The ping-pong buffer selector state is left as-is. graph.end() does not clean up the graph state (specifically, it does not re-initialize global variables), nor does it clean up stream switch configurations. It merely exits the core-main. To re-run the graph after graph.end(), you have to reload the PDI/XCLBIN.
#include "project.h"

simpleGraph mygraph;

int main(void) {
    mygraph.init();
    mygraph.run(3);  // run 3 iterations
    mygraph.wait();  // wait for 3 iterations to finish
    mygraph.run(10); // run 10 iterations
    mygraph.end();   // wait for 10 iterations to finish
    return 0;
}
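The state carry-over described in this section can be pictured with a small host-side toy model. This is only a sketch under stated assumptions: ToyGraph, Buffer, next_buffer(), and reload() are hypothetical names made up for illustration and are not part of the ADF API. The model assumes the buffer selector starts at ping and alternates on every iteration, and that a run request after end() is simply rejected until a reload.

```cpp
// Toy model (NOT the ADF API): tracks which half of a ping-pong buffer
// a kernel would use next, and whether the graph can still be run.
#include <cstdint>

enum class Buffer { Ping, Pong };

struct ToyGraph {
    std::uint64_t iterations_done = 0; // persists across run() calls
    bool ended = false;                // set by end(), cleared only by reload()

    // Run n iterations; returns false if the graph has already ended.
    bool run(std::uint64_t n) {
        if (ended) return false;       // assumption: a reload is required first
        iterations_done += n;          // selector state is never reset here
        return true;
    }

    void wait() {}                     // blocks in the real API; no state change in this model
    void end() { ended = true; }       // exits core-main; does not clear any state

    // Models reloading the PDI/XCLBIN: everything starts from scratch.
    void reload() { iterations_done = 0; ended = false; }

    // Buffer that the next iteration would start with.
    Buffer next_buffer() const {
        return (iterations_done % 2 == 0) ? Buffer::Ping : Buffer::Pong;
    }
};
```

In this model, calling run(3) followed by wait() leaves next_buffer() at Buffer::Pong, which matches the behavior described for the second run of ten iterations. After end(), run() returns false until reload() is called.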
Infinite Graph Execution
The following example shows how to run the graph indefinitely.
#include "project.h"

simpleGraph mygraph;

int main(void) {
    mygraph.init(); // load the graph
    mygraph.run();  // start the graph
    return 0;
}
A graph object mygraph is declared using a predefined graph class called simpleGraph. Then, in the main application, this graph object is initialized and run. The init() method loads the graph onto the AI Engine array at prespecified AI Engine tiles. This includes loading the ELF binaries for each AI Engine, configuring the stream switches for routing, and configuring the DMAs for I/O. It leaves the processors in a disabled state. The run() method starts the graph execution by enabling the processors. This graph runs forever because the number of iterations to be run is not provided to the run() method.
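How the iteration count is chosen can be summarized in a few lines. The following is a hypothetical model, not the ADF implementation: IterationPolicy and kForever are illustrative names, and the rule that run() without an argument reuses the previously specified count is the one this section states.

```cpp
// Toy model (NOT the ADF API) of how many iterations a run() call executes.
constexpr long kForever = -1;  // sentinel meaning "no iteration bound"

struct IterationPolicy {
    long last_count = kForever;  // default before any count has been given

    // run(n): use the positive count n and remember it for later calls.
    long run(long n) {
        last_count = n;
        return last_count;
    }

    // run(): reuse the previously specified count, or run forever if none.
    long run() const { return last_count; }
};
```

With this model, run() on a fresh object yields kForever, while run(3) followed by run() yields 3 both times.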
graph::run() without an argument runs the AI Engine kernels for the previously specified number of iterations (which is infinity by default if the graph has only been run without arguments). If the graph was previously run with a finite number of iterations, for example, mygraph.run(3); mygraph.run(); the second run call will also run for three iterations.
Parallel Graph Execution
Among the above API methods, only the wait() and end() methods are blocking operations that can block the main application indefinitely. Therefore, if you declare multiple graphs at the top level, you need to interleave the APIs suitably to execute the graphs in parallel, as shown.
#include "project.h"

simpleGraph g1, g2, g3;

int main(void) {
    g1.init(); g2.init(); g3.init();
    g1.run(<num-iter>); g2.run(<num-iter>); g3.run(<num-iter>);
    g1.end(); g2.end(); g3.end();
    return 0;
}
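Why the call order matters can be seen with simple arithmetic on a host timeline. The sketch below is a model built on assumptions rather than a measurement: it treats run() as returning immediately, end() as blocking until that graph finishes, and the graphs as not competing for resources. The function names and the per-graph durations (in arbitrary time units) are hypothetical.

```cpp
#include <algorithm>
#include <vector>

// Host time at which the last end() returns when all graphs are started
// first and ended afterwards: run(); run(); run(); end(); end(); end();
// Every graph begins at t = 0, so the host waits only for the slowest one.
long interleaved_finish(const std::vector<long>& durations) {
    long finish = 0;
    for (long d : durations) finish = std::max(finish, d);
    return finish;
}

// Host time when each graph is started and ended before the next starts:
// run(); end(); run(); end(); ...
// Each end() blocks until its graph is done, so the durations add up.
long serialized_finish(const std::vector<long>& durations) {
    long t = 0;
    for (long d : durations) t += d;
    return t;
}
```

For made-up durations of 5, 8, and 3 units, the interleaved order finishes at 8 units while the serialized order finishes at 16.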
Note: Each graph should be started (run) only after it has been initialized (init). Also, to get parallel execution, all the graphs must be started (run) before any graph is waited upon for termination (end).
Timed Execution
In multi-rate graphs, not all kernels need to execute for the same number of iterations. In such situations, a timed execution model is more suitable for testing. There are variants of the wait and end APIs that take a positive integer specifying a cycle timeout. This is the number of AI Engine cycles that the API call will block before disabling the processors and returning. The blocking condition does not depend on any graph termination event. The graph can be in an arbitrary state at the expiration of the timeout.
#include "project.h"

simpleGraph mygraph;

int main(void) {
    mygraph.init();
    mygraph.run();
    mygraph.wait(10000); // wait for 10000 AI Engine cycles
    mygraph.resume();    // continue executing
    mygraph.end(15000);  // wait for another 15000 cycles and terminate
}
resume() is used to resume execution from the point it was stopped after the first timeout. resume() only resets the timer and enables the AI Engines. Calling resume() after the AI Engine execution has already terminated will have no effect.
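The timed sequence above can be traced with a small toy model. This is a sketch, not the ADF API: TimedGraph and its fields are hypothetical names, and the model assumes a graph started with run() and no iteration count, so it never finishes on its own before a timeout expires.

```cpp
// Toy model (NOT the ADF API) of timed wait/resume/end on a free-running graph.
struct TimedGraph {
    unsigned long long cycles = 0; // AI Engine cycles executed so far
    bool enabled = false;          // are the processors enabled?
    bool terminated = false;       // set once end() has been called

    void run() { if (!terminated) enabled = true; }

    // wait(timeout): block for `timeout` cycles, then disable the processors.
    void wait(unsigned long long timeout) {
        if (!enabled) return;
        cycles += timeout;
        enabled = false;
    }

    // resume(): re-enable the processors; no effect after termination.
    void resume() { if (!terminated) enabled = true; }

    // end(timeout): block for `timeout` cycles, then disable and terminate.
    void end(unsigned long long timeout) {
        if (enabled) cycles += timeout;
        enabled = false;
        terminated = true;
    }
};
```

Replaying run(), wait(10000), resume(), end(15000) on this model accumulates 25000 cycles in total, and a further resume() leaves the processors disabled.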