Version: Vitis 2023.1
This tutorial is an implementation of an N-Body Simulator in the AI Engine. It is a system-level design that uses the AI Engine, PL, and PS resources to showcase the following features:
A Python model of an N-Body Simulator that runs on an x86 machine
A scalable AI Engine design that can utilize up to 400 AI Engine tiles
AI Engine packet switching
AI Engine single-precision floating point calculations
AI Engine 1:400 broadcast streams
Codeless PL HLS datamover kernels from the AMD Vitis™ Utility Library
PL HLS packet switching kernels
PS Host Application that validates the data coming out of the AI Engine design
C++ model of an N-Body Simulator
Performance comparisons between Python x86, C++ Arm A72, and AI Engine N-Body Simulators
Effective throughput calculation (GFLOPS) vs. Theoretical peak throughput of AI Engine
Before You Begin
This tutorial can be run on the VCK190 Board (Production or ES). If you have already purchased this board, download the necessary files from the lounge and ensure you have the correct licenses installed. If you do not have a board, get in touch with your AMD sales contact.
Documentation: Explore AI Engine Architecture
Tools: Installing the Tools
Obtain a license to enable beta devices in AMD tools (to use the VCK190 platform).
Obtain licenses for AI Engine tools.
Follow the instructions for the Vitis Software Platform Installation and ensure you have the following tools:
Environment: Setting Up Your Shell Environment
When the elements of the Vitis software platform are installed, update the shell environment script. Set the necessary environment variables to your system specific paths for xrt, platform location, and AMD tools.
Edit the sample_env_setup.sh script with your file paths:
export XILINX_VITIS=<XILINX-INSTALL-LOCATION>/Vitis/<ver>
export PLATFORM=xilinx_vck190_base_<ver> #or xilinx_vck190_es1_base_<ver> if using an ES1 board
export DSPLIB_VITIS=<Path to Vitis Libs - Directory>
Source the environment script:
Validation: Confirming Tool Installation
Ensure you are using the 2023.1 version of the AMD tools.
Goals of this Tutorial
The goal of this tutorial is to create a general-purpose floating point accelerator for HPC applications. This tutorial demonstrates a 24,800x performance improvement using the AI Engine accelerator over the naive C++ implementation on the A72 embedded Arm® processor.
A similar accelerator example was implemented on the AMD UltraScale+™-based Ultra96 device using only PL resources.
|Name|Hardware|Average Execution Time to Simulate 12,800 Particles for 1 Timestep (seconds)|
|Python N-Body Simulator|x86 Linux Machine| |
|C++ N-Body Simulator|A72 Embedded Arm Processor| |
|AI Engine N-Body Simulator|Versal AI Engine IP| |
PL Data-Mover Kernels
Another goal of this tutorial is to showcase how to generate PL Data-Mover kernels from the AMD Vitis Utility Library. These kernels move any amount of data from DDR buffers to AXI4-Stream interfaces.
The N-Body Problem
The N-Body problem is the problem of predicting the motions of a group of N objects that each exert a gravitational force on one another. For any particle i in the system, the sum of the gravitational forces from all the other particles determines the acceleration of particle i. From this acceleration, we can calculate what the particle's position and velocity (x, y, z, vx, vy, vz) will be in the next timestep. Newtonian physics describes the behavior of very large bodies/particles within our universe. With certain assumptions, the laws can be applied to bodies/particles ranging from astronomical size down to a golf ball (and even smaller).
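The per-particle calculation described above can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the tutorial's Python model. The function name `nbody_step`, the tuple layout `(x, y, z, vx, vy, vz, m)`, the per-particle mass `m`, the `softening` term, and the explicit Euler update are all assumptions made for the example.

```python
import math


def nbody_step(particles, dt, g=1.0, softening=0.0):
    """Advance all particles by one timestep using a naive O(N^2) force sum.

    particles: list of (x, y, z, vx, vy, vz, m) tuples (hypothetical layout).
    g:         gravitational constant in whatever unit system the caller uses.
    softening: small value added to r^2 to avoid division by zero (assumption).
    """
    updated = []
    for i, (xi, yi, zi, vxi, vyi, vzi, mi) in enumerate(particles):
        ax = ay = az = 0.0
        # Sum the gravitational pull of every other particle j on particle i.
        for j, (xj, yj, zj, _, _, _, mj) in enumerate(particles):
            if i == j:
                continue
            dx, dy, dz = xj - xi, yj - yi, zj - zi
            r2 = dx * dx + dy * dy + dz * dz + softening
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            ax += g * mj * dx * inv_r3
            ay += g * mj * dy * inv_r3
            az += g * mj * dz * inv_r3
        # Explicit Euler: position uses the old velocity, velocity uses the
        # acceleration computed from the old positions.
        updated.append((
            xi + vxi * dt, yi + vyi * dt, zi + vzi * dt,
            vxi + ax * dt, vyi + ay * dt, vzi + az * dt,
            mi,
        ))
    return updated


# Two unit-mass particles at rest, two units apart on the x axis.
state = [(-1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0),
         (1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0)]
state = nbody_step(state, dt=1.0)
```

The nested loop makes the cost grow with the square of the particle count, which is why a naive implementation becomes slow for thousands of particles and why the inner force summation is a natural candidate for acceleration.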