Note: Vitis AI for Vitis 2023.1 has not been released yet. This section will be updated for Vitis 2023.1 soon after the matching Vitis AI version is released.
Vitis AI Demo
This test runs a Vitis AI test application from the DPU-TRD to verify DPU functionality on your custom platform. Most of the instructions below follow the Vitis AI DPU-TRD document.
Create the Design
Add the Vitis AI repository into the Vitis IDE
Launch the Vitis IDE if you have not already. You can reuse the workspace of the vadd application.
Select Window -> Preferences.
Go to the Library Repository tab.
Add Vitis AI:
Click the Add button.
Input ID: vitis-ai
Name: Vitis AI
Location: Assign a target download directory or keep empty. Vitis will use the default path ~/.Xilinx if this field is empty.
Git URL: https://github.com/Xilinx/Vitis-AI.git
Branch: Verify the branch with your platform. Use master for the Vitis AI version that matches Vitis 2021.1. You can use master for the latest patched version. Note that the master branch will move forward. At some point the master branch will point to a new release that may not be compatible with Vitis 2021.2.
Download the Vitis AI library.
Select Xilinx -> Libraries.
Find the Vitis AI entry that you just added. Click the Download button on it.
Wait until the Vitis AI repository finishes downloading.
Click OK to close this window.
The Vitis IDE will check the upstream status of each repository. If there are updates, it will allow you to download the updates if the source URL is a remote Git repository.
Download Vitis AI specific sysroot.
Vitis AI has a different release cycle from PetaLinux, so the Vitis AI related PetaLinux recipes are released later than PetaLinux itself. At the time this tutorial was released, the Vitis AI related recipes were not yet available, so you cannot build a PetaLinux sysroot/sdk with Vitis AI dependencies. Use the pre-built Vitis AI SDK instead.
Download the Vitis AI cross-compile environment setup script:
wget https://raw.githubusercontent.com/Xilinx/Vitis-AI/1.4/setup/mpsoc/VART/host_cross_compiler_setup.sh
Update the script for the installation area. The default install path is install_path=~/petalinux_sdk_2021.1. Since you are using PetaLinux 2021.2, it is recommended that you change it to install_path=~/petalinux_sdk_2021.2.
Run the script to set up the cross-compile environment:
./host_cross_compiler_setup.sh
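The install path update can also be scripted. The following is a minimal sketch, assuming the downloaded script contains a single line beginning with install_path= as described above. set_install_path is a hypothetical helper name, and the demonstration runs on a stand-in file rather than the real script.

```shell
#!/bin/sh
# Hypothetical helper (not part of Vitis AI): point the install_path=
# assignment of a setup script at a new directory.
set_install_path() {
    script="$1"
    new_path="$2"
    # '|' is the sed delimiter so that '/' in the path needs no escaping.
    # The result is written back with cat so the file mode is preserved.
    sed "s|^install_path=.*|install_path=${new_path}|" "$script" > "${script}.tmp" &&
        cat "${script}.tmp" > "$script" &&
        rm -f "${script}.tmp"
}

# Demonstration on a stand-in file that mimics the default assignment.
demo_script=$(mktemp)
printf '%s\n' '#!/bin/bash' 'install_path=~/petalinux_sdk_2021.1' > "$demo_script"
set_install_path "$demo_script" '~/petalinux_sdk_2021.2'
install_line=$(grep '^install_path=' "$demo_script")
echo "$install_line"
rm -f "$demo_script"
```

To apply it to the real script, call set_install_path host_cross_compiler_setup.sh '~/petalinux_sdk_2021.2'. The single quotes keep the tilde literal, so the script expands it when it runs.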
Once the Vitis AI recipes are released, this tutorial will be updated with the steps for building Vitis AI dependencies into the sysroot using PetaLinux.
Create a Vitis AI design on the zcu104_custom platform.
Go to menu File -> New -> Application Project.
Click Next on the Welcome page.
Select platform zcu104_custom. Click Next.
Name the project dpu_trd and click Next.
Set Domain to linux on psu_cortexa53. Set the Sys_root path to the sysroot installation path from the previous step, for example, ~/petalinux_sdk_2021.2/sysroots/cortexa72-cortexa53-xilinx-linux/. Set the Root FS to rootfs.ext4 and the Kernel Image to Image. These files are located in the zcu104_software_platform/sw_comp directory, which was generated in Step 2. Click Next.
Select dsa -> DPU Kernel (RTL Kernel) and click Finish to generate the application.
Update Build Target.
Double-click the system project file dpu_trd_system.sprj.
Change Active Build Configuration to Hardware.
Review and update DPU settings for ZCU104. The design created by default has the DPU settings for ZCU102.
Open dpu_conf.vh from the dpu_trd_kernels/src/prj/Vitis directory.
Update line 37 from URAM_DISABLE to URAM_ENABLE.
Press Ctrl+S to save the changes.
Note: The ZCU104 has a ZU7EV device on board. It has less block RAM than the ZU9EG on the ZCU102, but it has UltraRAM. Turning on UltraRAM fulfills the on-chip memory requirement of the DPU.
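The URAM setting can also be checked from a terminal. The sketch below assumes the setting appears in dpu_conf.vh as a Verilog define directive at the start of a line, which is an assumption about the file layout. show_uram_setting is a hypothetical helper, and the demonstration uses a stand-in file rather than the real header.

```shell
#!/bin/sh
# Hypothetical helper: print the line number and text of the URAM define.
show_uram_setting() {
    grep -n -E '^`define[[:space:]]+URAM_(ENABLE|DISABLE)' "$1"
}

# Stand-in for dpu_conf.vh; the real file is assumed to use the same directive.
conf=$(mktemp)
printf '%s\n' '// UltraRAM configuration' '`define URAM_DISABLE' > "$conf"

show_uram_setting "$conf"                       # setting before the edit
sed 's/^`define[[:space:]]*URAM_DISABLE/`define URAM_ENABLE/' "$conf" > "${conf}.new"
uram_after=$(show_uram_setting "${conf}.new")   # setting after the edit
echo "$uram_after"
rm -f "$conf" "${conf}.new"
```

Running show_uram_setting on the real dpu_conf.vh after saving is a quick way to confirm that the change took effect before starting a long hardware build.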
Update system_hw_link for proper kernel instantiation.
Since the ZCU104 has fewer LUT resources than the ZCU102, it is harder to meet timing closure if you include the softmax IP in the PL as on the ZCU102, and the implementation could take a long time. The Vitis AI DPU-TRD design therefore removes the hardware softmax IP for ZCU104. When the host application detects no softmax IP in hardware, it calculates softmax in software. The result is identical but the calculation time is different. Since your goal is to verify the platform, it is recommended that you remove the softmax kernel from your test application.
Double-click dpu_trd_system_hw_link.prj.
In the Hardware Functions window, remove the sfm_xrt_top instance by right-clicking it and selecting Remove.
After removing the sfm_xrt_top instance, the only remaining instance in the Hardware Functions window is DPUCZDX8G with Compute Units = 2.
Review system_hw_link v++ for proper kernel instantiation.
The DPU kernel requires two phase-aligned clocks, a 1x clock and a 2x clock. The configuration is stored in the example design. It sets up the clock and AXI interface connections between the DPU kernel and the platform.
To review the setup in the project, follow these steps:
Go to Assistant View.
Double-click dpu_trd_system [System].
Expand the left tree panel and find dpu_trd_system -> dpu_trd_system_hw_link -> Hardware -> dpu.
Click the ... button on the V++ Configuration Settings line. It shows a configuration like this:
[clock]
freqHz=300000000:DPUCZDX8G_1.aclk
freqHz=600000000:DPUCZDX8G_1.ap_clk_2
freqHz=300000000:DPUCZDX8G_2.aclk
freqHz=600000000:DPUCZDX8G_2.ap_clk_2
[connectivity]
sp=DPUCZDX8G_1.M_AXI_GP0:HPC0
sp=DPUCZDX8G_1.M_AXI_HP0:HP0
sp=DPUCZDX8G_1.M_AXI_HP2:HP1
sp=DPUCZDX8G_2.M_AXI_GP0:HPC0
sp=DPUCZDX8G_2.M_AXI_HP0:HP2
sp=DPUCZDX8G_2.M_AXI_HP2:HP3
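A quick consistency check on such a configuration can be sketched in shell. The example below is only an illustration: it writes the settings shown in the dialog to a scratch file (the file is temporary and its name is arbitrary) and counts the clock and sp entries per compute unit. It does not invoke v++.

```shell
#!/bin/sh
# Scratch copy of the configuration shown in the V++ Configuration Settings dialog.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
[clock]
freqHz=300000000:DPUCZDX8G_1.aclk
freqHz=600000000:DPUCZDX8G_1.ap_clk_2
freqHz=300000000:DPUCZDX8G_2.aclk
freqHz=600000000:DPUCZDX8G_2.ap_clk_2

[connectivity]
sp=DPUCZDX8G_1.M_AXI_GP0:HPC0
sp=DPUCZDX8G_1.M_AXI_HP0:HP0
sp=DPUCZDX8G_1.M_AXI_HP2:HP1
sp=DPUCZDX8G_2.M_AXI_GP0:HPC0
sp=DPUCZDX8G_2.M_AXI_HP0:HP2
sp=DPUCZDX8G_2.M_AXI_HP2:HP3
EOF

# Split each entry on '=', ':' and '.' and count clock (freqHz=) and
# port (sp=) entries for each compute unit name.
summary=$(awk -F'[=:.]' '
    /^freqHz=/ { clocks[$3]++ }
    /^sp=/     { ports[$2]++ }
    END {
        for (cu in clocks)
            printf "%s clocks=%d ports=%d\n", cu, clocks[cu], ports[cu]
    }' "$cfg" | sort)
echo "$summary"
rm -f "$cfg"
```

With the configuration above, each DPU compute unit reports two clocks (the 1x and 2x clocks) and three AXI ports.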
Note: The contents are written to dpu-link.cfg during build time and passed to the v++ Linker command line.
Note: To customize the v++ link configuration, you can add contents in the V++ configuration settings, or create your own configuration file and add --config <your_config_file.cfg> to the V++ Command Line Options field. If you need to use a relative path for the configuration file, the base location is the dpu_trd_system_hw_link/Hardware directory.
Update package options to add dependency models into SD Card
Double-click dpu_trd_system.sprj.
Click the … button on Package options.
Input --package.sd_dir=../../dpu_trd/src/app.
Click OK.
All content in the directory assigned to --package.sd_dir will be added to the FAT32 partition of sd_card.img. Samples and models are packaged for verification.
The dpu_trd in the path name is the application project name in this example. If your project name is different, update the project name accordingly.
Build the hardware design.
Select the dpu_trd_system system project.
Click the hammer button to build the system project.
The generated SD card image is located at dpu_trd_system/Hardware/package/sd_card.img.
Note: Refer to the Vitis AI document for details about the Vitis AI project creation flow.
Run Application on Board
Write image to SD card.
Copy sd_card.img to a local workstation or laptop with an SD card reader.
Write the image to the SD card with Balena Etcher or a similar tool.
Boot the board.
Insert the SD card into the ZCU104.
Set boot mode to SD boot.
Connect USB UART cable.
Power on the board. It should boot Linux properly in a minute.
Resize ext4 partition
Connect UART console if it is not connected.
On the ZCU104 board UART console, run df . to check the currently available disk size.
root@petalinux:~# df .
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/root 564048 398340 122364 77% /
Run resize-part /dev/mmcblk0p2 to resize the ext4 partition. Input Yes and 100% to confirm the resize and use all of the remaining space on the SD card.
root@petalinux:~# resize-part /dev/mmcblk0p2
/dev/mmcblk0p2
Warning: Partition /dev/mmcblk0p2 is being used. Are you sure you want to continue?
parted: invalid token: 100%
Yes/No? yes
End? [2147MB]? 100%
Information: You may need to update /etc/fstab.
resize2fs 1.45.3 (14-Jul-2019)
Filesystem at /dev/mmcblk0p2 is mounted on /media/sd-mmcblk0p2; o[ 72.751329] EXT4-fs (mmcblk0p2): resizing filesystem from 154804 to 1695488 blocks
n-line resizing required
old_desc_blocks = 1, new_desc_blocks = 1
[ 75.325525] EXT4-fs (mmcblk0p2): resized filesystem to 1695488
The filesystem on /dev/mmcblk0p2 is now 1695488 (4k) blocks long.
Check the available size again to verify that the ext4 partition has been enlarged.
root@petalinux:~# df . -h
Filesystem Size Used Available Use% Mounted on
/dev/root 6.1G 390.8M 5.4G 7% /
Note: The available size depends on the size of your SD card.
Note: resize-part is a script that you added in Step 2. It calls Linux utilities parted and resize2fs to extend the ext4 partition to take the rest of the SD card.
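To confirm from a script that the filesystem has enough room before copying the samples and model, the check can be sketched as follows. The 1 GiB threshold and the helper name avail_kb are illustrative assumptions, not values taken from the Vitis AI documentation.

```shell
#!/bin/sh
# Hypothetical helper: print the available space, in 1K blocks, of the
# filesystem that holds the given path (default: current directory).
avail_kb() {
    # -P keeps each filesystem on one line, so column 4 is always "Available".
    df -Pk "${1:-.}" | awk 'NR == 2 { print $4 }'
}

required_kb=1048576   # assumed threshold of 1 GiB; adjust to your needs
free_kb=$(avail_kb .)
if [ "$free_kb" -ge "$required_kb" ]; then
    echo "OK: ${free_kb} KB available"
else
    echo "Low space: ${free_kb} KB available; resize the partition first"
fi
```

Run from the home directory on the board, this reports on the same filesystem that df . inspects in the steps above.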
Copy dependency files to home folder.
# Libraries
root@petalinux:~# cp -r /mnt/sd-mmcblk0p1/app/samples/ ~
# Model
root@petalinux:~# cp /mnt/sd-mmcblk0p1/app/model/resnet50.xmodel ~
# Host app
root@petalinux:~# cp /mnt/sd-mmcblk0p1/dpu_trd ~
# Image to test
root@petalinux:~# cp /mnt/sd-mmcblk0p1/app/img/bellpeppe-994958.JPEG ~
Run the application.
root@petalinux:~# env LD_LIBRARY_PATH=samples/lib XLNX_VART_FIRMWARE=/mnt/sd-mmcblk0p1/dpu.xclbin ./dpu_trd bellpeppe-994958.JPEG
The output should show that bell pepper has the highest probability.
score[945] = 0.992235 text: bell pepper,
score[941] = 0.00315807 text: acorn squash,
score[943] = 0.00191546 text: cucumber, cuke,
score[939] = 0.000904801 text: zucchini, courgette,
score[949] = 0.00054879 text: strawberry,
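When the test is scripted, the top prediction can be pulled out of this output automatically. The sketch below assumes every result line has the form score[N] = VALUE text: LABEL, as in the sample output above. It feeds that sample through a hypothetical top_label helper instead of running the application.

```shell
#!/bin/sh
# Hypothetical helper: read "score[N] = VALUE text: LABEL," lines on stdin
# and print the label with the highest score.
top_label() {
    awk '
        /^score\[[0-9]+\] = / {
            value = $3 + 0                    # third field is the numeric score
            label = $0
            sub(/^.*text: */, "", label)      # keep only the label text
            sub(/, *$/, "", label)            # drop the trailing comma
            if (!seen || value > best) { best = value; best_label = label; seen = 1 }
        }
        END { if (seen) print best_label }
    '
}

# Sample lines copied from the expected output of the test application.
sample='score[945] = 0.992235 text: bell pepper,
score[941] = 0.00315807 text: acorn squash,
score[943] = 0.00191546 text: cucumber, cuke,
score[939] = 0.000904801 text: zucchini, courgette,
score[949] = 0.00054879 text: strawberry,'

top_result=$(printf '%s\n' "$sample" | top_label)
echo "$top_result"
```

On the board, the application output could be piped into top_label in the same way, assuming the scores are printed to standard output and are not interleaved with kernel messages.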
Detailed Log
[ 196.247066] [drm] Pid 948 opened device
[ 196.250926] [drm] Pid 948 closed device
[ 196.254833] [drm] Pid 948 opened device
[ 196.258679] [drm] Pid 948 closed device
[ 196.269515] [drm] Pid 948 opened device
[ 196.273384] [drm] Pid 948 closed device
[ 196.277243] [drm] Pid 948 opened device
[ 196.281076] [drm] Pid 948 closed device
[ 196.285073] [drm] Pid 948 opened device
[ 196.288984] [drm] Pid 948 closed device
[ 196.293230] [drm] Pid 948 opened device
[ 196.297096] [drm] Pid 948 closed device
[ 196.300963] [drm] Pid 948 opened device
[ 196.307660] [drm] zocl_xclbin_read_axlf The XCLBIN already loaded
[ 196.307672] [drm] zocl_xclbin_read_axlf 1cdede23-0755-458e-8dac-7ef1b3845fa4 ret: 0
[ 196.317747] [drm] bitstream 1cdede23-0755-458e-8dac-7ef1b3845fa4 locked, ref=1
[ 196.325431] [drm] Reconfiguration not supported
[ 196.337206] [drm] bitstream 1cdede23-0755-458e-8dac-7ef1b3845fa4 unlocked, ref=0
[ 196.337361] [drm] Pid 948 opened device
[ 196.348581] [drm] Pid 948 closed device
[ 196.352580] [drm] Pid 948 opened device
[ 196.356638] [drm] bitstream 1cdede23-0755-458e-8dac-7ef1b3845fa4 locked, ref=1
[ 196.356659] [drm] Pid 948 opened device
[ 196.367712] [drm] Pid 948 closed device
[ 196.371560] [drm] Pid 948 opened device
[ 196.375507] [drm] bitstream 1cdede23-0755-458e-8dac-7ef1b3845fa4 locked, ref=2
[ 196.375539] [drm] Pid 948 opened device
[ 196.386590] [drm] Pid 948 closed device
[ 196.390439] [drm] Pid 948 opened device
[ 196.394331] [drm] bitstream 1cdede23-0755-458e-8dac-7ef1b3845fa4 locked, ref=3
[ 196.394822] [drm] Pid 948 opened device
[ 196.405867] [drm] Pid 948 closed device
[ 196.409717] [drm] Pid 948 opened device
score[945] = 0.992235 text: bell pepper,
score[941] = 0.00315807 text: acorn squash,
score[943] = 0.00191546 text:[ 196.413579] [drm] bitstream 1cdede23-0755-458e-8dac-7ef1b3845fa4 locked, ref=4
cucumber, cuke,
score[939] = 0.000904801 text: zucchini, co[ 197.997865] [drm] bitstream 1cdede23-0755-458e-8dac-7ef1b3845fa4 unlocked, ref=3
urgette,
score[949] = 0.00054879 text: strawberry,
[ 198.010569] [drm] Pid 948 closed device
[ 198.032534] [drm] bitstream 1cdede23-0755-458e-8dac-7ef1b3845fa4 unlocked, ref=2
[ 198.032546] [drm] Pid 948 closed device
[ 198.229797] [drm] bitstream 1cdede23-0755-458e-8dac-7ef1b3845fa4 unlocked, ref=0
[ 198.229803] [drm] Pid 948 closed device
[ 198.241056] [drm] bitstream 1cdede23-0755-458e-8dac-7ef1b3845fa4 unlocked, ref=0
[ 198.241059] [drm] Pid 948 closed device
[ 198.252434] [drm] Pid 948 closed device
The XRT kernel messages can be suppressed by running echo 6 > /proc/sys/kernel/printk before launching the application.