Kernel Features - 2023.2 English

Vitis Tutorials: Hardware Acceleration (XD099)

Document ID
XD099
Release Date
2023-11-13
Version
2023.2 English

The krnl_aes kernel includes four AES modules, each of which is connected externally through an AXI-Stream slave port and an AXI-Stream master port. There is also an AXI slave port for kernel control. To simplify the design, all AES modules use the same settings (encrypt/decrypt mode, key length, and key value).

The krnl_aes kernel uses two clocks internally: one for the external AXI ports, and the other for the internal AXI ports and the AES cores. The AES cores run at a higher frequency than the platform AXI interconnect.

There are two methods to get the higher clock for the kernel. The first method is to use the secondary clock provided by the platform, namely the ap_clk_2 port. On Alveo Data Center accelerator cards, the target platform provides multiple clocks for the user kernel. For example, the Alveo U200 xilinx_u200_xdma_201830_2 platform provides two system clocks, ap_clk and ap_clk_2, whose default frequencies are 300 MHz and 500 MHz, respectively. These frequencies can be configured in the Vitis v++ link process. ap_clk_2 is generated by a standalone MMCM in the static region of the Alveo platform. The second method is to manually instantiate an MMCM inside the RTL kernel, which can provide additional flexibility for specific requirements.

The example design provided here uses the second method to generate the required clock. The krnl_aes kernel includes a customized MMCM module to generate a 400 MHz clock from the standard 300 MHz input clock provided by the platform.
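To make the 300 MHz to 400 MHz conversion concrete, the following minimal sketch enumerates integer MMCM settings that satisfy f_out = f_in * M / (D * O). It is an illustration only: the function name, the search bounds, and the 800 to 1600 MHz VCO window are assumptions made for this sketch, and the text above does not state which settings the krnl_aes MMCM actually uses. Check the device data sheet for the real limits of your part and speed grade.

```python
def mmcm_candidates(f_in_mhz, f_out_mhz,
                    vco_min_mhz=800, vco_max_mhz=1600,   # assumed VCO window
                    max_mult=64, max_div=8, max_out_div=128):  # assumed search bounds
    """Return integer (mult, div, out_div) triples such that
    f_in * mult / (div * out_div) == f_out and the VCO stays in range."""
    found = []
    for div in range(1, max_div + 1):
        for mult in range(2, max_mult + 1):
            # VCO frequency is f_in * mult / div; compare in integers to avoid rounding.
            if not (vco_min_mhz * div <= f_in_mhz * mult <= vco_max_mhz * div):
                continue
            for out_div in range(1, max_out_div + 1):
                if f_in_mhz * mult == f_out_mhz * div * out_div:
                    found.append((mult, div, out_div))
    return found


candidates = mmcm_candidates(300, 400)
for mult, div, out_div in candidates[:3]:
    vco = 300 * mult / div
    print(f"M={mult} D={div} O={out_div} VCO={vco:.0f} MHz")
```

Under these assumed limits, the first candidate found is a multiplier of 4, an input divider of 1, and an output divider of 3, which places the VCO at 1200 MHz.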

The AMD UltraScale+™ Alveo target platform is divided into a static region and a dynamic region. An MMCM instantiated by the user in the dynamic region is typically driven by an MMCM in the static region, which might also drive the platform bus clock. This usually causes large clock skew and makes it difficult for a synchronous design to meet timing. To ease timing closure, the modules driven by the user MMCM should operate asynchronously to the platform bus clock. In this example design, the AXI slave control module and the four AES engines all run in the 400 MHz clock domain, while the kernel connects to the standard 300 MHz platform clock domain. Altogether, nine AXI/AXIS clock converter IPs are used in the top level of the kernel: one AXI clock converter for the AXI control slave, four AXIS clock converters for the AXIS slave ports, and four AXIS clock converters for the AXIS master ports.

The following is the block diagram of the krnl_aes kernel.

[Figure: krnl_aes block diagram. Labeled blocks: clk_gen (MMCM); axi_ctrl_slave and aes_wrapper (Aes_0, Aes_1, Aes_2, Aes_3) in the Kernel Clock Domain; AXI clock converters and AXIS clock converters; AXI / AXIS Interface in the Platform AXI Clock Domain.]

XRT supports three kernel execution models for Vitis acceleration applications: ap_ctrl_none, ap_ctrl_hs, and ap_ctrl_chain. Refer to UG1416, Supported Kernel Execution Models, for more details. The krnl_aes RTL kernel mixes the ap_ctrl_none and ap_ctrl_hs modes. ap_ctrl_hs is used for the AES key expansion operation: the host starts the key expansion and waits for it to finish. ap_ctrl_none is used for general AES encryption/decryption: as soon as the kernel receives input data on an AXI-Stream slave port, it starts and finishes the encryption/decryption operation automatically and sends the output data to the AXI-Stream master port.

In krnl_aes, the following control signal waveform is implemented for ap_ctrl_hs mode. XRT asserts the ap_start signal to start kernel execution; ap_start then stays high until it is de-asserted by the ap_ready signal. ap_ready is a copy of the ap_done signal. Finally, ap_done is cleared by a read of the control register through the AXI slave port. Note that the XRT scheduler decides when to assert ap_start based on the current state of ap_start: when XRT detects that ap_start is de-asserted, it considers the kernel ready to receive a new ap_start request.

ap_ctrl_hs mode
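The handshake described above can be sketched as a small behavioral model. This is not the RTL and not an XRT API: the class and method names are hypothetical, and the ap_idle behavior (idle whenever no start request is pending) is a simplifying assumption. The bit positions are those of the CTRL register.

```python
AP_START = 1 << 0
AP_DONE = 1 << 1
AP_IDLE = 1 << 2
AP_READY = 1 << 3


class CtrlHsModel:
    """Hypothetical behavioral model of the ap_ctrl_hs control bits."""

    def __init__(self):
        self.ap_start = False
        self.ap_done = False

    def write_ctrl(self, value):
        # The host sets bit 0 to request a run; ap_start then stays high.
        if value & AP_START:
            self.ap_start = True

    def finish(self):
        # The kernel completes: ap_done is asserted, and ap_ready
        # (a copy of ap_done) de-asserts ap_start.
        if self.ap_start:
            self.ap_done = True
            self.ap_start = False

    def read_ctrl(self):
        value = 0
        if self.ap_start:
            value |= AP_START
        else:
            value |= AP_IDLE  # assumption: idle when no run is pending
        if self.ap_done:
            value |= AP_DONE | AP_READY
        self.ap_done = False  # ap_done is cleared by the register read
        return value


model = CtrlHsModel()
model.write_ctrl(AP_START)
running = model.read_ctrl()      # ap_start still high
model.finish()
finished = model.read_ctrl()     # ap_done and ap_ready visible, then cleared
after_clear = model.read_ctrl()  # only ap_idle remains
print(hex(running), hex(finished), hex(after_clear))
```

In this model the host sees ap_start drop once the run completes, which is the condition the XRT scheduler uses before issuing a new ap_start request.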

The following table lists all the control registers and kernel arguments accessible through the AXI slave port. This kernel does not support interrupts.

Name     Addr Offset  Width (bits)  Description
CTRL     0x000        5             Control signals:
                                    bit 0 - ap_start
                                    bit 1 - ap_done
                                    bit 2 - ap_idle
                                    bit 3 - ap_ready, a copy of ap_done in ap_ctrl_hs mode
                                    bit 4 - ap_continue, not used in ap_ctrl_hs mode
MODE     0x010        1             Kernel cipher mode:
                                    0 - decryption
                                    1 - encryption
KEY_LEN  0x018        2             AES key length:
                                    2'b00 = 128 bits
                                    2'b01 = 192 bits
                                    2'b10 = 256 bits
STATUS   0x020        4             Status of the 4 AES engines:
                                    0 - idle
                                    1 - busy
KEY_W7   0x028        32            AES key word 7
KEY_W6   0x030        32            AES key word 6
KEY_W5   0x038        32            AES key word 5
KEY_W4   0x040        32            AES key word 4
KEY_W3   0x048        32            AES key word 3. Not used when the key length is 128 bits
KEY_W2   0x050        32            AES key word 2. Not used when the key length is 128 bits
KEY_W1   0x058        32            AES key word 1. Not used when the key length is 128 or 192 bits
KEY_W0   0x060        32            AES key word 0. Not used when the key length is 128 or 192 bits
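As an illustration of how the register map might be used, the sketch below turns a key and a cipher mode into a list of (offset, value) register writes. The offsets and KEY_LEN encodings come from the table; the helper name is hypothetical, and the byte ordering (first four key bytes in KEY_W7, most significant byte first) is an assumption that the table does not confirm.

```python
REG_MODE = 0x010
REG_KEY_LEN = 0x018
REG_KEY_W7 = 0x028  # KEY_W7 .. KEY_W0 are spaced 8 bytes apart, up to 0x060

KEY_LEN_CODE = {128: 0b00, 192: 0b01, 256: 0b10}


def key_register_writes(key, encrypt):
    """Return a list of (offset, value) pairs that configure the kernel.

    Assumption: key bytes 0..3 go to KEY_W7 (big-endian), bytes 4..7 to
    KEY_W6, and so on; verify the ordering against the kernel RTL.
    """
    bits = len(key) * 8
    if bits not in KEY_LEN_CODE:
        raise ValueError("AES key must be 128, 192, or 256 bits long")
    writes = [
        (REG_MODE, 1 if encrypt else 0),
        (REG_KEY_LEN, KEY_LEN_CODE[bits]),
    ]
    for i in range(len(key) // 4):
        word = int.from_bytes(key[4 * i:4 * i + 4], "big")
        writes.append((REG_KEY_W7 + 8 * i, word))  # i = 0 targets KEY_W7
    return writes


writes_128 = key_register_writes(bytes(range(16)), encrypt=True)
for offset, value in writes_128:
    print(f"0x{offset:03X} <= 0x{value:08X}")
```

A 128-bit key touches only KEY_W7 through KEY_W4, a 192-bit key extends to KEY_W2, and a 256-bit key fills all eight words, which matches the "not used" notes in the table.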