Processing Pipeline - 2020.2 English

A memory-to-memory (M2M) pipeline reads video frames from memory, processes them, and then writes the processed frames back to memory. A block diagram of the processing pipeline is shown in the following figure.

Figure 1. M2M Processing Pipeline Showing Hardware Accelerator and Data Motion Network

There are two accelerators supported in this reference design:

2D convolution filter implemented in the PL along with a data mover (DM)
2D convolution filter implemented in the AIE along with a data mover (DM) in the PL

The memory-to-memory (M2M) processing pipeline with the 2D convolution filter is generated and integrated by the Vitis™ tool. The C-based 2D filter function is translated to RTL and then packaged as a kernel object (.xo) using Vitis™ HLS. The same flow applies to the data mover required for the 2D convolution filter in the AIE. The Cardano compiler generates the connectivity graph (.o) for the AIE and the program (the 2D convolution filter ELF) that executes on the AIE. The Vitis™ tool takes the .xo and .o outputs from these tools and integrates the IPs into the platform.

The data movers read input frames from the memory. The processing block runs convolution on the frame. Convolution is a common image processing technique that changes the intensity of a pixel to reflect the intensities of the surrounding pixels. This is widely used in image filters to achieve popular image effects like blur, sharpen, and edge detection.

The implemented algorithm uses a 3x3 kernel with programmable filter coefficients. The coefficients inside the kernel determine how to transform the pixels from the original image into the pixels of the processed image, as shown in the following figure.

Figure 2. 2D Convolution Filter with a 3x3 Kernel

The algorithm performs a two-dimensional (2D) convolution for each pixel of the input image with a 3x3 kernel. Convolution is the sum of products, one for each coefficient/source pixel pair. Because the reference design uses a 3x3 kernel, the result is the sum of nine products.
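As a small illustration of the sum of nine products, the following Python sketch computes the result for a single 3x3 neighborhood. The function name `convolve_pixel` and the two coefficient sets are hypothetical examples chosen for illustration; they are not taken from the reference design. The products are summed without flipping the kernel, which follows the coefficient/source pixel pairing described above.

```python
def convolve_pixel(window, kernel):
    """Return the sum of nine products for one 3x3 neighborhood.

    window: 3x3 list of source pixel intensities around the center pixel.
    kernel: 3x3 list of filter coefficients.
    """
    total = 0
    for r in range(3):
        for c in range(3):
            # One product per coefficient/source pixel pair.
            total += kernel[r][c] * window[r][c]
    return total


# Example coefficient sets (illustrative, not from the reference design).
IDENTITY = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
SHARPEN = [[0, -1, 0], [-1, 5, -1], [0, -1, 0]]

window = [[10, 10, 10],
          [10, 50, 10],
          [10, 10, 10]]

print(convolve_pixel(window, IDENTITY))  # 50
print(convolve_pixel(window, SHARPEN))   # 5*50 - 4*10 = 210
```

The identity coefficients return the center pixel unchanged, while the sharpen coefficients amplify the difference between the center pixel and its four direct neighbors.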

The result of this operation is the new intensity value of the center pixel in the output image. This scheme is repeated for every pixel of the image in raster-scan order, that is, line-by-line from top-left to bottom-right. In total, width x height 2D convolution operations are performed to process the entire image.
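The raster-scan scheme can be sketched in Python as follows. This is a minimal software model, not the HLS or AIE implementation. Two details are assumptions, because this section does not specify them: border pixels are handled by replicating the nearest edge pixel, and results are saturated to the 8-bit range 0 to 255. The name `filter2d` and the example coefficients are hypothetical.

```python
def filter2d(image, kernel):
    """Apply a 3x3 kernel to every pixel of a grayscale image.

    image: list of rows, each a list of 8-bit intensities.
    kernel: 3x3 list of filter coefficients.
    """
    height = len(image)
    width = len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):        # line by line, top to bottom
        for x in range(width):     # left to right within a line
            acc = 0
            for ky in range(3):
                for kx in range(3):
                    # Assumption: replicate edge pixels at the borders.
                    sy = min(max(y + ky - 1, 0), height - 1)
                    sx = min(max(x + kx - 1, 0), width - 1)
                    acc += kernel[ky][kx] * image[sy][sx]
            # Assumption: saturate the result to the 8-bit range.
            out[y][x] = min(max(acc, 0), 255)
    return out


# Illustrative sharpen coefficients (not from the reference design).
sharpen = [[0, -1, 0], [-1, 5, -1], [0, -1, 0]]
image = [[10, 10, 10],
         [10, 50, 10],
         [10, 10, 10]]

for row in filter2d(image, sharpen):
    print(row)
# [10, 0, 10]
# [0, 210, 0]
# [10, 0, 10]
```

The two outer loops run width x height times in total, once per output pixel, and each iteration evaluates the nine products for that pixel.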

The pixel format used in this design is YUYV, which is a packed format with 16 bits per pixel. Each pixel can be divided into two 8-bit components: one for luma (Y), the other for chroma (U/V alternating).

In this implementation, the 2D convolution filter processes only the Y component, which is essentially a grayscale image. The reason is that the human eye is more sensitive to intensity than to color. The combined U/V components, which account for the color, are merged back into the final output image unmodified. The processed frame is then written back to memory.
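The split-and-merge step can be modeled with the following Python sketch, assuming the standard YUYV byte order (Y0 U0 Y1 V0 ...). The function name `process_luma_only` and the `luma_fn` callback are hypothetical. The callback stands in for the 2D filter, and a simple inversion is used here only to keep the example short.

```python
def process_luma_only(frame, luma_fn):
    """Process the Y bytes of a packed YUYV buffer and keep U/V unchanged.

    frame: bytes in YUYV order, two bytes per pixel (Y0 U0 Y1 V0 ...).
    luma_fn: callable that maps a sequence of Y values to new Y values
             of the same length, each in the range 0 to 255.
    """
    luma = frame[0::2]     # even byte offsets hold the Y samples
    chroma = frame[1::2]   # odd byte offsets hold U and V, alternating
    new_luma = bytes(luma_fn(luma))
    out = bytearray(len(frame))
    out[0::2] = new_luma   # processed luma
    out[1::2] = chroma     # chroma merged back unmodified
    return bytes(out)


def invert(ys):
    """Placeholder luma operation used in place of the 2D filter."""
    return [255 - y for y in ys]


frame = bytes([16, 128, 32, 128, 64, 100, 235, 200])  # four pixels
result = process_luma_only(frame, invert)
print(list(result))  # [239, 128, 223, 128, 191, 100, 20, 200]
```

In a fuller model, `luma_fn` would reshape the Y samples into rows using the frame width, apply the 3x3 filter, and flatten the result again.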

Note: The 2D filter in the PL has the option of reading coefficients from memory (AXI MM is not shown in the figure). The 2D filter in the AIE only supports fixed coefficients corresponding to a Sobel filter.