The VCU encoder and decoder operate in slice mode for low-latency use cases. An input frame is divided horizontally into 8 or 16 slices, and the encoder generates a slice_done interrupt at the end of every slice. The generated NAL unit data can be passed to a downstream element immediately, without waiting for the whole frame to be encoded. Likewise, the VCU decoder starts processing as soon as one slice of data is ready in the decoder circular buffer, instead of waiting for the complete frame. The hardware Sync IP shown in the block diagram synchronizes the AXI reads and writes between the capture DMA and the VCU encoder.
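To make the slice arithmetic concrete, the following is a minimal sketch that splits a frame into horizontal slices and estimates how long one slice takes to arrive from the capture side. The function names `slice_row_ranges` and `slice_capture_time_ms` are hypothetical and not part of any VCU API. The sketch assumes rows are divided as evenly as possible and arrive at a uniform rate (blanking intervals and coding-block alignment are ignored).

```python
def slice_row_ranges(frame_height, num_slices):
    """Return (first_row, end_row) pairs, end exclusive, one per horizontal slice.

    Simplification: rows are split as evenly as possible. A real encoder aligns
    slice boundaries to coding-block rows, which this sketch ignores.
    """
    base, extra = divmod(frame_height, num_slices)
    ranges = []
    start = 0
    for index in range(num_slices):
        # The first `extra` slices absorb one leftover row each.
        rows = base + (1 if index < extra else 0)
        ranges.append((start, start + rows))
        start += rows
    return ranges


def slice_capture_time_ms(fps, num_slices):
    """Time to capture one slice, assuming rows arrive at a uniform rate."""
    return 1000.0 / fps / num_slices


if __name__ == "__main__":
    print(slice_row_ranges(2160, 8)[:2])
    print(round(slice_capture_time_ms(60, 8), 3))
```

Under these assumptions, a 2160-row frame split into 8 slices yields 270 rows per slice, and at 60 frames per second the first slice is fully captured after roughly 2.08 ms rather than after the full 16.67 ms frame period.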
The capture DMA writes video buffers in raster scan order. The Sync IP core monitors the buffer fill level while the capture DMA is writing into DRAM. If the data the encoder requests has already been written by the DMA, the Sync IP allows the read; otherwise, it blocks the encoder's AXI transactions until the capture DMA completes its writes to that section of the buffer.
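The gating rule described above can be modeled in software as a single write watermark, because raster-order writes fill the buffer monotonically. This is only an illustrative model of the behavior, not the Sync IP register interface; the class `SyncGateModel` and its method names are hypothetical.

```python
class SyncGateModel:
    """Toy model of a producer/consumer gate over one linearly written buffer."""

    def __init__(self, buffer_size):
        self.buffer_size = buffer_size
        self.bytes_written = 0  # how far the producer (capture DMA) has written

    def producer_write(self, nbytes):
        """Advance the write watermark, clamped to the buffer size."""
        self.bytes_written = min(self.buffer_size, self.bytes_written + nbytes)

    def consumer_can_read(self, offset, length):
        """A read is allowed only if the whole requested range is already written."""
        return offset + length <= self.bytes_written


if __name__ == "__main__":
    gate = SyncGateModel(buffer_size=4096)
    print(gate.consumer_can_read(0, 1024))   # nothing written yet, so blocked
    gate.producer_write(1024)
    print(gate.consumer_can_read(0, 1024))   # range fully written, so allowed
    print(gate.consumer_can_read(512, 1024)) # extends past the watermark, so blocked
```

In hardware the blocked case stalls the AXI transaction until the watermark passes the requested range; the model simply returns `False` where the stall would occur.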
On the decoder side, the VCU decoder writes decoded video buffers into DRAM in block raster scan order, and the display reads the data in raster scan order. The software maintains a phase difference of ~frame_period/2 between the start of VCU decoding and the start of the display read, so that the decoder stays ahead of the display. This is achieved by releasing decoded buffers early to the display stack and waiting half a frame duration at base-sink/kmssink before setting the plane for display.
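As a worked example of the half-frame phase difference, the sketch below computes the wait for a given frame rate and checks whether a decoder would stay ahead of the display. The function names are hypothetical, and the check rests on an idealized assumption: both the decoder write and the display read progress linearly through the frame, with the display taking one full frame period to scan out.

```python
def display_wait_ms(fps):
    """Half a frame period: the delay before the display starts reading."""
    return 1000.0 / fps / 2.0


def decoder_stays_ahead(fps, decode_time_ms):
    """Check the idealized linear model described above.

    The display reaches the last row at wait + frame_period, so the decoder
    must have finished the frame by then. With linear progress on both sides,
    meeting this bound at the last row also meets it for every earlier row.
    """
    frame_period_ms = 1000.0 / fps
    return decode_time_ms <= frame_period_ms + display_wait_ms(fps)


if __name__ == "__main__":
    print(round(display_wait_ms(60), 3))
    print(decoder_stays_ahead(60, 24.0))
    print(decoder_stays_ahead(60, 26.0))
```

At 60 frames per second the wait is about 8.33 ms. In this model a decode time of 24 ms still keeps the decoder ahead, while 26 ms would let the display overtake it near the bottom of the frame.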