DMA Interleaved Template Requirements

Linux Drivers

Release Date
2023-07-22

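struct dma_interleaved_template:

src_sgl = false
dst_sgl = true
numf = <height of the frame in rows>
frame_size = <1 or 2, depending on whether this describes a packed or semi-planar format>
sgl = <array of struct data_chunk, described below>

struct data_chunk:

sgl[0].size = <number of image data bytes in one row>
sgl[0].icg = <number of non-data bytes within a row of image data; padding>
sgl[0].dst_icg = <number of bytes between the end of the luma plane and the start of the chroma plane; semi-planar formats only>

Below is a code example for semi-planar YUV 422 (i.e. NV16) demonstrating how steps 1 and 2 of the above code snippet change in such a case: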
/* Step 1 - Configure the dma channel to write out semi-planar YUV 422 */
xilinx_xdma_v4l2_config(frmbuf_dma, V4L2_PIX_FMT_NV16M);
/* DRM clients: use xilinx_xdma_drm_config() with DRM_FORMAT_NV16 instead */

/* Step 2 - Describe the buffer attributes for a 1080p frame */
dma_tmplt.dir = DMA_DEV_TO_MEM; /* use DMA_MEM_TO_DEV for Framebuffer Read */
dma_tmplt.src_sgl = false;
dma_tmplt.dst_sgl = true;
dma_tmplt.dst_start = luma_addr;
dma_tmplt.frame_size = 2; /* two plane pixel format */
dma_tmplt.numf = 1080; /* height of luma frame */
 
dma_tmplt.sgl[0].size = 1920; /* 1 byte/pixel x 1920 pixels for Y plane */
dma_tmplt.sgl[0].icg = 0;
 
frame_height = dma_tmplt.numf;
stride = dma_tmplt.sgl[0].size + dma_tmplt.sgl[0].icg;
 
dma_tmplt.sgl[0].dst_icg = chroma_addr - luma_addr - (frame_height * stride);
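
The stride and chroma-gap arithmetic in step 2 can be checked in isolation. The following is a minimal userspace sketch, not driver code: struct plane_layout, row_stride() and set_chroma_gap() are hypothetical names that mirror only the numf, size, icg and dst_icg fields discussed above, and the overlap check is an assumption added for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the template fields used in step 2. */
struct plane_layout {
    size_t numf;    /* rows in the luma plane */
    size_t size;    /* image data bytes per row */
    size_t icg;     /* padding bytes per row */
    size_t dst_icg; /* bytes between end of luma plane and start of chroma plane */
};

/* Bytes from the start of one row to the start of the next. */
static size_t row_stride(const struct plane_layout *l)
{
    assert(l != NULL);
    return l->size + l->icg;
}

/*
 * Compute dst_icg from the two plane addresses.
 * Returns 0 on success, or -1 if the chroma plane would start
 * inside (or before) the luma plane.
 */
static int set_chroma_gap(struct plane_layout *l,
                          uint64_t luma_addr, uint64_t chroma_addr)
{
    uint64_t luma_bytes;

    assert(l != NULL);
    luma_bytes = (uint64_t)l->numf * row_stride(l);

    if (chroma_addr < luma_addr || chroma_addr - luma_addr < luma_bytes)
        return -1;

    l->dst_icg = (size_t)(chroma_addr - luma_addr - luma_bytes);
    return 0;
}
```

For the 1080p NV16 case above (numf = 1080, size = 1920, icg = 0), a chroma plane placed immediately after the luma plane gives dst_icg = 0; placing it 4096 bytes later gives dst_icg = 4096.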

Driver Operation

The Framebuffer driver manages buffer descriptors in software, keeping each one in one of four possible states, which it passes through in the following order:
  1. pending
  2. staged
  3. active
  4. done
When a DMA client calls dmaengine_submit(), the buffer descriptor is placed in the driver’s “pending” queue. The client can queue multiple buffers this way before proceeding to the next step (see step 4 of Interfacing with the Video Framebuffer Driver from DMA Clients). When dma_async_issue_pending() is called (step 5 in the client code sample above), the driver begins processing all queued buffers on the “pending” list.

A buffer is picked from the pending list and stored as “staged”. At this moment, the driver programs the IP registers with the data provided in the “staged” buffer descriptor. During normal processing (i.e. all frames except the first frame*), these values do not become active until the currently processed frame completes. As such, there is a one-frame delay between programming the registers and the actual writing of data to memory; hence the term “staged” for this part of the buffer lifecycle.

When the currently active frame completes, the staged buffer descriptor is reclassified as “active” in the driver. At this point, a new descriptor is picked from the pending list, marked as “staged”, and its values are programmed into the IP registers as described earlier. The buffer marked “active” represents the data currently being written to memory. Other than being held in the “active” state, no other action is taken with the buffer.

When the active frame completes, its descriptor is moved to the “done” list. The driver uses a tasklet, which is called at the end of the frame interrupt handler. The tasklet processes any buffer descriptors on the done list by removing them from the list and calling any callback the client has linked to the descriptor. This completes the life cycle of a buffer descriptor.

As can be seen, with four possible states, it is best to allocate at least four buffers to maintain consistent frame processing. Fewer buffers will leave gaps in the pipeline and cause the frame data in a given buffer to be overwritten one or more times (depending on how few buffers are queued and the number of resulting gaps in the driver’s buffer pipeline).
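
The descriptor lifecycle described above can be modelled as a small state machine. The sketch below is a simplified userspace model written for illustration only; it is not the driver's implementation. All names (struct frmbuf_model, model_submit() and so on) are hypothetical, descriptors are represented by plain integer IDs, and the special behaviour of the very first frame is ignored.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BUFS 8

enum { NO_BUF = -1 };

/* Hypothetical model: a FIFO of pending IDs, one staged slot,
 * one active slot, and a list of completed IDs. */
struct frmbuf_model {
    int pending[MAX_BUFS];
    size_t pending_count;
    int staged;
    int active;
    int done[MAX_BUFS];
    size_t done_count;
};

static void model_init(struct frmbuf_model *m)
{
    m->pending_count = 0;
    m->done_count = 0;
    m->staged = NO_BUF;
    m->active = NO_BUF;
}

/* Client submits a descriptor: it joins the tail of the pending queue. */
static void model_submit(struct frmbuf_model *m, int id)
{
    assert(m->pending_count < MAX_BUFS);
    m->pending[m->pending_count++] = id;
}

/* Move the oldest pending descriptor into the staged slot, if that slot
 * is free. In the real driver this is where registers are programmed. */
static void model_stage_next(struct frmbuf_model *m)
{
    size_t i;

    if (m->staged != NO_BUF || m->pending_count == 0)
        return;

    m->staged = m->pending[0];
    for (i = 1; i < m->pending_count; i++)
        m->pending[i - 1] = m->pending[i];
    m->pending_count--;
}

/* End-of-frame event: active -> done, staged -> active, pending -> staged. */
static void model_frame_done(struct frmbuf_model *m)
{
    if (m->active != NO_BUF) {
        assert(m->done_count < MAX_BUFS);
        m->done[m->done_count++] = m->active;
    }
    m->active = m->staged;
    m->staged = NO_BUF;
    model_stage_next(m);
}

/* Tasklet stand-in: pass each finished ID to the client callback (if any)
 * and empty the done list. Returns the number of descriptors completed. */
static size_t model_run_tasklet(struct frmbuf_model *m, void (*cb)(int id))
{
    size_t i, n = m->done_count;

    for (i = 0; i < n; i++)
        if (cb)
            cb(m->done[i]);
    m->done_count = 0;
    return n;
}
```

Submitting four IDs, calling model_stage_next() once (standing in for dma_async_issue_pending()), and then calling model_frame_done() twice leaves one descriptor in each of the done, active, staged and pending positions, which illustrates why at least four buffers keep the pipeline full.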

Buffer Alignment

The driver expects the buffer to be aligned to at least 8 * <pixels per clock> bytes. For example, if pixels per clock is 2, the buffer must be at least 16-byte aligned. If some other system component, such as the VCU, mandates a higher alignment (e.g. 32-byte), the user is expected to set this manually in the device tree using the xlnx,dma-align property. Refer to the device tree bindings documentation for details.
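
A client can sanity-check a buffer address against this rule before submitting it. The helpers below are a minimal sketch with hypothetical names, not part of the driver. The minimum follows the example above (2 pixels per clock gives 16 bytes); taking the larger of that minimum and an optional override value is an assumption of this sketch, since how the driver itself combines the two is not described here.

```c
#include <assert.h>
#include <stdint.h>

/* Minimum alignment in bytes: 8 * pixels per clock. */
static uint32_t frmbuf_min_align(uint32_t pixels_per_clock)
{
    return 8u * pixels_per_clock;
}

/* Alignment to use when an override (e.g. a value taken from the
 * xlnx,dma-align property) may be present; 0 means "no override".
 * Assumption: the larger of the two values applies. */
static uint32_t frmbuf_effective_align(uint32_t pixels_per_clock,
                                       uint32_t override_align)
{
    uint32_t min_align = frmbuf_min_align(pixels_per_clock);

    return override_align > min_align ? override_align : min_align;
}

/* Returns 1 if addr is a multiple of align (align must be a power of two). */
static int is_aligned(uint64_t addr, uint32_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr & ((uint64_t)align - 1)) == 0;
}

/* Round addr up to the next multiple of align (a power of two). */
static uint64_t align_up(uint64_t addr, uint32_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~((uint64_t)align - 1);
}
```

With 2 pixels per clock and no override, an address of 0x1008 fails the check and align_up() moves it to 0x1010.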
  * Note: normally, registers programmed while the IP is running do not take effect until the next frame. The very first frame, however, is an exception: the IP is not yet running and, as such, the values take effect immediately. Nevertheless, no special treatment is given to the first frame buffer. As such, it will, in effect, be written to twice.