Low Latency and AMD Low Latency

Multimedia User Guide (UG1449)

Document ID
UG1449
Release Date
2023-10-19
Revision
1.7 English

For a real-time experience in applications such as video conferencing and broadcasting, use the low-latency or AMD low-latency mode.

Low-Latency

The frame is divided into multiple slices. The VCU encoder output and decoder input are processed in slice mode, while the encoder input and decoder output still work in frame mode. The VCU encoder generates a slice-done interrupt at the end of every slice and outputs the stream buffer for that slice, which is immediately available to the next element for processing. With multiple slices, it is therefore possible to reduce VCU processing latency from one frame to one frame divided by the number of slices (one-frame/num-slices). In the low-latency mode, a maximum of four encoder streams and two decoder streams can be run.
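
As a rough illustration of the one-frame/num-slices figure above, the following sketch computes the frame period and the per-slice period in microseconds. The function name slice_latency_us is hypothetical, and the result is an idealized bound on VCU processing latency only: it ignores capture, network, jitter-buffer, and display delays.

```shell
#!/bin/sh
# Idealized per-slice period in microseconds (integer arithmetic):
#   1,000,000 us / (frames per second * slices per frame)
# slice_latency_us is an illustrative helper, not part of any AMD tool.
slice_latency_us() {
    fps=$1
    num_slices=$2
    echo $(( 1000000 / (fps * num_slices) ))
}

slice_latency_us 60 1   # one slice per frame at 60 fps  -> 16666
slice_latency_us 60 8   # num-slices=8 at 60 fps         -> 2083
```

For a 60 fps stream, eight slices bring the idealized encoder hand-off interval from about 16.7 ms down to about 2.1 ms.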

The Low-Latency 4kp60 HEVC streaming pipeline is as follows:

  • Stream Out:

    Use the following pipeline to stream out (capture > encode > stream-out) NV12 video using a low-latency GStreamer pipeline. It demonstrates how to stream low-latency-encoded video from one device (server) to another device (client) on the same network. The video is captured in the NV12 color format and encoded as H265. The stream resolution is 4K (3840x2160) at 60 fps, and the bitrate is 25 Mbps. The pipeline sends the video stream to the client device at IP address 192.168.25.89 on port 5004.

    gst-launch-1.0 -v v4l2src device=/dev/video0 io-mode=4 ! \
    video/x-raw,format=NV12,width=3840,height=2160,framerate=60/1 ! \
    omxh265enc num-slices=8 periodicity-idr=240 cpb-size=500 \
    gdr-mode=horizontal initial-delay=250 control-rate=low-latency \
    prefetch-buffer=true target-bitrate=25000 gop-mode=low-delay-p ! \
    video/x-h265,alignment=nal ! rtph265pay ! \
    udpsink buffer-size=60000000 host=192.168.25.89 port=5004 async=false \
    max-lateness=-1 qos-dscp=60 max-bitrate=120000000 -v

    Important: Replace the host IP address with the IP of the client device.

  • Stream In:

    Use the following pipeline to stream in (stream-in > decode > display) NV12 video using a low-latency GStreamer pipeline. It demonstrates how the low-latency stream is received, decoded, and displayed on the client device. The pipeline receives the H265-encoded stream over UDP on port 5004 of the client device.

    gst-launch-1.0 udpsrc port=5004 buffer-size=60000000 \
    caps="application/x-rtp, media=video, clock-rate=90000, payload=96, encoding-name=H265" ! \
    rtpjitterbuffer latency=7 ! rtph265depay ! h265parse ! \
    video/x-h265,alignment=nal ! \
    omxh265dec low-latency=1 internal-entropy-buffers=5 ! \
    video/x-raw ! queue max-size-bytes=0 ! \
    fpsdisplaysink name=fpssink text-overlay=false \
    'video-sink=kmssink bus-id=a0070000.v_mix hold-extra-sample=1 show-preroll-frame=false sync=true' \
    sync=true -v
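
Because the stream-out command above hard-codes the client address, one convenient approach is to wrap it in a small helper that takes the address as an argument. The sketch below only prints the command so that it can be reviewed before it is run on the server. The function name build_stream_out and its arguments are hypothetical; every element and property is copied from the stream-out pipeline above.

```shell
#!/bin/sh
# Hypothetical helper: prints the low-latency stream-out command for a client.
#   $1 = client IP address, $2 = client UDP port (defaults to 5004)
build_stream_out() {
    client_ip=$1
    client_port=${2:-5004}
    # printf repeats the '%s ' format for every argument, so the
    # pipeline is emitted as one space-separated line.
    printf '%s ' \
        gst-launch-1.0 -v v4l2src device=/dev/video0 io-mode=4 '!' \
        'video/x-raw,format=NV12,width=3840,height=2160,framerate=60/1' '!' \
        omxh265enc num-slices=8 periodicity-idr=240 cpb-size=500 \
        gdr-mode=horizontal initial-delay=250 control-rate=low-latency \
        prefetch-buffer=true target-bitrate=25000 gop-mode=low-delay-p '!' \
        'video/x-h265,alignment=nal' '!' rtph265pay '!' \
        udpsink buffer-size=60000000 "host=$client_ip" "port=$client_port" \
        async=false max-lateness=-1 qos-dscp=60 max-bitrate=120000000
    printf '\n'
}

# Dry run: print the command for the client used in this section.
build_stream_out 192.168.25.89
```

If a port other than 5004 is passed as the second argument, the port value in the stream-in command on the client has to be changed to match.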

AMD Low-Latency

In the low-latency mode, the VCU encoder and decoder work at the subframe or slice level, but the other components at the encoder input and decoder output, namely the capture DMA and display DMA, still work at the frame level. This means that the encoder can read input data only after capture has finished writing the full frame.

In the AMD low-latency mode, capture and display also work at the subframe level, which reduces the pipeline latency significantly. This is made possible by having the producer (capture DMA) and the consumer (VCU encoder) work on the same input buffer concurrently while staying synchronized: a consumer read request is unblocked only when the producer has finished writing the data required for that read request.
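
The synchronization rule described above can be pictured with a small simulation. This is a conceptual sketch only and does not model the actual hardware: it assumes the frame is split into equal bands of rows (real slice boundaries follow the codec's block structure), and the chunk size and all function names are hypothetical.

```shell
#!/bin/sh
# Conceptual simulation: "rows" stand in for data written by the capture DMA.
FRAME_ROWS=2160                            # 4K frame height used in this section
NUM_SLICES=8                               # matches num-slices=8
CHUNK_ROWS=90                              # hypothetical producer write granularity
SLICE_ROWS=$(( FRAME_ROWS / NUM_SLICES ))  # 270 rows per band

# Succeeds only when the producer has written every row the read needs.
read_allowed() {
    written=$1
    needed=$2
    [ "$written" -ge "$needed" ]
}

simulate_frame() {
    rows_written=0
    next_slice=1
    while [ "$next_slice" -le "$NUM_SLICES" ]; do
        # Producer writes one more chunk into the shared buffer.
        rows_written=$(( rows_written + CHUNK_ROWS ))
        # Consumer is unblocked for every slice whose rows are complete.
        while [ "$next_slice" -le "$NUM_SLICES" ] &&
              read_allowed "$rows_written" $(( next_slice * SLICE_ROWS )); do
            echo "rows_written=$rows_written consumer reads slice $next_slice"
            next_slice=$(( next_slice + 1 ))
        done
    done
}

simulate_frame
```

In this model the consumer starts on the first slice after 270 rows have been written instead of waiting for all 2160, which is the source of the latency reduction.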

  • Stream Out:

    Use the following pipeline to stream out (capture > encode > stream-out) NV12 video using an AMD ultra low-latency GStreamer pipeline. It demonstrates how to stream AMD ultra low-latency encoded video from one device (server) to another device (client) on the same network. The video is captured in the NV12 color format and encoded as H265. The stream resolution is 4K (3840x2160) at 60 fps, and the bitrate is 25 Mbps. The pipeline sends the video stream to the client device at IP address 192.168.25.89 on port 5004.

    gst-launch-1.0 -v v4l2src device=/dev/video0 io-mode=4 ! \
    video/x-raw\(memory:XLNXLL\),format=NV12,width=3840,height=2160,framerate=60/1 ! \
    omxh265enc num-slices=8 periodicity-idr=240 cpb-size=500 \
    gdr-mode=horizontal initial-delay=250 control-rate=low-latency \
    prefetch-buffer=true target-bitrate=25000 gop-mode=low-delay-p ! \
    video/x-h265,alignment=nal ! rtph265pay ! \
    udpsink buffer-size=60000000 host=192.168.25.89 port=5004 async=false \
    max-lateness=-1 qos-dscp=60 max-bitrate=120000000 -v

    Important: Replace the host IP address with the IP of the client device.

  • Stream In:

    Use the following pipeline to stream in (stream-in > decode > display) NV12 video using an AMD ultra low-latency GStreamer pipeline. It demonstrates how the AMD ultra low-latency stream is received, decoded, and displayed on the client device. The pipeline receives the H265-encoded stream over UDP on port 5004 of the client device.

    gst-launch-1.0 udpsrc port=5004 buffer-size=60000000 \
    caps="application/x-rtp, media=video, clock-rate=90000, payload=96, encoding-name=H265" ! \
    rtpjitterbuffer latency=7 ! rtph265depay ! h265parse ! \
    video/x-h265,alignment=nal ! \
    omxh265dec low-latency=1 internal-entropy-buffers=5 ! \
    video/x-raw\(memory:XLNXLL\) ! queue max-size-bytes=0 ! \
    fpsdisplaysink name=fpssink text-overlay=false \
    'video-sink=kmssink bus-id=a0070000.v_mix hold-extra-sample=1 show-preroll-frame=false sync=true' \
    sync=true -v

For more details, see H.264/H.265 Video Codec Unit LogiCORE IP Product Guide (PG252).