Completion Status/Interrupt Moderation - 4.0 English

QDMA Subsystem for PCI Express Product Guide (PG302)

Document ID: PG302
Release Date: 2022-05-20
Version: 4.0 English

The QDMA Subsystem for PCIe provides a means to moderate the Completion interrupts and Completion Status writes on a per-queue basis. The software selects one of the modes described below for each queue. The selected mode for a queue is stored in the QDMA Subsystem for PCIe in the Completion ring context for that queue. After a mode has been selected for a queue, the driver can select a different mode whenever it sends the Completion ring CIDX update to the QDMA.

Completion interrupt moderation is handled by the Completion engine, which stores the Completion ring contexts of all the queues. The sending of interrupts and the sending of Completion Statuses can each be enabled or disabled individually for every queue, and these settings are held in the Completion ring context. The modes described here moderate not only interrupts but also Completion Status writes. Because interrupts and Completion Status writes are enabled per queue, a mode takes effect for an interrupt or a Completion Status write only if that output is enabled in the Completion context for that queue.
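The gating described above can be pictured with a minimal Python sketch. The class `CmptContext`, its fields `en_int` and `en_stat_desc`, and the function `moderated_outputs` are hypothetical names chosen for illustration; they are not taken from the register map or from any driver.

```python
from dataclasses import dataclass


@dataclass
class CmptContext:
    """Hypothetical subset of a Completion ring context (illustrative names)."""
    en_int: bool        # interrupts enabled for this queue
    en_stat_desc: bool  # Completion Status writes enabled for this queue


def moderated_outputs(ctx: CmptContext, trigger_met: bool) -> dict:
    """Return which outputs would be emitted for this queue.

    An output is produced only when the mode's trigger condition is met
    AND that output is enabled in the queue's context.
    """
    return {
        "interrupt": trigger_met and ctx.en_int,
        "status_write": trigger_met and ctx.en_stat_desc,
    }


# Example: interrupts enabled, Completion Status writes disabled.
ctx = CmptContext(en_int=True, en_stat_desc=False)
print(moderated_outputs(ctx, trigger_met=True))
```

The point of the sketch is only that the mode and the per-queue enables are independent inputs: a mode cannot produce an output that the context has disabled.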

The QDMA Subsystem for PCIe keeps only one interrupt outstanding per queue. The QDMA enforces this policy even if all other conditions to send an interrupt have been met for the mode. The QDMA Subsystem for PCIe considers an interrupt serviced when it receives a CIDX update for that queue from the driver.

The basic policy followed in all the interrupt moderation modes is that when there is no interrupt outstanding for a queue, the QDMA Subsystem for PCIe keeps monitoring for the trigger conditions of that mode to be met. Once the conditions are met, an interrupt is sent out. While the QDMA subsystem is waiting for the interrupt to be serviced, it remains sensitive to interrupt conditions being met and remembers them. When the CIDX update is received, the QDMA subsystem evaluates whether the conditions are still met. If they are, another interrupt is sent out. If they are not, no interrupt is sent out and the QDMA resumes monitoring for the conditions to be met again.
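The policy above can be sketched as a small state machine. This is a toy model under stated assumptions, not the hardware implementation: the class `QueueIrqModel` and its attributes are hypothetical, the indices are treated as ever-increasing counters (ring wrap is ignored), and "unread entries exist" stands in for whatever trigger condition the selected mode uses.

```python
class QueueIrqModel:
    """Toy model of the one-outstanding-interrupt policy for one queue."""

    def __init__(self):
        self.pidx = 0             # entries written by hardware (no wrap)
        self.cidx = 0             # entries consumed, per last CIDX update
        self.outstanding = False  # an interrupt is waiting for a CIDX update
        self.sent = 0             # interrupts sent so far

    def _condition_met(self):
        # Assumed stand-in for a mode's trigger condition.
        return self.pidx != self.cidx

    def _maybe_send(self):
        # Send only when nothing is outstanding and the condition holds.
        if not self.outstanding and self._condition_met():
            self.outstanding = True
            self.sent += 1

    def completion_written(self):
        self.pidx += 1
        self._maybe_send()

    def cidx_update(self, new_cidx):
        self.cidx = new_cidx
        self.outstanding = False  # the interrupt is now considered serviced
        self._maybe_send()        # re-evaluate the condition


q = QueueIrqModel()
for _ in range(3):
    q.completion_written()   # only the first write raises an interrupt
q.cidx_update(2)             # one entry still unread: a second interrupt
q.cidx_update(3)             # ring drained: no further interrupt
print(q.sent, q.outstanding)
```

In this run the three writes produce a single interrupt, the partial CIDX update produces a second one, and the final update leaves the queue idle and monitoring again.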

Note that the interrupt moderation modes that the QDMA subsystem provides are not necessarily precise. For example, if the user application sends two CMPT packets, each with an indication to send an interrupt, two interrupts are not necessarily generated. The main reason for this behavior is that when the driver is interrupted to read the Completion ring, it is under no obligation to read exactly up to the Completion for which the interrupt was generated. The driver may stop short of the interrupting Completion, or it may read beyond the interrupting Completion descriptor if there are valid descriptors to be read there. The QDMA Subsystem for PCIe must therefore re-evaluate the trigger conditions every time it receives the CIDX update from the driver.
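A short simulation can illustrate why two user-triggered CMPT packets may yield only one interrupt. It is a sketch under assumptions: the class `UserTriggerQueue` is hypothetical, indices do not wrap, and a user trigger is assumed to be "still met" at a CIDX update only if the driver has not yet read past the last triggering entry. The actual re-evaluation rule is defined by the hardware flowcharts, not by this model.

```python
class UserTriggerQueue:
    """Toy model of a queue whose only trigger is the user indication."""

    def __init__(self):
        self.pidx = 0               # entries written (no wrap)
        self.cidx = 0               # entries consumed by the driver
        self.last_trigger_pidx = 0  # position just past the last triggering entry
        self.outstanding = False
        self.interrupts = 0

    def _evaluate(self):
        # Assumption: the trigger is pending while the driver has not
        # consumed the last entry that carried a user trigger.
        pending = self.last_trigger_pidx > self.cidx
        if pending and not self.outstanding:
            self.outstanding = True
            self.interrupts += 1

    def cmpt_packet(self, user_trigger):
        self.pidx += 1
        if user_trigger:
            self.last_trigger_pidx = self.pidx
        self._evaluate()

    def cidx_update(self, cidx):
        self.cidx = cidx
        self.outstanding = False
        self._evaluate()


# Case 1: the driver reads beyond the first interrupting Completion.
a = UserTriggerQueue()
a.cmpt_packet(user_trigger=True)   # first interrupt is sent
a.cmpt_packet(user_trigger=True)   # remembered; one interrupt is outstanding
a.cidx_update(2)                   # both entries read: no second interrupt

# Case 2: the driver reads only up to the first Completion.
b = UserTriggerQueue()
b.cmpt_packet(user_trigger=True)
b.cmpt_packet(user_trigger=True)
b.cidx_update(1)                   # second trigger still pending: interrupt

print(a.interrupts, b.interrupts)
```

The same two packets produce a different number of interrupts depending only on how far the driver chose to read, which is the imprecision the text describes.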

The detailed description of each mode is given below:

TRIGGER_EVERY
This mode is the most aggressive in terms of interruption frequency. The idea behind this mode is to send an interrupt whenever the completion engine determines that an unread completion descriptor is present in the Completion ring.
TRIGGER_USER
The QDMA Subsystem for PCIe provides a way to send a CMPT packet to the subsystem with an indication to send out an interrupt when the subsystem is done sending the packet to the host. This allows the user application to perform interrupt moderation when the TRIGGER_USER mode is set.
TRIGGER_USER_COUNT
In this mode, the QDMA Subsystem for PCIe is sensitive to either of two triggers. One of these triggers is sent by the user along with the CMPT packet. The other trigger is the presence of more than a programmed threshold of unread Completion entries in the Completion Ring, as seen by the hardware. This threshold is driver programmable on a per-queue basis. The QDMA evaluates whether or not to send an interrupt when either of these triggers is detected. As explained in the preceding sections, other conditions must be satisfied in addition to the triggers for an interrupt to be sent.
TRIGGER_USER_TIMER
In this mode, the QDMA Subsystem for PCIe is sensitive to either of two triggers. One of these triggers is sent by the user along with the CMPT packet. The other trigger is the expiration of the timer that is associated with the CMPT queue. The period of the timer is driver programmable on a per-queue basis. The QDMA evaluates whether or not to send an interrupt when either of these triggers is detected. As explained in the preceding sections, other conditions must be satisfied in addition to the triggers for an interrupt to be sent. For more information, see Completion Timer.
TRIGGER_USER_TIMER_COUNT
In this mode, the QDMA Subsystem for PCIe is sensitive to any of three triggers. The first trigger is sent by the user along with the CMPT packet. The second trigger is the expiration of the timer that is associated with the CMPT queue. The period of the timer is driver programmable on a per-queue basis. The third trigger is the presence of more than a programmed threshold of unread Completion entries in the Completion Ring, as seen by the hardware. This threshold is driver programmable on a per-queue basis. The QDMA evaluates whether or not to send an interrupt when any of these triggers is detected. As explained in the preceding sections, other conditions must be satisfied in addition to the triggers for an interrupt to be sent.
TRIGGER_DIS
In this mode, the QDMA Subsystem for PCIe does not send Completion interrupts even if they are enabled for a given queue. The driver can read the Completion ring in this case only by polling the ring regularly. Because this mode also disables the sending of any Completion Status descriptors to the Completion ring, the driver must use the color bit feature provided in the Completion ring when this mode is set.
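The trigger sets of the modes listed above can be summarized in a small Python sketch. The enum `TrigMode` and the function `trigger_detected` are hypothetical and do not reflect the hardware encoding of the mode field. The function models only trigger detection; the other conditions for sending an interrupt (output enabled, no interrupt outstanding) are deliberately left out.

```python
from enum import Enum


class TrigMode(Enum):
    # Values are the mode names only, not the hardware field encoding.
    DIS = "TRIGGER_DIS"
    EVERY = "TRIGGER_EVERY"
    USER = "TRIGGER_USER"
    USER_COUNT = "TRIGGER_USER_COUNT"
    USER_TIMER = "TRIGGER_USER_TIMER"
    USER_TIMER_COUNT = "TRIGGER_USER_TIMER_COUNT"


def trigger_detected(mode, *, unread, user_trigger, timer_expired, threshold):
    """Return True if any trigger the mode is sensitive to is present."""
    if mode is TrigMode.DIS:
        return False                      # never triggers
    if mode is TrigMode.EVERY:
        return unread > 0                 # any unread Completion entry
    fired = user_trigger                  # all remaining modes honor the user trigger
    if mode in (TrigMode.USER_COUNT, TrigMode.USER_TIMER_COUNT):
        fired = fired or unread > threshold   # "more than" the threshold
    if mode in (TrigMode.USER_TIMER, TrigMode.USER_TIMER_COUNT):
        fired = fired or timer_expired
    return fired


# USER_COUNT ignores the timer and fires only above the threshold.
print(trigger_detected(TrigMode.USER_COUNT, unread=5, user_trigger=False,
                       timer_expired=True, threshold=4))
```

Writing the modes this way makes the structure visible: every mode except TRIGGER_DIS and TRIGGER_EVERY starts from the user trigger and optionally adds the count trigger, the timer trigger, or both.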

When a queue is programmed in TRIGGER_USER_TIMER_COUNT mode, the software can choose to not read all the Completion entries available in the Completion ring as indicated by an interrupt (or a Completion Status write). In such a case, the software can give a Completion CIDX update for the partial read. This works because the QDMA will restart the timer upon reception of the CIDX update and once the timer expires, another interrupt will be generated. This process will repeat until all the Completion entries have been read.
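The repeat-until-drained behavior described above can be sketched with a simple count. The function `interrupts_to_drain` is hypothetical and makes two assumptions: the driver reads at most `batch` entries per interrupt, and no new Completions arrive while the ring is being drained.

```python
def interrupts_to_drain(total_entries, batch):
    """Count the interrupts needed to drain a ring with partial reads.

    Models a timer-based mode: each CIDX update restarts the timer, and
    the timer expiry raises another interrupt while entries remain unread.
    """
    if total_entries <= 0 or batch <= 0:
        raise ValueError("total_entries and batch must be positive")
    cidx = 0
    interrupts = 1                       # the initial interrupt
    while True:
        # Partial read followed by a CIDX update (restarts the timer).
        cidx = min(cidx + batch, total_entries)
        if cidx == total_entries:
            break                        # nothing left unread
        interrupts += 1                  # timer expires with unread entries
    return interrupts


# Ten entries read four at a time.
print(interrupts_to_drain(10, 4))
```

Under these assumptions the count equals the number of read batches, so the software can bound its work per interrupt without stranding entries.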

However, in the TRIGGER_EVERY, TRIGGER_USER, and TRIGGER_USER_COUNT modes, an interrupt is sent, if at all, as a result of a Completion packet being received by the QDMA from the user logic. For every request by the user logic to send an interrupt, the QDMA sends one and only one interrupt. In these modes, if the software does not read all the Completion entries available to be read and the user logic does not send any more Completions requesting interrupts, the QDMA does not generate any more interrupts, and the residual Completions sit in the Completion ring indefinitely. To prevent this, when a queue is in TRIGGER_EVERY, TRIGGER_USER, or TRIGGER_USER_COUNT mode, the software must read all the Completion entries in the Completion ring as indicated by an interrupt (or a Completion Status write).
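A driver-side handler that satisfies this requirement could look like the sketch below. It is illustrative only: `service_interrupt` is a hypothetical function, the ring is modeled as a Python list, `hw_pidx` is assumed to be the producer index the driver learned from the interrupt or Completion Status write, and `cidx == hw_pidx` is assumed to mean that no unread entries remain.

```python
def service_interrupt(ring, cidx, hw_pidx, process):
    """Consume every entry from cidx up to hw_pidx and return the new CIDX.

    The loop stops only when the driver has caught up with the producer
    index, so no residual Completions are left behind in the ring.
    """
    size = len(ring)
    while cidx != hw_pidx:
        process(ring[cidx])
        cidx = (cidx + 1) % size   # advance with ring wrap-around
    return cidx                    # value to send as the CIDX update


# An 8-entry ring where the unread region wraps past the end.
ring = [f"cmpt{i}" for i in range(8)]
seen = []
new_cidx = service_interrupt(ring, cidx=6, hw_pidx=3, process=seen.append)
print(new_cidx, seen)
```

The returned value is what the driver would write back as the Completion ring CIDX update once all entries have been processed.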

The following are the flowcharts of different modes. These flowcharts are from the point of view of the Completion Engine. The Completion packets come in from the user logic and are written to the Completion Ring. The software (SW) update refers to the Completion Ring CIDX update sent from software to hardware.

Figure 1. Flowchart for EVERY Mode
Figure 2. Flowchart for USER Mode
Figure 3. Flowchart for USER_COUNT Mode
Figure 4. Flowchart for USER_TIMER Mode
Figure 5. Flowchart for USER_TIMER_COUNT Mode