Epoll

Onload User Guide (UG1586)

Document ID: UG1586
Release Date: 2023-07-31
Revision: 1.2 English

The epoll functions epoll_create(), epoll_ctl(), epoll_wait(), and epoll_pwait() are accelerated in the same way as poll and select. The environment variable EF_UL_EPOLL enables or disables epoll acceleration. Refer to the release change log for enhancements and changes to epoll behavior.

With Onload, an epoll set can contain both Onload file descriptors and kernel file descriptors. Onload supports the following options for the EF_UL_EPOLL environment variable:
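As an illustration of a mixed epoll set, the following minimal sketch adds an arbitrary list of descriptors to one set and waits for the first to become readable. It uses only the standard Linux epoll calls named above; the function name wait_for_first_readable() is hypothetical and is not part of Onload. The application code is the same whichever EF_UL_EPOLL value is in effect, because the variable is read from the environment when the application starts under Onload.

```c
#include <stdio.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper (not an Onload API): add every descriptor in fds[]
 * to a new epoll set, wait once, and return the first descriptor reported
 * readable. Returns -1 on timeout or error. */
int wait_for_first_readable(const int *fds, int nfds, int timeout_ms)
{
    int epfd = epoll_create(1);   /* size hint is ignored but must be > 0 */
    if (epfd < 0) {
        perror("epoll_create");
        return -1;
    }

    for (int i = 0; i < nfds; i++) {
        struct epoll_event ev = { .events = EPOLLIN, .data.fd = fds[i] };
        if (epoll_ctl(epfd, EPOLL_CTL_ADD, fds[i], &ev) < 0) {
            perror("epoll_ctl");
            close(epfd);
            return -1;
        }
    }

    struct epoll_event ready;
    int n = epoll_wait(epfd, &ready, 1, timeout_ms);
    close(epfd);
    return n == 1 ? ready.data.fd : -1;
}
```

In a real application, fds[] might hold a network socket (a candidate for Onload acceleration) alongside a pipe or eventfd (a kernel descriptor); the sketch treats both the same way.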

Table 1. Options for the EF_UL_EPOLL Variable
Value Epoll Behavior
0

Accelerated epoll is disabled and epoll_ctl(), epoll_wait() and epoll_pwait() function calls are processed in the kernel. Other function calls such as send() and recv() are still accelerated.

Interrupt avoidance does not function and spinning cannot be enabled.

If a socket is handed over to the kernel stack after it has been added to an epoll set, it will be dropped from the epoll set.

onload_ordered_epoll_wait() is not supported.

1

Function calls to epoll_ctl(), epoll_wait(), epoll_pwait() are processed at user level.

Delivers best latency except when the number of accelerated file descriptors in the epoll set is very large. This option also gives the best acceleration of epoll_ctl() calls.

Spinning can be enabled and interrupts are avoided until an application blocks.

CPU overhead and latency increase with the number of file descriptors in the epoll set.

onload_ordered_epoll_wait() is supported.

2

Function calls to epoll_ctl(), epoll_wait(), epoll_pwait() are processed in the kernel.

Delivers best performance for large numbers of accelerated file descriptors.

Spinning can be enabled and interrupts are avoided until an application blocks.

CPU overhead and latency are independent of the number of file descriptors in the epoll set.

onload_ordered_epoll_wait() is not supported.

3

Function calls to epoll_ctl(), epoll_wait(), epoll_pwait() are processed at user level.

Delivers the best latency for epoll_ctl() calls and scales well when the number of accelerated file descriptors in the epoll set is very large, provided that all sockets are in the same stack. The cost of epoll_wait() is independent of the number of accelerated file descriptors in the set and depends only on the number of descriptors that become ready.

The benefits are reduced if sockets exist in different Onload stacks. The following limits also apply:

  • From Onload 201805 onwards, each socket can be in up to four epoll sets at a time, provided that each epoll set is in a different process.
  • Otherwise, each socket can be in at most one epoll set at a time.

In such cases the recommendation is to use EF_UL_EPOLL=2.

EF_UL_EPOLL=3 does not allow monitoring the readiness of the epoll file descriptors from another epoll/poll/select.

EF_UL_EPOLL=3 cannot support epoll sets which exist across fork().

Spinning can be enabled and interrupts are avoided until an application blocks.

onload_ordered_epoll_wait() is supported.
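The restriction on monitoring an epoll file descriptor from another epoll, poll, or select refers to the nesting pattern sketched below, shown here with standard Linux calls only. The function name nested_epoll_ready() is hypothetical. The sketch shows what the pattern looks like in application code so that it can be recognized; it makes no claim about how Onload behaves when the pattern is used with EF_UL_EPOLL=3, other than that the table above says it is not allowed.

```c
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper (not an Onload API): put fd in an inner epoll set,
 * put the inner epoll descriptor in an outer epoll set, then wait on the
 * outer set. Returns 1 if the inner set is reported readable, 0 on
 * timeout, -1 on error. */
int nested_epoll_ready(int fd, int timeout_ms)
{
    struct epoll_event ev = { 0 };
    struct epoll_event ready;
    int result = -1;
    int inner = epoll_create(1);
    int outer = epoll_create(1);

    if (inner >= 0 && outer >= 0) {
        ev.events = EPOLLIN;
        ev.data.fd = fd;
        if (epoll_ctl(inner, EPOLL_CTL_ADD, fd, &ev) == 0) {
            /* This is the nesting step: an epoll fd inside an epoll set. */
            ev.events = EPOLLIN;
            ev.data.fd = inner;
            if (epoll_ctl(outer, EPOLL_CTL_ADD, inner, &ev) == 0) {
                int n = epoll_wait(outer, &ready, 1, timeout_ms);
                if (n >= 0)
                    result = (n == 1 && ready.data.fd == inner) ? 1 : 0;
            }
        }
    }

    if (outer >= 0)
        close(outer);
    if (inner >= 0)
        close(inner);
    return result;
}
```

An application that relies on this nesting, or that keeps an epoll set open across fork(), is a candidate for one of the other EF_UL_EPOLL values.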

The relative performance of epoll options 1 and 2 depends on the details of application behavior as well as the number of accelerated file descriptors in the epoll set. Behavior can also differ between earlier and later kernels and between Linux realtime and non-realtime kernels. Generally the OS allocates short time slices to a user-level CPU-intensive application, which might result in degraded performance (latency spikes). A kernel-level CPU-intensive process is less likely to be de-scheduled, resulting in better performance. It is recommended to evaluate options 1 and 2 for applications that manage many file descriptors, or to try option 3 (onload-201502 and later) when using very large sets where all sockets are in the same stack.
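One way to carry out such an evaluation is to time epoll_wait() directly and repeat the run under each EF_UL_EPOLL value. The sketch below is a minimal timing harness under stated assumptions: the function name mean_epoll_wait_ns() is hypothetical, and the set is populated with pipe read ends purely as stand-ins. Pipes are kernel file descriptors and are not accelerated, so a meaningful comparison would substitute the application's own sockets.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/epoll.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical harness (not an Onload API): mean wall-clock cost in
 * nanoseconds of one non-blocking epoll_wait() call on a set of nfds
 * descriptors, exactly one of which is readable. Returns -1.0 on error. */
double mean_epoll_wait_ns(int nfds, int iterations)
{
    double result = -1.0;
    int created = 0;
    int epfd = -1;
    int (*pipes)[2];
    struct epoll_event ev = { 0 };
    struct epoll_event ready;
    struct timespec t0, t1;

    if (nfds < 1 || iterations < 1)
        return -1.0;
    pipes = malloc((size_t)nfds * sizeof *pipes);
    if (pipes == NULL)
        return -1.0;
    epfd = epoll_create(1);
    if (epfd < 0)
        goto out;

    /* Stand-in descriptors: replace with the application's sockets. */
    while (created < nfds) {
        if (pipe(pipes[created]) < 0)
            goto out;
        created++;
        ev.events = EPOLLIN;
        ev.data.fd = pipes[created - 1][0];
        if (epoll_ctl(epfd, EPOLL_CTL_ADD, ev.data.fd, &ev) < 0)
            goto out;
    }

    /* Make one descriptor readable. Level-triggered epoll keeps reporting
     * it on every call because the byte is never read. */
    if (write(pipes[nfds - 1][1], "x", 1) != 1)
        goto out;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < iterations; i++) {
        if (epoll_wait(epfd, &ready, 1, 0) != 1)
            goto out;
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    result = ((double)(t1.tv_sec - t0.tv_sec) * 1e9 +
              (double)(t1.tv_nsec - t0.tv_nsec)) / iterations;

out:
    for (int i = 0; i < created; i++) {
        close(pipes[i][0]);
        close(pipes[i][1]);
    }
    if (epfd >= 0)
        close(epfd);
    free(pipes);
    return result;
}
```

Calling the function with increasing values of nfds gives a rough picture of how the cost of epoll_wait() scales with set size under the option being tested. A mean hides the latency spikes mentioned above, so a fuller evaluation would also record per-call maxima or percentiles.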

Additional environment variables can be employed to control the epoll_ctl(), epoll_wait() and epoll_pwait() functions and to give priority to accelerated sockets over non-accelerated sockets and other file descriptors. Refer to EF_EPOLL_CTL_FAST, EF_EPOLL_SPIN and EF_EPOLL_MT_SAFE.

Refer also to Known Issues with Epoll.