Known Issues with Epoll

Onload User Guide (UG1586)

Document ID: UG1586
Release Date: 2023-07-31
Revision: 1.2 English

Onload supports different implementations of epoll controlled by the EF_UL_EPOLL environment variable - see Multiplexed I/O for configuration details.

There are various limitations and differences in Onload vs. kernel behavior - refer to Multiplexed I/O for details.

  • When using EF_UL_EPOLL=1 or 3, the behavior of epoll_wait() differs from the kernel when the EPOLLONESHOT event is requested: two ‘wakeups’ are observed, one from the kernel and one from Onload. This behavior occurs on SOCK_DGRAM and SOCK_STREAM sockets for all combinations of EPOLLONESHOT, EPOLLIN and EPOLLOUT events, and applies to all types of accelerated sockets.
  • EF_EPOLL_CTL_FAST is enabled by default and modifies the semantics of epoll: calls to epoll_ctl() are buffered and applied only when epoll_wait() is called. This can break applications that call epoll_wait() in one thread and epoll_ctl() in another thread. The issue only affects EF_UL_EPOLL=2; if it is a problem, set EF_EPOLL_CTL_FAST=0. The described condition does not occur if EF_UL_EPOLL=1 or EF_UL_EPOLL=3.
  • When EF_EPOLL_CTL_FAST is enabled and an application tests the readiness of an epoll file descriptor without actually calling epoll_wait() (for example, epoll within epoll or epoll within select()), and one thread calls select() or epoll_wait() while another thread calls epoll_ctl(), then EF_EPOLL_CTL_FAST should be disabled. This applies when using EF_UL_EPOLL=1, 2 or 3.

    If the application monitors the state of the epoll file descriptor indirectly, for example by monitoring the epoll fd with poll(), then EF_EPOLL_CTL_FAST can cause issues and should be set to zero.

    To force Onload to follow the kernel behavior when using the epoll_wait() call, the following variables should be set:

    EF_UL_EPOLL=2

    EF_EPOLL_CTL_FAST=0

    EF_EPOLL_CTL_HANDOFF=0 (when using EF_UL_EPOLL=1)
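The variables above are read from the process environment, so they need to be in place before the accelerated application starts. As a minimal sketch, a small launcher could set them and then exec the application; the helper name set_kernel_like_epoll_env is hypothetical, and the sketch covers only the EF_UL_EPOLL=2 case (EF_EPOLL_CTL_HANDOFF is left untouched because the text ties it to EF_UL_EPOLL=1).

```c
#include <stdlib.h>

/* Put the settings listed above into the current environment.
 * Intended to run in a launcher before exec'ing the application.
 * Returns 0 on success, -1 if setenv() fails. */
int set_kernel_like_epoll_env(void)
{
    if (setenv("EF_UL_EPOLL", "2", 1) != 0)
        return -1;
    if (setenv("EF_EPOLL_CTL_FAST", "0", 1) != 0)
        return -1;
    return 0;
}
```

Exporting the same variables in the shell before starting the application has the same effect; the code form is only useful where a wrapper program already exists.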

  • A socket should be removed from an epoll set only when all references to the socket are closed.

    With EF_UL_EPOLL=1 (default) or EF_UL_EPOLL=3, a socket is removed from the epoll set if the file descriptor is closed, even if other references to the socket exist. This can cause problems if file descriptors are duplicated using dup(), dup2() or fork(). For example:

    s = socket();
    s2 = dup(s);
    epoll_ctl(epoll_fd, EPOLL_CTL_ADD, s, ...);
    close(s);  /* socket referenced by s is removed from epoll set when using Onload */

    The workaround is to set EF_UL_EPOLL=2.
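For comparison, the kernel behavior that the example above departs from can be observed with a small probe. This is a sketch only: the function name is hypothetical, and it uses an AF_UNIX socket pair for convenience, which shows the kernel semantics but is presumably not an accelerated socket; a TCP or UDP socket would be needed to exercise Onload itself.

```c
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Register a socket with epoll, dup() it, close the registered descriptor,
 * then make the socket readable. Returns what epoll_wait() reports
 * (number of ready events), or -1 if setup fails. */
int events_after_closing_registered_fd(void)
{
    int sv[2];
    int result = -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    int epfd = epoll_create1(0);
    int s2 = dup(sv[0]);                 /* second reference to the same socket */

    if (epfd >= 0 && s2 >= 0) {
        struct epoll_event ev = { .events = EPOLLIN };
        ev.data.fd = sv[0];

        if (epoll_ctl(epfd, EPOLL_CTL_ADD, sv[0], &ev) == 0) {
            close(sv[0]);                /* s2 still references the socket */
            sv[0] = -1;

            if (write(sv[1], "x", 1) == 1) {
                struct epoll_event out;
                result = epoll_wait(epfd, &out, 1, 100);
            }
        }
    }

    if (s2 >= 0) close(s2);
    if (epfd >= 0) close(epfd);
    if (sv[0] >= 0) close(sv[0]);
    close(sv[1]);
    return result;
}
```

Under the kernel the registration survives the close() because s2 keeps the socket open, so the event is still reported. Code that must run under both behaviors can avoid relying on this by issuing EPOLL_CTL_DEL before close() and re-adding the duplicate descriptor explicitly.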
  • When Onload is unable to accelerate a connected socket, for example because no route to the destination exists which uses a Solarflare interface, the socket will be handed off to the kernel and is removed from the epoll set. Because the socket is no longer in the epoll set, attempts to modify the socket with epoll_ctl() will fail with the ENOENT (descriptor not present) error. The described condition does not occur if EF_UL_EPOLL=1 or 3.
  • If an epoll file descriptor is passed to the read() or write() functions, these will return a different error code from that reported by the kernel stack. This issue exists for all implementations of epoll.
  • When EPOLLET is used and the event is ready, epoll_wait() is triggered by ANY event on the socket instead of the requested event. This issue should not affect application correctness.
  • If a server is overclocked, the epoll_wait() timeout value increases as CPU MHz increases, resulting in unexpected timeout values. This has been observed on Intel-based systems when the Onload epoll implementation is EF_UL_EPOLL=1 or 3. The behavior is not observed with EF_UL_EPOLL=2.
  • On a spinning thread, if epoll acceleration is disabled by setting EF_UL_EPOLL=0, sockets on this thread will be handed off to the kernel, but latency will be worse than expected kernel socket latency.
  • To ensure that non-accelerated file descriptors are checked in the poll() and select() functions, disable (set to zero) the EF_SELECT_FAST and EF_POLL_FAST options.
  • When using poll() and select() calls, to ensure that non-accelerated file descriptors are checked when there are no events on any accelerated descriptors, set both EF_POLL_FAST_USEC and EF_SELECT_FAST_USEC to zero.