Stack Contention - Deferred Work

Onload User Guide (UG1586)

Document ID: UG1586
Release Date: 2023-07-31
Revision: 1.2 English

When multiple threads share an Onload stack, a thread can defer its send work to another thread that currently holds the stack lock, which mitigates the effects of lock contention. Contention occurs when one thread calls send() while another thread holds the stack lock. The task of sending the data can be deferred to the lock holder, freeing the deferring thread to continue with other work. However, a send() that also processes a lot of deferred work takes longer to execute, which keeps other threads from getting the stack lock for longer.

A thread that calls send() after the stack's EF_DEFER_WORK_LIMIT has been reached cannot defer further work to the lock holder; it is forced to block and wait for the stack lock. The defer_work_limited counter records the number of these occurrences.
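The choice a sending thread faces can be summarized with a small toy model. This is only a sketch of the behavior described above, not Onload's implementation: the function name, the enum, and the deferred_so_far argument are hypothetical, and how Onload maintains the count it compares against EF_DEFER_WORK_LIMIT is not described in this section.

```python
from enum import Enum


class SendAction(Enum):
    SEND_NOW = "send_now"  # caller obtained the stack lock and sends itself
    DEFER = "defer"        # work is handed to the current lock holder
    BLOCK = "block"        # limit reached: caller waits for the stack lock


def choose_send_action(lock_is_held: bool,
                       deferred_so_far: int,
                       defer_work_limit: int) -> SendAction:
    """Toy model of the send() decision (hypothetical, for illustration)."""
    if not lock_is_held:
        # No contention: the caller can take the lock and send directly.
        return SendAction.SEND_NOW
    if deferred_so_far < defer_work_limit:
        # Contended, but deferral is still allowed.
        return SendAction.DEFER
    # Contended and the limit has been reached. This is the case that
    # the defer_work_limited counter records.
    return SendAction.BLOCK
```

In this model, raising the limit trades fewer blocked senders for longer send() calls on the lock holder, which is the tension described above.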

The onload_stackdump per-socket counters indicate the level of deferred work on each socket within a stack. For example:

TCP 2:10 lcl=172.16.20.123:4112 rmt=172.16.20.88:4112 ESTABLISHED
  snd: limited rwnd=17 cwnd=129 nagle=0 more=0 app=103905
  tx: defer=48799 nomac=0 warm=0 warm_aborted=0
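When many sockets need checking, the name=value pairs in this output can be extracted with a short script. The following is a minimal sketch that assumes the output keeps the name=integer layout shown in the example; the function name parse_socket_counters is hypothetical and not part of Onload.

```python
import re

# Sample text copied from the onload_stackdump example above.
SAMPLE = (
    "TCP 2:10 lcl=172.16.20.123:4112 rmt=172.16.20.88:4112 ESTABLISHED "
    "snd: limited rwnd=17 cwnd=129 nagle=0 more=0 app=103905 "
    "tx: defer=48799 nomac=0 warm=0 warm_aborted=0"
)

# Match name=integer only when the integer is followed by whitespace or
# end of text, so address fields such as lcl=172.16.20.123:4112 are skipped.
_PAIR = re.compile(r"(\w+)=(-?\d+)(?=\s|$)")


def parse_socket_counters(text: str) -> dict:
    """Return every name=integer pair found in the socket dump text."""
    return {name: int(value) for name, value in _PAIR.findall(text)}


counters = parse_socket_counters(SAMPLE)
print(counters["defer"])
```

For the sample above, the value printed is the defer field from the tx: group.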

onload_stackdump per-stack counters also indicate the level of lock contention:

  • deferred_work - the number of packets sent using the deferred mechanism.
  • defer_work_limited - the number of times a sending thread is prevented from deferring to the stack lock holder because the EF_DEFER_WORK_LIMIT has been reached.
  • deferred_polls - the number of polls deferred to the lock holder. A thread is prevented from polling the stack while another thread holds the stack lock, so the poll is deferred to the lock holder, which places any ready received data on the correct socket queues and wakes other threads if there is work to be done.
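These three counters can be pulled out of captured onload_stackdump output and compared. The sketch below assumes each counter appears as "name: value" or "name=value" in the captured text, which should be checked against the actual output. The sample values are invented for illustration, and the limited_fraction heuristic is an illustrative indicator, not an Onload-defined metric (it mixes a packet count with an occurrence count).

```python
import re

COUNTER_NAMES = ("deferred_work", "defer_work_limited", "deferred_polls")


def parse_stack_counters(text: str) -> dict:
    """Extract the contention counters from captured stackdump text.

    Assumes a 'name: value' or 'name=value' layout (an assumption).
    """
    found = {}
    for name in COUNTER_NAMES:
        match = re.search(rf"\b{name}\s*[:=]\s*(\d+)", text)
        if match:
            found[name] = int(match.group(1))
    return found


def limited_fraction(counters: dict) -> float:
    """Rough share of contended sends that hit EF_DEFER_WORK_LIMIT."""
    deferred = counters.get("deferred_work", 0)
    limited = counters.get("defer_work_limited", 0)
    total = deferred + limited
    return limited / total if total else 0.0


# Hypothetical sample values, not taken from a real system.
sample = "deferred_work: 900\ndefer_work_limited: 100\ndeferred_polls: 5\n"
stack_counters = parse_stack_counters(sample)
print(stack_counters, limited_fraction(stack_counters))
```

A fraction that grows over time suggests that senders are increasingly blocking on the stack lock rather than deferring.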

Solutions

To reduce the level of stack lock contention, the following actions are recommended:

  • For affected stacks, reduce the number of threads performing network I/O.
  • Applications with fewer threads can use a stack for each thread - see EF_STACK_PER_THREAD.
  • Bind critical sockets to selected stacks - see Stacks API.
  • For TCP connections, use onload_move_fd() to place sockets accepted from a listening socket into multiple stacks.
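As one illustration of the per-thread stack option, an application can be started under the onload launcher with EF_STACK_PER_THREAD set in its environment. This is a minimal sketch: the helper name onload_launch and the application path ./my_server are hypothetical, and the value 1 is assumed to enable the option.

```python
import os


def onload_launch(app_argv, stack_per_thread=True, base_env=None):
    """Build the argv and environment for running an app under onload.

    Hypothetical helper: it only prepares the command and does not run it.
    """
    env = dict(os.environ if base_env is None else base_env)
    if stack_per_thread:
        # Assumption: "1" enables one stack per thread.
        env["EF_STACK_PER_THREAD"] = "1"
    return ["onload", *app_argv], env


argv, env = onload_launch(["./my_server", "--port", "4112"], base_env={})
print(argv, env)
```

The returned pair can be passed to subprocess.run(argv, env=env, check=True) on a host where Onload is installed.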

For more information see Minimizing Lock Contention.