In the Linux kernel, the following vulnerability has been resolved:
tcp: correct handling of extreme memory squeeze
Testing with iperf3 using the "pasta" protocol splicer has revealed a problem in the way TCP handles window advertising in extreme memory squeeze situations.
Under memory pressure, a socket endpoint may temporarily advertise a zero-sized window, but this is not stored as part of the socket data. The reasoning behind this is that it is considered a temporary setting which shouldn't influence any further calculations.
However, if we happen to stall at an unfortunate value of the current window size, the algorithm selecting a new value will consistently fail to advertise a non-zero window once we have freed up enough memory. This means that this side's notion of the current window size differs from the one last advertised to the peer, so the peer does not send any data that could resolve the situation.
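The mismatch described above can be illustrated with a small C model. This is only a sketch, not kernel code: struct toy_receiver, toy_advertise() and their field names are hypothetical stand-ins for the real socket state, and the model covers nothing beyond the "advertised but not stored" behaviour.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy model; none of these names exist in the kernel. */
struct toy_receiver {
    uint32_t stored_wnd; /* window this side remembers (cf. tp->rcv_wnd) */
    uint32_t peer_view;  /* window the peer last saw on the wire */
};

/*
 * Advertise a window to the peer. Under memory pressure a zero window
 * goes on the wire, but the stored value is left untouched because the
 * zero window is treated as a temporary setting.
 */
static uint32_t toy_advertise(struct toy_receiver *r, uint32_t wanted,
                              bool out_of_memory)
{
    if (out_of_memory) {
        r->peer_view = 0; /* the peer is told "zero" */
        return 0;         /* stored_wnd is deliberately not updated */
    }
    r->stored_wnd = wanted;
    r->peer_view = wanted;
    return wanted;
}
```

After one normal advertisement of 262144 followed by one advertisement under memory pressure, stored_wnd is still 262144 while peer_view is 0. Any later decision based on stored_wnd therefore starts from a window the peer no longer believes in.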
The problem occurs on the iperf3 server side, and the socket in question is a completely regular socket with the default settings for the Fedora 40 kernel. We do not use SO_PEEK or SO_RCVBUF on the socket.
The following excerpt of a logging session, with our own comments added, shows in more detail what is happening:
// tcp_v4_rcv(->)
// tcp_rcv_established(->)
[rcv_nxt 265600160, rcv_wnd 262144, snt_ack 265469200, win_now 131184]
[copied_seq 259909392->260034360 (124968), unread 5565800, qlen 85, ofoq 0]
[OFO queue: gap: 65480, len: 0]
[tp->rcv_wup: 265469200, tp->rcv_wnd: 262144, tp->rcv_nxt 265600160]
[tp->rcv_wup: 265469200, tp->rcv_wnd: 262144, tp->rcv_nxt 265600160]
returning 0
// Receive queue is at 85 buffers and we are out of memory.
// We drop the incoming buffer, although it is in sequence, and decide
// to send an advertisement with a window of zero.
// We don't update tp->rcv_wnd and tp->rcv_wup accordingly, which means
// we unconditionally shrink the window.
[tp->rcv_wup: 265469200, tp->rcv_wnd: 262144, tp->rcv_nxt 265600160]
[rcv_nxt 265600160, rcv_wnd 262144, snt_ack 265469200, win_now 131184]
[copied_seq 260040464->260040464 (0), unread 5559696, qlen 85, ofoq 0]
returning 6104 bytes
// After each read, the algorithm for calculating the new receive
// window in __tcp_cleanup_rbuf() finds it is too small to advertise
// or to update tp->rcv_wnd.
// Meanwhile, the peer thinks the window is zero, and will not send
// any more data to trigger an update from the interrupt mode side.
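The figures in the excerpt can be cross-checked with a little sequence-number arithmetic. The sketch below assumes that win_now is the right edge of the last stored advertisement minus the next expected sequence number, in the style of the kernel's tcp_receive_window() helper, and that unread is rcv_nxt minus copied_seq. The toy_ function names are hypothetical and serve only this illustration.

```c
#include <stdint.h>

/* Illustrative helpers only; the toy_ names are not kernel symbols. */

/*
 * Window this side still believes is open: the right edge of the last
 * stored advertisement (rcv_wup + rcv_wnd) minus the next expected
 * sequence number, clamped at zero.
 */
static uint32_t toy_win_now(uint32_t rcv_wup, uint32_t rcv_wnd,
                            uint32_t rcv_nxt)
{
    int32_t win = (int32_t)(rcv_wup + rcv_wnd - rcv_nxt);

    return win < 0 ? 0 : (uint32_t)win;
}

/* Bytes received in sequence but not yet copied to the application. */
static uint32_t toy_unread(uint32_t rcv_nxt, uint32_t copied_seq)
{
    return rcv_nxt - copied_seq;
}
```

With the logged values, toy_win_now(265469200, 262144, 265600160) yields 131184, which matches win_now in the excerpt even though the peer was just sent a window of zero. Likewise, toy_unread() yields 5565800 before the read and 5559696 after it, a difference of 6104 bytes, matching "returning 6104 bytes".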
---truncated---