In the Linux kernel, the following vulnerability has been resolved:
locking/qrwlock: Fix ordering in queued_write_lock_slowpath()
While this code is executed with the wait_lock held, a reader can acquire the lock without holding wait_lock. The writer side loops checking the value with atomic_cond_read_acquire(), but only truly acquires the lock when the compare-and-exchange completes successfully, and that cmpxchg is relaxed, so it isn't ordered. This exposes the window between the acquire and the cmpxchg to an A-B-A problem, which allows reads following the lock acquisition to observe values speculatively before the write lock is truly acquired.
We've seen a problem in epoll where the reader does an xchg while holding the read lock, but the writer can see a value change out from under it.
  Writer                                | Reader
  --------------------------------------------------------------------------------
  ep_scan_ready_list()                  |
  |- write_lock_irq()                   |
      |- queued_write_lock_slowpath()   |
        |- atomic_cond_read_acquire()   |
                                        | read_lock_irqsave(&ep->lock, flags);
     --> (observes value before unlock) |  chain_epi_lockless()
     |                                  |    epi->next = xchg(&ep->ovflist, epi);
     |                                  | read_unlock_irqrestore(&ep->lock, flags);
     |                                  |
     |     atomic_cmpxchg_relaxed()     |
     |-- READ_ONCE(ep->ovflist);        |
A core can order the read of the ovflist ahead of the atomic_cmpxchg_relaxed(). Switching the cmpxchg to use acquire semantics addresses this issue, at which point the atomic_cond_read can be switched to use relaxed semantics.
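The ordering change described above can be sketched in userspace with C11 atomics. This is only an illustrative model, not the kernel implementation: the struct, the MODEL_* constants and the model_* functions are hypothetical names, the wait_lock queueing is omitted, and the reader side is reduced to a trylock. It assumes the spin becomes a relaxed load and the compare-and-exchange that actually takes the lock carries acquire ordering.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit layout, loosely modelled on a reader/writer count word. */
#define MODEL_W_LOCKED  0x0ffu  /* a writer holds the lock */
#define MODEL_W_WAITING 0x100u  /* a writer is waiting for readers to drain */
#define MODEL_W_MASK    0x1ffu  /* any writer bit */
#define MODEL_R_BIAS    0x200u  /* one reader */

struct model_rwlock {
    atomic_uint cnts;
};

/* Writer slow path: relaxed spin, then acquire on the successful cmpxchg. */
static inline void model_write_lock_slowpath(struct model_rwlock *l)
{
    unsigned int cnts;

    /* Announce that a writer is waiting. */
    atomic_fetch_or_explicit(&l->cnts, MODEL_W_WAITING, memory_order_relaxed);

    do {
        /* Spin with relaxed loads until only the waiting bit remains. */
        do {
            cnts = atomic_load_explicit(&l->cnts, memory_order_relaxed);
        } while (cnts != MODEL_W_WAITING);
        /*
         * The lock is only taken if this exchange succeeds, so the acquire
         * ordering is placed here: loads in the critical section cannot be
         * satisfied before the successful exchange.
         */
    } while (!atomic_compare_exchange_strong_explicit(
                 &l->cnts, &cnts, MODEL_W_LOCKED,
                 memory_order_acquire, memory_order_relaxed));
}

static inline void model_write_unlock(struct model_rwlock *l)
{
    assert(atomic_load_explicit(&l->cnts, memory_order_relaxed) & MODEL_W_LOCKED);
    /* Release so the critical section's writes are visible to the next owner. */
    atomic_fetch_and_explicit(&l->cnts, ~MODEL_W_LOCKED, memory_order_release);
}

/* Simplified reader: succeeds only when no writer bit is set. */
static inline bool model_read_trylock(struct model_rwlock *l)
{
    unsigned int old = atomic_fetch_add_explicit(&l->cnts, MODEL_R_BIAS,
                                                 memory_order_acquire);
    if (!(old & MODEL_W_MASK))
        return true;
    /* Back out the reader count on failure. */
    atomic_fetch_sub_explicit(&l->cnts, MODEL_R_BIAS, memory_order_relaxed);
    return false;
}

static inline void model_read_unlock(struct model_rwlock *l)
{
    atomic_fetch_sub_explicit(&l->cnts, MODEL_R_BIAS, memory_order_release);
}
```

In this model, placing the acquire only on the spinning load and leaving the exchange relaxed would reproduce the window from the timeline: a reader could take and drop the lock between the load and the exchange, and the writer's later loads would not be ordered after the exchange that finally succeeded.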
[peterz: use try_cmpxchg()]