In the Linux kernel, the following vulnerability has been resolved:
fscache: Use wait_on_bit() to wait for the freeing of relinquished volume
The freeing of a relinquished volume wakes up the pending volume acquisition by using wake_up_bit(). However, this is mismatched with the wait_var_event() used in fscache_wait_on_volume_collision(), so it will never wake up the waiter in the wait-queue, because these two functions operate on different wait-queues.
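The mismatch can be illustrated with a small user-space model. This is only a sketch, not kernel code: every name in it (struct waiter, model_wait_register(), model_wake_up_bit(), MODEL_VAR_WAITER) is hypothetical, and it assumes, as a simplification, that a waiter is identified by an (address, bit) pair, with a wait_var_event()-style waiter carrying no bit number.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical marker for a waiter that waits on a variable, not on a bit. */
#define MODEL_VAR_WAITER (-1)
#define MODEL_MAX_WAITERS 8

/* Simplified stand-in for a wait-queue entry and its wake-up key. */
struct waiter {
    const void *addr;   /* address the waiter is keyed on */
    int bit;            /* bit number, or MODEL_VAR_WAITER */
    bool woken;
};

static struct waiter waiters[MODEL_MAX_WAITERS];
static size_t nr_waiters;

/* Register a waiter; returns NULL when the model's table is full. */
static struct waiter *model_wait_register(const void *addr, int bit)
{
    if (nr_waiters == MODEL_MAX_WAITERS)
        return NULL;
    struct waiter *w = &waiters[nr_waiters++];
    w->addr = addr;
    w->bit = bit;
    w->woken = false;
    return w;
}

/*
 * Model of a bit wake-up: only waiters registered for the same address
 * AND the same bit are woken. Returns the number of waiters woken.
 */
static int model_wake_up_bit(const void *addr, int bit)
{
    int woken = 0;
    for (size_t i = 0; i < nr_waiters; i++) {
        struct waiter *w = &waiters[i];
        if (!w->woken && w->addr == addr && w->bit == bit) {
            w->woken = true;
            woken++;
        }
    }
    return woken;
}
```

In this model, a waiter registered with MODEL_VAR_WAITER on the flags word is never matched by model_wake_up_bit() on the same word, while a waiter registered for the bit itself is woken by the same call.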
According to the implementation of fscache_wait_on_volume_collision(), if the wake-up of the pending acquisition is delayed for longer than 20 seconds (e.g., due to a delay in on-demand fd closing), the first wait_var_event_timeout() will time out and the following wait_var_event() will hang forever, as shown below:
 FS-Cache: Potential volume collision new=00000024 old=00000022
 ......
 INFO: task mount:1148 blocked for more than 122 seconds.
       Not tainted 6.1.0-rc6+ #1
 task:mount           state:D stack:0     pid:1148  ppid:1
 Call Trace:
  <TASK>
  __schedule+0x2f6/0xb80
  schedule+0x67/0xe0
  fscache_wait_on_volume_collision.cold+0x80/0x82
  __fscache_acquire_volume+0x40d/0x4e0
  erofs_fscache_register_volume+0x51/0xe0 [erofs]
  erofs_fscache_register_fs+0x19c/0x240 [erofs]
  erofs_fc_fill_super+0x746/0xaf0 [erofs]
  vfs_get_super+0x7d/0x100
  get_tree_nodev+0x16/0x20
  erofs_fc_get_tree+0x20/0x30 [erofs]
  vfs_get_tree+0x24/0xb0
  path_mount+0x2fa/0xa90
  do_mount+0x7c/0xa0
  __x64_sys_mount+0x8b/0xe0
  do_syscall_64+0x30/0x60
  entry_SYSCALL_64_after_hwframe+0x46/0xb0
Since wake_up_bit() is more selective, fix this by using wait_on_bit() instead of wait_var_event() to wait for the freeing of the relinquished volume. In addition, because waitqueue_active() is used in wake_up_bit() and clear_bit() doesn't imply any memory barrier, use clear_and_wake_up_bit() to add the missing memory barrier between the clearing of the bit in cursor->flags and the waitqueue_active() check.
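The ordering that the fix relies on can be sketched in user-space C11 atomics. This is a hedged model, not the kernel implementation: struct model_volume, its fields and all model_* functions are hypothetical. The nr_sleepers counter stands in for waitqueue_active(), and the sequentially consistent fence stands in for the barrier that a plain clear_bit() does not provide.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit number standing in for the "acquire pending" flag. */
#define MODEL_ACQUIRE_PENDING 0

struct model_volume {
    atomic_ulong flags;        /* models the flags word holding the bit */
    atomic_int   nr_sleepers;  /* stands in for waitqueue_active() */
    int          wakeups_sent; /* counts wake-ups issued by the waker */
};

static void model_volume_init(struct model_volume *v, bool pending)
{
    atomic_init(&v->flags, pending ? (1UL << MODEL_ACQUIRE_PENDING) : 0UL);
    atomic_init(&v->nr_sleepers, 0);
    v->wakeups_sent = 0;
}

/*
 * Waiter side: announce the sleeper first, then re-check the bit.
 * Both operations are seq_cst, so the increment is ordered before the load.
 * Returns true if the caller still has to sleep.
 */
static bool model_prepare_to_wait(struct model_volume *v, int bit)
{
    atomic_fetch_add(&v->nr_sleepers, 1);
    if (atomic_load(&v->flags) & (1UL << bit))
        return true;
    atomic_fetch_sub(&v->nr_sleepers, 1);
    return false;
}

/*
 * Waker side: clear the bit, issue a full fence, and only then look for
 * sleepers. Without the fence, the sleeper check could be ordered before
 * the clear, and a waiter that had just seen the bit set could be missed.
 */
static void model_clear_and_wake_up_bit(struct model_volume *v, int bit)
{
    atomic_fetch_and_explicit(&v->flags, ~(1UL << bit), memory_order_release);
    atomic_thread_fence(memory_order_seq_cst);
    if (atomic_load_explicit(&v->nr_sleepers, memory_order_relaxed) > 0) {
        v->wakeups_sent++;
        atomic_store(&v->nr_sleepers, 0);
    }
}
```

In this model, a waiter that registered while the bit was set is always seen by the waker after the bit is cleared, and a waiter that arrives after the clear observes the cleared bit and does not sleep.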