In the Linux kernel, the following vulnerability has been resolved:

IB/ipoib: Fix mcast list locking

Releasing the priv->lock while iterating the priv->multicast_list in ipoib_mcast_join_task() opens a window for ipoib_mcast_dev_flush()
to remove the items while in the middle of iteration. If the mcast is removed while the lock was dropped, the for loop spins forever, resulting in a hard lockup (as was reported on the RHEL 4.18.0-372.75.1.el8_6 kernel):

    Task A (kworker/u72:2 below)       | Task B (kworker/u72:0 below)
    -----------------------------------+-----------------------------------
    ipoib_mcast_join_task(work)        | ipoib_ib_dev_flush_light(work)
      spin_lock_irq(&priv->lock)       | __ipoib_ib_dev_flush(priv, ...)
      list_for_each_entry(mcast,       | ipoib_mcast_dev_flush(dev = priv->dev)
          &priv->multicast_list, list) |
        ipoib_mcast_join(dev, mcast)   |
          spin_unlock_irq(&priv->lock) |
                                       |   spin_lock_irqsave(&priv->lock, flags)
                                       |   list_for_each_entry_safe(mcast, tmcast,
                                       |                  &priv->multicast_list, list)
                                       |     list_del(&mcast->list);
                                       |     list_add_tail(&mcast->list, &remove_list)
                                       |   spin_unlock_irqrestore(&priv->lock, flags)
          spin_lock_irq(&priv->lock)   |
                                       |   ipoib_mcast_remove_list(&remove_list)
    (Here, mcast is no longer on the   |     list_for_each_entry_safe(mcast, tmcast,
    priv->multicast_list and we keep   |                            remove_list, list)
    spinning on the remove_list of     |  >>>  wait_for_completion(&mcast->done)
    the other thread which is blocked  |
    and the list is still valid on     |
    its stack.)

Fix this by keeping the lock held and changing to GFP_ATOMIC to prevent eventual sleeps.
Unfortunately, we could not reproduce the lockup and confirm this fix, but based on the code review I think this fix should address such lockups.

    crash> bc 31
    PID: 747      TASK: ff1c6a1a007e8000  CPU: 31   COMMAND: "kworker/u72:2"
    --
        [exception RIP: ipoib_mcast_join_task+0x1b1]
        RIP: ffffffffc0944ac1  RSP: ff646f199a8c7e00  RFLAGS: 00000002
        RAX: 0000000000000000  RBX: ff1c6a1a04dc82f8  RCX: 0000000000000000
                                      work (&priv->mcast_task{,.work})
        RDX: ff1c6a192d60ac68  RSI: 0000000000000286  RDI: ff1c6a1a04dc8000
               &mcast->list
        RBP: ff646f199a8c7e90   R8: ff1c699980019420   R9: ff1c6a1920c9a000
        R10: ff646f199a8c7e00  R11: ff1c6a191a7d9800  R12: ff1c6a192d60ac00
                                                             mcast
        R13: ff1c6a1d82200000  R14: ff1c6a1a04dc8000  R15: ff1c6a1a04dc82d8
               dev                    priv (&priv->lock)     &priv->multicast_list (aka head)
        ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
    --- <NMI exception stack> ---
     #5 [ff646f199a8c7e00] ipoib_mcast_join_task+0x1b1 at ffffffffc0944ac1 [ib_ipoib]
     #6 [ff646f199a8c7e98] process_one_work+0x1a7 at ffffffff9bf10967

    crash> rx ff646f199a8c7e68
    ff646f199a8c7e68:  ff1c6a1a04dc82f8 <<< work = &priv->mcast_task.work

    crash> list -hO ipoib_dev_priv.multicast_list ff1c6a1a04dc8000
    (empty)

    crash> ipoib_dev_priv.mcast_task.work.func,mcast_mutex.owner.counter ff1c6a1a04dc8000
      mcast_task.work.func = 0xffffffffc0944910 <ipoib_mcast_join_task>,
      mcast_mutex.owner.counter = 0xff1c69998efec000

    crash> b 8
    PID: 8        TASK: ff1c69998efec000  CPU: 33   COMMAND: "kworker/u72:0"
    --
     #3 [ff646f1980153d50] wait_for_completion+0x96 at ffffffff9c7d7646
     #4 [ff646f1980153d90] ipoib_mcast_remove_list+0x56 at ffffffffc0944dc6 [ib_ipoib]
     #5 [ff646f1980153de8] ipoib_mcast_dev_flush+0x1a7 at ffffffffc09455a7 [ib_ipoib]
     #6 [ff646f1980153e58] __ipoib_ib_dev_flush+0x1a4 at ffffffffc09431a4 [ib_ipoib]
     #7 [ff ---truncated---
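
The reason the loop in Task A never terminates can be imitated in user space with a single thread. The sketch below is not the kernel code or the actual patch: it uses a minimal circular list modeled on the kernel's struct list_head, and the names struct mcast, flush_all(), walk() and demo() are hypothetical, introduced only for illustration. The "lock dropped" window is simulated by calling flush_all() in the middle of the walk, and a step budget replaces the real hard lockup so that the example returns.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modeled on the kernel's struct list_head. */
struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h; h->prev = h; }

static void list_add_tail(struct list_head *n, struct list_head *h)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

static void list_del(struct list_head *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
}

/* Hypothetical stand-in for struct ipoib_mcast: only the list linkage matters. */
struct mcast { struct list_head list; };

/* Imitates the flush side: move every entry from head to remove_list. */
static void flush_all(struct list_head *head, struct list_head *remove_list)
{
    while (head->next != head) {
        struct list_head *n = head->next;
        list_del(n);
        list_add_tail(n, remove_list);
    }
}

/*
 * Walk the list the way list_for_each_entry() does: stop when the cursor
 * comes back to head. If flush_in_window is nonzero, the flush runs while
 * the first entry is the cursor, as if the lock had been dropped there.
 * Returns the number of entries visited, or -1 if budget steps pass
 * without the cursor ever reaching head (the "spins forever" case).
 */
static int walk(struct list_head *head, struct list_head *remove_list,
                int flush_in_window, int budget)
{
    struct list_head *pos;
    int visited = 0;

    assert(budget > 0);
    for (pos = head->next; pos != head; pos = pos->next) {
        if (visited == budget)
            return -1;               /* cursor is circling another list */
        visited++;
        if (flush_in_window && visited == 1)
            flush_all(head, remove_list);
    }
    return visited;
}

/* Build a three-entry list and walk it, with or without the simulated flush. */
int demo(int flush_in_window)
{
    struct list_head multicast_list, remove_list;
    struct mcast m[3];
    int i;

    list_init(&multicast_list);
    list_init(&remove_list);
    for (i = 0; i < 3; i++)
        list_add_tail(&m[i].list, &multicast_list);
    return walk(&multicast_list, &remove_list, flush_in_window, 100);
}
```

Under these assumptions, demo(0) visits all three entries and returns 3, while demo(1) returns -1: once the entries have been moved, the cursor circles remove_list, whose head is a different address, so the termination test against the original head never succeeds. Keeping the lock held for the whole walk, as the fix does, removes the window in which the flush can run.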