In the Linux kernel, the following vulnerability has been resolved:
bpf, sockmap: Fix race between element replace and close()
Element replace (with a socket different from the one stored) may race with socket's close() link popping & unlinking. __sock_map_delete() unconditionally unrefs the (wrong) element:
// set map[0] = s0
map_update_elem(map, 0, s0)

// drop fd of s0
close(s0)
  sock_map_close()
    lock_sock(sk)               (s0!)
    sock_map_remove_links(sk)
      link = sk_psock_link_pop()
      sock_map_unlink(sk, link)
        sock_map_delete_from_link
                                        // replace map[0] with s1
                                        map_update_elem(map, 0, s1)
                                          sock_map_update_elem
                                (s1!)       lock_sock(sk)
                                            sock_map_update_common
                                              psock = sk_psock(sk)
                                              spin_lock(&stab->lock)
                                              osk = stab->sks[idx]
                                              sock_map_add_link(..., &stab->sks[idx])
                                              sock_map_unref(osk, &stab->sks[idx])
                                                psock = sk_psock(osk)
                                                sk_psock_put(sk, psock)
                                                  if (refcount_dec_and_test(&psock))
                                                    sk_psock_drop(sk, psock)
                                              spin_unlock(&stab->lock)
                                            unlock_sock(sk)
          __sock_map_delete
            spin_lock(&stab->lock)
            sk = *psk                        // s1 replaced s0; sk == s1
            if (!sk_test || sk_test == sk)   // sk_test (s0) != sk (s1); no branch
              sk = xchg(psk, NULL)
            if (sk)
              sock_map_unref(sk, psk)        // unref s1; sks[idx] will dangle
                psock = sk_psock(sk)
                sk_psock_put(sk, psock)
                  if (refcount_dec_and_test())
                    sk_psock_drop(sk, psock)
            spin_unlock(&stab->lock)
    release_sock(sk)
Then close(map) enqueues bpf_map_free_deferred, which finally calls sock_map_free(). This results in some refcount_t warnings along with a KASAN splat [1].
Fix __sock_map_delete(): do not allow sock_map_unref() on elements that may have been replaced.
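The check described above can be modelled in user space. The following is a minimal single-threaded sketch, not the kernel code: struct obj, obj_put(), slot_replace(), slot_delete_buggy() and slot_delete_fixed() are hypothetical names, the spinlock and xchg() are omitted, and the -1 return value is an arbitrary stand-in for an error code. It only illustrates the difference between reading the slot before the comparison and taking the entry only when the comparison succeeds.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted socket; not a kernel type. */
struct obj {
	int refs;
};

/* Drop one reference; the assert loosely models a refcount underflow check. */
static void obj_put(struct obj *o)
{
	assert(o->refs > 0);
	o->refs--;
}

/* Store new_obj in the slot and drop the map's reference on the old entry. */
static void slot_replace(struct obj **slot, struct obj *new_obj)
{
	struct obj *old = *slot;

	new_obj->refs++;	/* the map takes a reference on the new entry */
	*slot = new_obj;
	if (old)
		obj_put(old);
}

/*
 * Pre-fix shape: cur is loaded before the comparison, so on a mismatch the
 * slot is left alone but cur is still non-NULL and gets unreferenced.
 */
static int slot_delete_buggy(struct obj **slot, struct obj *expected)
{
	struct obj *cur = *slot;

	if (!expected || expected == cur) {
		cur = *slot;
		*slot = NULL;
	}
	if (cur) {
		obj_put(cur);
		return 0;
	}
	return -1;
}

/*
 * Fixed shape: cur starts out NULL and is only set when the slot still
 * holds the expected entry, so a replaced entry is never unreferenced.
 */
static int slot_delete_fixed(struct obj **slot, struct obj *expected)
{
	struct obj *cur = NULL;

	if (!expected || expected == *slot) {
		cur = *slot;
		*slot = NULL;
	}
	if (cur) {
		obj_put(cur);
		return 0;
	}
	return -1;
}
```

In this model, if the slot is first set to s0, then replaced with s1, and a late delete arrives that still expects s0, the buggy variant drops s1's map reference while the slot keeps pointing at s1. The fixed variant returns an error and leaves both the slot and s1's reference count untouched.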
[1]:
Write of size 4 at addr ffff88811f5b9100 by task kworker/u64:12/1063
CPU: 14 UID: 0 PID: 1063 Comm: kworker/u64:12 Not tainted 6.12.0+ #125
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
Workqueue: events_unbound bpf_map_free_deferred
Call Trace:
 <TASK>
 dump_stack_lvl+0x68/0x90
 print_report+0x174/0x4f6
 kasan_report+0xb9/0x190
 kasan_check_range+0x10f/0x1e0
 sock_map_free+0x10e/0x330
 bpf_map_free_deferred+0x173/0x320
 process_one_work+0x846/0x1420
 worker_thread+0x5b3/0xf80
 kthread+0x29e/0x360
 ret_from_fork+0x2d/0x70
 ret_from_fork_asm+0x1a/0x30
 </TASK>

Allocated by task 1202:
 kasan_save_stack+0x1e/0x40
 kasan_save_track+0x10/0x30
 __kasan_slab_alloc+0x85/0x90
 kmem_cache_alloc_noprof+0x131/0x450
 sk_prot_alloc+0x5b/0x220
 sk_alloc+0x2c/0x870
 unix_create1+0x88/0x8a0
 unix_create+0xc5/0x180
 __sock_create+0x241/0x650
 __sys_socketpair+0x1ce/0x420
 __x64_sys_socketpair+0x92/0x100
 do_syscall_64+0x93/0x180
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Freed by task 46:
 kasan_save_stack+0x1e/0x40
 kasan_save_track+0x10/0x30
 kasan_save_free_info+0x37/0x60
 __kasan_slab_free+0x4b/0x70
 kmem_cache_free+0x1a1/0x590
 __sk_destruct+0x388/0x5a0
 sk_psock_destroy+0x73e/0xa50
 process_one_work+0x846/0x1420
 worker_thread+0x5b3/0xf80
 kthread+0x29e/0x360
 ret_from_fork+0x2d/0x70
 ret_from_fork_asm+0x1a/0x30
The bu ---truncated---