In the Linux kernel, the following vulnerability has been resolved:
ceph: fix i_nlink underrun during async unlink
During async unlink, we drop the i_nlink counter before we receive
the completion (that will eventually update the i_nlink) because "we
assume that the unlink will succeed". That is not a bad idea, but it
races against deletions by other clients (or against the completion of
our own unlink) and can lead to an underrun which emits a WARNING like
this one:
 WARNING: CPU: 85 PID: 25093 at fs/inode.c:407 drop_nlink+0x50/0x68
 Modules linked in:
 CPU: 85 UID: 3221252029 PID: 25093 Comm: php-cgi8.1 Not tainted 6.14.11-cm4all1-ampere #655
 Hardware name: Supermicro ARS-110M-NR/R12SPD-A, BIOS 1.1b 10/17/2023
 pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
 pc : drop_nlink+0x50/0x68
 lr : ceph_unlink+0x6c4/0x720
 sp : ffff80012173bc90
 x29: ffff80012173bc90 x28: ffff086d0a45aaf8 x27: ffff0871d0eb5680
 x26: ffff087f2a64a718 x25: 0000020000000180 x24: 0000000061c88647
 x23: 0000000000000002 x22: ffff07ff9236d800 x21: 0000000000001203
 x20: ffff07ff9237b000 x19: ffff088b8296afc0 x18: 00000000f3c93365
 x17: 0000000000070000 x16: ffff08faffcbdfe8 x15: ffff08faffcbdfec
 x14: 0000000000000000 x13: 45445f65645f3037 x12: 34385f6369706f74
 x11: 0000a2653104bb20 x10: ffffd85f26d73290 x9 : ffffd85f25664f94
 x8 : 00000000000000c0 x7 : 0000000000000000 x6 : 0000000000000002
 x5 : 0000000000000081 x4 : 0000000000000481 x3 : 0000000000000000
 x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff08727d3f91e8
 Call trace:
  drop_nlink+0x50/0x68 (P)
  vfs_unlink+0xb0/0x2e8
  do_unlinkat+0x204/0x288
  __arm64_sys_unlinkat+0x3c/0x80
  invoke_syscall.constprop.0+0x54/0xe8
  do_el0_svc+0xa4/0xc8
  el0_svc+0x18/0x58
  el0t_64_sync_handler+0x104/0x130
  el0t_64_sync+0x154/0x158
In ceph_unlink(), a call to ceph_mdsc_submit_request() submits the
CEPH_MDS_OP_UNLINK to the MDS, but does not wait for completion.
Meanwhile, between this call and the following drop_nlink() call, a
worker thread may process a CEPH_CAP_OP_IMPORT, CEPH_CAP_OP_GRANT or
just a CEPH_MSG_CLIENT_REPLY (the latter of which could be our own
completion). These will lead to a set_nlink() call, updating the
i_nlink counter to the value received from the MDS. If that new
i_nlink value happens to be zero, it is illegal to decrement it
further. But that is exactly what ceph_unlink() will do then.
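The ordering described above can be replayed deterministically in a small userspace sketch. This is not kernel code: `toy_inode`, `toy_set_nlink()`, `toy_drop_nlink()` and `toy_replay_race()` are hypothetical names invented for illustration, and the worker thread is modelled as an ordinary function call placed between the two steps of the unlink path.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for an inode; only the link count is modelled. */
struct toy_inode {
    unsigned int nlink; /* plays the role of i_nlink */
    bool warned;        /* set where the kernel would emit the WARNING */
};

/* Models set_nlink(): adopt the link count reported by the MDS. */
static void toy_set_nlink(struct toy_inode *inode, unsigned int nlink)
{
    inode->nlink = nlink;
}

/* Models drop_nlink(): complains about a zero count, then decrements anyway. */
static void toy_drop_nlink(struct toy_inode *inode)
{
    if (inode->nlink == 0)
        inode->warned = true;
    inode->nlink--; /* unsigned arithmetic wraps around on underrun */
}

/* Replays the problematic ordering and returns the resulting inode state. */
static struct toy_inode toy_replay_race(void)
{
    struct toy_inode inode = { .nlink = 1, .warned = false };

    /* Step 1: the unlink request is submitted; nobody waits for the reply. */

    /* Step 2: a worker handles a reply or cap message carrying nlink == 0. */
    toy_set_nlink(&inode, 0);
    assert(inode.nlink == 0); /* the count is already zero before step 3 */

    /* Step 3: the unlink path decrements unconditionally. */
    toy_drop_nlink(&inode);

    return inode;
}
```

In this model, step 3 both sets `warned` and leaves `nlink` at `UINT_MAX`, which is the underrun the WARNING reports. Swapping steps 2 and 3 gives the harmless ordering in which the decrement happens first and the MDS value is applied afterwards.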
The WARNING can be reproduced this way:
1. Force async unlink; only the async code path is affected. Having
   no real clue about Ceph internals, I was unable to find out why the
   MDS wouldn't give me the "Fxr" capabilities, so I patched
   get_caps_for_async_unlink() to always succeed.
   (Note that the WARNING dump above was found on an unpatched kernel,
   without this kludge - this is not a theoretical bug.)
2. Add a sleep call after ceph_mdsc_submit_request() so the unlink
   completion gets handled by a worker thread before drop_nlink() is
   called. This guarantees that the i_nlink is already zero before
   drop_nlink() runs.
The solution is to skip the counter decrement when it is already zero,
but doing so without a lock is still racy (TOCTOU). Since
ceph_fill_inode() and handle_cap_grant() both hold the
ceph_inode_info.i_ceph_lock spinlock while set_nlink() runs, this
seems like the proper lock to protect the i_nlink updates.
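A minimal userspace sketch of that idea follows. It is an illustration under stated assumptions, not the actual patch: a pthread mutex stands in for the ceph_inode_info.i_ceph_lock spinlock, and `toy_ceph_inode`, `toy_update_from_mds()`, `toy_drop_nlink_locked()` and the two scenario helpers are hypothetical names. The point is that the zero check and the decrement happen under the same lock that the update path holds, so no update can slip in between them.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-inode Ceph state. */
struct toy_ceph_inode {
    pthread_mutex_t lock; /* plays the role of i_ceph_lock */
    unsigned int nlink;   /* plays the role of i_nlink */
};

/* Update path: adopt the MDS-provided link count while holding the lock. */
static void toy_update_from_mds(struct toy_ceph_inode *ci, unsigned int nlink)
{
    pthread_mutex_lock(&ci->lock);
    ci->nlink = nlink;
    pthread_mutex_unlock(&ci->lock);
}

/*
 * Unlink path: check and decrement as one step under the lock.
 * Returns true if the count was decremented, false if it was already zero.
 */
static bool toy_drop_nlink_locked(struct toy_ceph_inode *ci)
{
    bool dropped = false;

    pthread_mutex_lock(&ci->lock);
    if (ci->nlink > 0) {
        ci->nlink--;
        dropped = true;
    }
    pthread_mutex_unlock(&ci->lock);
    return dropped;
}

/* Scenario A: the MDS update (nlink == 0) lands before the local decrement. */
static unsigned int toy_unlink_after_reply(void)
{
    struct toy_ceph_inode ci = { .nlink = 1 };
    int rc = pthread_mutex_init(&ci.lock, NULL);

    assert(rc == 0);
    toy_update_from_mds(&ci, 0);
    toy_drop_nlink_locked(&ci); /* skipped: the count is already zero */
    pthread_mutex_destroy(&ci.lock);
    return ci.nlink;
}

/* Scenario B: the local decrement happens first, then the MDS update lands. */
static unsigned int toy_unlink_before_reply(void)
{
    struct toy_ceph_inode ci = { .nlink = 1 };
    int rc = pthread_mutex_init(&ci.lock, NULL);

    assert(rc == 0);
    toy_drop_nlink_locked(&ci); /* 1 -> 0 */
    toy_update_from_mds(&ci, 0);
    pthread_mutex_destroy(&ci.lock);
    return ci.nlink;
}
```

Both orderings end with a link count of zero and no underrun. Without the lock, a check followed by a separate decrement would still leave a window for the update path to zero the counter in between, which is the TOCTOU problem mentioned above.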
I found prior art in NFS and SMB (using inode.i_lock) and AFS (using
afs_vnode.cb_lock). All three have the zero check as well.
{
"cna_assigner": "Linux",
"osv_generated_from": "https://github.com/CVEProject/cvelistV5/tree/main/cves/2026/43xxx/CVE-2026-43420.json"
}