Import Source
https://github.com/microsoft/AzureLinuxVulnerabilityData/blob/main/osv/AZL-51980.json
JSON Data
https://api.test.osv.dev/v1/vulns/AZL-51980
Upstream
Published
2024-08-17T10:15:09Z
Modified
2026-04-01T05:16:27.874620Z
Severity
  • 5.5 (Medium) CVSS_V3 - CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H
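As a worked check of the score above, the following is a minimal sketch of the CVSS v3.1 base-score arithmetic for Scope:Unchanged vectors only. The metric weights and the rounding rule are taken from the public CVSS v3.1 specification as best understood here; the function names `roundup` and `base_score` are illustrative and not part of any OSV tooling.

```python
import math

# Metric weights from the CVSS v3.1 specification (Scope:Unchanged case).
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}


def roundup(x):
    """Round up to one decimal place, as defined in CVSS v3.1."""
    i = round(x * 100000)
    if i % 10000 == 0:
        return i / 100000.0
    return (math.floor(i / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the base score of a 'CVSS:3.1/...' vector (Scope:Unchanged only)."""
    metrics = dict(part.split(":") for part in vector.split("/")[1:])
    if metrics["S"] != "U":
        raise ValueError("this sketch handles Scope:Unchanged only")
    iss = 1 - (1 - CIA[metrics["C"]]) * (1 - CIA[metrics["I"]]) * (1 - CIA[metrics["A"]])
    impact = 6.42 * iss
    exploitability = (8.22 * AV[metrics["AV"]] * AC[metrics["AC"]]
                      * PR_UNCHANGED[metrics["PR"]] * UI[metrics["UI"]])
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


score = base_score("CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H")
```

For this vector the impact term is about 3.60 and the exploitability term about 1.83, which rounds up to the 5.5 listed above.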
Summary
CVE-2024-43834 affecting package kernel for versions less than 5.15.176.3-1
Details

In the Linux kernel, the following vulnerability has been resolved:

xdp: fix invalid wait context of page_pool_destroy()

If the driver uses a page pool, it creates the page pool with page_pool_create(). The reference count of a page pool is 1 by default. A page pool is destroyed only when its reference count reaches 0. page_pool_destroy() is used to destroy a page pool; it decreases the reference count. When a page pool is destroyed, ->disconnect() is called, which is mem_allocator_disconnect(). This function internally acquires mutex_lock().

If the driver uses XDP, it registers a memory model with xdp_rxq_info_reg_mem_model(). xdp_rxq_info_reg_mem_model() internally increases the page pool reference count if the memory model is a page pool. Now the reference count is 2.

To destroy a page pool, the driver should call both page_pool_destroy() and xdp_unreg_mem_model(). xdp_unreg_mem_model() internally calls page_pool_destroy(). Only page_pool_destroy() decreases the reference count.
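The reference counting described above can be sketched with a toy user-space model. This is not kernel code: the `PagePool` class and the helper names are hypothetical stand-ins that only track the counter, assuming the sequence create, register memory model, destroy, unregister.

```python
class PagePool:
    """Toy stand-in for a page pool; it tracks only the reference count."""

    def __init__(self):
        self.refcnt = 1            # models page_pool_create(): count starts at 1
        self.disconnected = False

    def get(self):
        self.refcnt += 1

    def destroy(self):
        # Models page_pool_destroy(): drop one reference, tear down at zero.
        self.refcnt -= 1
        if self.refcnt == 0:
            self.disconnected = True   # ->disconnect() would run at this point
        return self.refcnt


def reg_mem_model(pool):
    # Models xdp_rxq_info_reg_mem_model() taking an extra reference.
    pool.get()


def unreg_mem_model(pool):
    # Models xdp_unreg_mem_model(), which calls page_pool_destroy() internally.
    return pool.destroy()


def lifecycle():
    pool = PagePool()
    counts = [pool.refcnt]             # 1 after creation
    reg_mem_model(pool)
    counts.append(pool.refcnt)         # 2 after registering the memory model
    counts.append(pool.destroy())      # driver's own page_pool_destroy()
    counts.append(unreg_mem_model(pool))
    return counts, pool.disconnected


counts, disconnected = lifecycle()
```

In this model the second drop, the one made through the unregister path, is the call that reaches zero and triggers the disconnect step.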

If a driver calls page_pool_destroy() and then xdp_unreg_mem_model(), we will face an invalid wait context warning. This is because xdp_unreg_mem_model() calls page_pool_destroy() while holding rcu_read_lock(), and page_pool_destroy() internally acquires mutex_lock().

Splat looks like:

[ BUG: Invalid wait context ]

6.10.0-rc6+ #4 Tainted: G W

ethtool/1806 is trying to lock:
ffffffff90387b90 (mem_id_lock){+.+.}-{4:4}, at: mem_allocator_disconnect+0x73/0x150
other info that might help us debug this:
context-{5:5}
3 locks held by ethtool/1806:
stack backtrace:
CPU: 0 PID: 1806 Comm: ethtool Tainted: G W 6.10.0-rc6+ #4 f916f41f172891c800f2fed
Hardware name: ASUS System Product Name/PRIME Z690-P D4, BIOS 0603 11/01/2021
Call Trace:
<TASK>
dump_stack_lvl+0x7e/0xc0
__lock_acquire+0x1681/0x4de0
? _printk+0x64/0xe0
? __pfx_mark_lock.part.0+0x10/0x10
? __pfx___lock_acquire+0x10/0x10
lock_acquire+0x1b3/0x580
? mem_allocator_disconnect+0x73/0x150
? __wake_up_klogd.part.0+0x16/0xc0
? __pfx_lock_acquire+0x10/0x10
? dump_stack_lvl+0x91/0xc0
__mutex_lock+0x15c/0x1690
? mem_allocator_disconnect+0x73/0x150
? __pfx_prb_read_valid+0x10/0x10
? mem_allocator_disconnect+0x73/0x150
? __pfx_llist_add_batch+0x10/0x10
? console_unlock+0x193/0x1b0
? lockdep_hardirqs_on+0xbe/0x140
? __pfx___mutex_lock+0x10/0x10
? tick_nohz_tick_stopped+0x16/0x90
? __irq_work_queue_local+0x1e5/0x330
? irq_work_queue+0x39/0x50
? __wake_up_klogd.part.0+0x79/0xc0
? mem_allocator_disconnect+0x73/0x150
mem_allocator_disconnect+0x73/0x150
? __pfx_mem_allocator_disconnect+0x10/0x10
? mark_held_locks+0xa5/0xf0
? rcu_is_watching+0x11/0xb0
page_pool_release+0x36e/0x6d0
page_pool_destroy+0xd7/0x440
xdp_unreg_mem_model+0x1a7/0x2a0
? __pfx_xdp_unreg_mem_model+0x10/0x10
? kfree+0x125/0x370
? bnxt_free_ring.isra.0+0x2eb/0x500
? bnxt_free_mem+0x5ac/0x2500
xdp_rxq_info_unreg+0x4a/0xd0
bnxt_free_mem+0x1356/0x2500
bnxt_close_nic+0xf0/0x3b0
? __pfx_bnxt_close_nic+0x10/0x10
? ethnl_parse_bit+0x2c6/0x6d0
? __pfx___nla_validate_parse+0x10/0x10
? __pfx_ethnl_parse_bit+0x10/0x10
bnxt_set_features+0x2a8/0x3e0
__netdev_update_features+0x4dc/0x1370
? ethnl_parse_bitset+0x4ff/0x750
? __pfx_ethnl_parse_bitset+0x10/0x10
? __pfx___netdev_update_features+0x10/0x10
? mark_held_locks+0xa5/0xf0
? _raw_spin_unlock_irqrestore+0x42/0x70
? __pm_runtime_resume+0x7d/0x110
ethnl_set_features+0x32d/0xa20

To fix this problem, the patch uses rhashtable_lookup_fast() instead of rhashtable_lookup() with rcu_read_lock(). Using xa without rcu_read_lock() here is safe. xa is freed by __xdp_mem_allocator_rcu_free(), which is called by call_rcu() in mem_xa_remove(). mem_xa_remove() is called by page_pool_destroy() if the reference count reaches 0. The xa is already well protected by the reference count mechanism in the control plane, so removing rcu_read_lock() around page_pool_destroy() is safe.
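The difference between the two orderings can be illustrated with a small user-space model. It is only a sketch: `ToyContext`, `InvalidWaitContext`, and both `unreg_*` helpers are hypothetical names, and the check merely imitates the rule that a sleeping lock must not be taken inside an RCU read-side section. It assumes the call being modeled is the one that drops the last reference.

```python
from contextlib import contextmanager


class InvalidWaitContext(RuntimeError):
    """Raised by the toy model when a sleeping lock is taken under RCU."""


class ToyContext:
    def __init__(self):
        self.rcu_depth = 0      # nesting depth of RCU read-side sections
        self.disconnects = 0    # how many times teardown completed

    @contextmanager
    def rcu_read_lock(self):
        self.rcu_depth += 1
        try:
            yield
        finally:
            self.rcu_depth -= 1

    def page_pool_destroy(self):
        # The final teardown takes a mutex, which may sleep; that is not
        # allowed while an RCU read-side section is open.
        if self.rcu_depth > 0:
            raise InvalidWaitContext("mutex taken inside RCU read-side section")
        self.disconnects += 1


def unreg_before_fix(ctx, table, mem_id):
    # Lookup and destroy both happen inside the read-side section.
    with ctx.rcu_read_lock():
        _xa = table[mem_id]
        ctx.page_pool_destroy()


def unreg_after_fix(ctx, table, mem_id):
    # Analogous to rhashtable_lookup_fast(): RCU is held for the lookup only.
    with ctx.rcu_read_lock():
        _xa = table[mem_id]
    ctx.page_pool_destroy()


ctx = ToyContext()
table = {1: "xa"}
try:
    unreg_before_fix(ctx, table, 1)
    raised = False
except InvalidWaitContext:
    raised = True
unreg_after_fix(ctx, table, 1)
```

The model leaves out the reference count entirely; in the real code it is the reference count, not RCU, that keeps xa alive across the destroy call.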

References

Affected packages

Azure Linux:2 / kernel

Package

Name
kernel
Purl
pkg:rpm/azure-linux/kernel

Affected ranges

Type
ECOSYSTEM
Events
Introduced
0 (Unknown introduced version / All previous versions are affected)
Fixed
5.15.176.3-1
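A rough way to test an installed kernel package against this range is sketched below. It assumes purely numeric dotted versions with a leading numeric release, and it ignores any distribution suffix after the release number; real RPM version ordering has more rules than this, so treat the helper names `parse` and `is_affected` as illustrative only.

```python
import re

FIXED = "5.15.176.3-1"  # fixed version from the affected range above


def parse(version_release):
    """Turn 'X.Y.Z.W-R[.suffix]' into a tuple of ints for comparison."""
    version, _, release = version_release.partition("-")
    rel = re.match(r"\d+", release)
    rel_num = int(rel.group()) if rel else 0
    return tuple(int(part) for part in version.split(".")) + (rel_num,)


def is_affected(installed, fixed=FIXED):
    # Introduced is 0, so every version below the fixed one is in range.
    return parse(installed) < parse(fixed)


example = is_affected("5.15.173.1-2")
```

Here `example` is true because 5.15.173.1-2 sorts below the fixed version, while the fixed version itself falls outside the range.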

Database specific

source
"https://github.com/microsoft/AzureLinuxVulnerabilityData/blob/main/osv/AZL-51980.json"