OESA-2024-1896

Source
https://www.openeuler.org/en/security/security-bulletins/detail/?id=openEuler-SA-2024-1896
Import Source
https://repo.openeuler.org/security/data/osv/OESA-2024-1896.json
JSON Data
https://api.test.osv.dev/v1/vulns/OESA-2024-1896
Upstream
Published
2024-07-26T11:08:37Z
Modified
2025-08-12T05:36:02.324129Z
Summary
kernel security update
Details

The Linux Kernel, the operating system core itself.

Security Fix(es):

In the Linux kernel, the following vulnerability has been resolved:

lib/generic-radix-tree.c: Don't overflow in peek()

When we started spreading new inode numbers throughout most of the 64 bit inode space, that triggered some corner case bugs, in particular some integer overflows related to the radix tree code. Oops.(CVE-2021-47432)

In the Linux kernel, the following vulnerability has been resolved:

scsi: ufs: Fix a deadlock in the error handler

The following deadlock has been observed on a test setup:

  • All tags allocated

  • The SCSI error handler calls ufshcd_eh_host_reset_handler()

  • ufshcd_eh_host_reset_handler() queues work that calls ufshcd_err_handler()

  • ufshcd_err_handler() locks up as follows:

Workqueue: ufs_eh_wq_0 ufshcd_err_handler.cfi_jt
Call trace:
 __switch_to+0x298/0x5d8
 __schedule+0x6cc/0xa94
 schedule+0x12c/0x298
 blk_mq_get_tag+0x210/0x480
 __blk_mq_alloc_request+0x1c8/0x284
 blk_get_request+0x74/0x134
 ufshcd_exec_dev_cmd+0x68/0x640
 ufshcd_verify_dev_init+0x68/0x35c
 ufshcd_probe_hba+0x12c/0x1cb8
 ufshcd_host_reset_and_restore+0x88/0x254
 ufshcd_reset_and_restore+0xd0/0x354
 ufshcd_err_handler+0x408/0xc58
 process_one_work+0x24c/0x66c
 worker_thread+0x3e8/0xa4c
 kthread+0x150/0x1b4
 ret_from_fork+0x10/0x30

Fix this lockup by making ufshcd_exec_dev_cmd() allocate a reserved request.(CVE-2021-47622)

In the Linux kernel, the following vulnerability has been resolved:

net: dsa: seville: register the mdiobus under devres

As explained in commits:
74b6d7d13307 ("net: dsa: realtek: register the MDIO bus under devres")
5135e96a3dd2 ("net: dsa: don't allocate the slave_mii_bus using devres")

mdiobus_free() will panic when called from devm_mdiobus_free() <- devres_release_all() <- __device_release_driver(), and that mdiobus was not previously unregistered.

The Seville VSC9959 switch is a platform device, so the initial set of constraints that I thought would cause this (I2C or SPI buses which call ->remove on ->shutdown) do not apply. But there is one more which applies here.

If the DSA master itself is on a bus that calls ->remove from ->shutdown (like dpaa2-eth, which is on the fsl-mc bus), there is a device link between the switch and the DSA master, and device_links_unbind_consumers() will unbind the seville switch driver on shutdown.

So the same treatment must be applied to all DSA switch drivers, which is: either use devres for both the mdiobus allocation and registration, or don't use devres at all.

The seville driver has a code structure that could accommodate both the mdiobus_unregister and mdiobus_free calls, but it has an external dependency upon mscc_miim_setup() from mdio-mscc-miim.c, which calls devm_mdiobus_alloc_size() on its behalf. So rather than restructuring that, and exporting yet one more symbol mscc_miim_teardown(), let's work with devres and replace of_mdiobus_register with the devres variant. When we use all-devres, we can ensure that devres doesn't free a still-registered bus (it either runs both callbacks, or none).(CVE-2022-48814)

In the Linux kernel, the following vulnerability has been resolved:

SUNRPC: lock against ->sock changing during sysfs read

->sock can be set to NULL asynchronously unless ->recv_mutex is held. So it is important to hold that mutex. Otherwise a sysfs read can trigger an oops. Commit 17f09d3f619a ("SUNRPC: Check if the xprt is connected before handling sysfs reads") appears to attempt to fix this problem, but it only narrows the race window.(CVE-2022-48816)

In the Linux kernel, the following vulnerability has been resolved:

Bluetooth: hci_core: Fix leaking sent_cmd skb

sent_cmd memory is not freed before freeing hci_dev, causing it to leak its contents.(CVE-2022-48844)

In the Linux kernel, the following vulnerability has been resolved:

smb: client: fix potential deadlock when releasing mids

All release_mid() callers seem to hold a reference of @mid so there is no need to call kref_put(&mid->refcount, __release_mid) under @server->mid_lock spinlock. If they don't, then a use-after-free bug would have occurred anyway.

Getting rid of the spinlock also fixes a potential deadlock, as shown below:

CPU 0                                CPU 1

cifs_demultiplex_thread()            cifs_debug_data_proc_show()
 release_mid()
  spin_lock(&server->mid_lock);
                                     spin_lock(&cifs_tcp_ses_lock)
                                      spin_lock(&server->mid_lock)
  __release_mid()
   smb2_find_smb_tcon()
    spin_lock(&cifs_tcp_ses_lock) *deadlock*(CVE-2023-52757)

In the Linux kernel, the following vulnerability has been resolved:

usb: config: fix iteration issue in 'usb_get_bos_descriptor()'

The BOS descriptor defines a root descriptor and is the base descriptor for accessing a family of related descriptors.

Function 'usb_get_bos_descriptor()' encounters an iteration issue when skipping the 'USB_DT_DEVICE_CAPABILITY' descriptor type. This results in the same descriptor being read repeatedly.

To address this issue, a 'goto' statement is introduced to ensure that the pointer and the amount read is updated correctly. This ensures that the function iterates to the next descriptor instead of reading the same descriptor repeatedly.(CVE-2023-52781)
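The iteration problem can be illustrated with a minimal user-space sketch. This is not the kernel code: the function name count_device_caps() and the simplified descriptor layout (a length byte followed by a type byte) are assumptions made for illustration. The point is only that the cursor and the remaining length must advance on the skip path as well.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Type code of the DEVICE CAPABILITY descriptor (0x10 in the USB spec). */
#define DT_DEVICE_CAPABILITY 0x10

/*
 * Walk a buffer of descriptors laid out as [bLength][bDescriptorType]...
 * and count the device-capability descriptors. Descriptors of any other
 * type are skipped, but the cursor still advances past them.
 * count_device_caps() is a hypothetical helper, not a kernel function.
 */
static int count_device_caps(const uint8_t *buf, size_t total_len)
{
	int count = 0;

	assert(buf != NULL);

	while (total_len >= 2) {
		uint8_t len = buf[0];
		uint8_t type = buf[1];

		/* Malformed length: stop rather than loop or overrun. */
		if (len < 2 || len > total_len)
			break;

		if (type == DT_DEVICE_CAPABILITY)
			count++;

		/* Advance on every path, including the skip path. */
		buf += len;
		total_len -= len;
	}
	return count;
}
```

If the two advancing statements were reached only on the counting path, a skipped descriptor would be examined again on every pass, which is the behaviour the commit describes.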

In the Linux kernel, the following vulnerability has been resolved:

nfs: Handle error of rpc_proc_register() in nfs_net_init().

syzkaller reported a warning [0] triggered while destroying immature netns.

rpc_proc_register() was called in init_nfs_fs(), but its error has been ignored since at least the initial commit 1da177e4c3f4 ("Linux-2.6.12-rc2").

Recently, commit d47151b79e32 ("nfs: expose /proc/net/sunrpc/nfs in net namespaces") converted the procfs to per-netns and made the problem more visible.

Even when rpc_proc_register() fails, nfs_net_init() could succeed, and thus nfs_net_exit() will be called while destroying the netns.

Then, remove_proc_entry() will be called for non-existing proc directory and trigger the warning below.

Let's handle the error of rpc_proc_register() properly in nfs_net_init().

WARNING: CPU: 1 PID: 1710 at fs/proc/generic.c:711 removeprocentry+0x1bb/0x2d0 fs/proc/generic.c:711 Modules linked in: CPU: 1 PID: 1710 Comm: syz-executor.2 Not tainted 6.8.0-12822-gcd51db110a7e #12 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 RIP: 0010:removeprocentry+0x1bb/0x2d0 fs/proc/generic.c:711 Code: 41 5d 41 5e c3 e8 85 09 b5 ff 48 c7 c7 88 58 64 86 e8 09 0e 71 02 e8 74 09 b5 ff 4c 89 e6 48 c7 c7 de 1b 80 84 e8 c5 ad 97 ff <0f> 0b eb b1 e8 5c 09 b5 ff 48 c7 c7 88 58 64 86 e8 e0 0d 71 02 eb RSP: 0018:ffffc9000c6d7ce0 EFLAGS: 00010286 RAX: 0000000000000000 RBX: ffff8880422b8b00 RCX: ffffffff8110503c RDX: ffff888030652f00 RSI: ffffffff81105045 RDI: 0000000000000001 RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000001 R11: ffffffff81bb62cb R12: ffffffff84807ffc R13: ffff88804ad6fcc0 R14: ffffffff84807ffc R15: ffffffff85741ff8 FS: 00007f30cfba8640(0000) GS:ffff88807dd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007ff51afe8000 CR3: 000000005a60a005 CR4: 0000000000770ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <TASK> rpcprocunregister+0x64/0x70 net/sunrpc/stats.c:310 nfsnetexit+0x1c/0x30 fs/nfs/inode.c:2438 opsexitlist+0x62/0xb0 net/core/netnamespace.c:170 setupnet+0x46c/0x660 net/core/netnamespace.c:372 copynetns+0x244/0x590 net/core/netnamespace.c:505 createnewnamespaces+0x2ed/0x770 kernel/nsproxy.c:110 unsharensproxynamespaces+0xae/0x160 kernel/nsproxy.c:228 ksysunshare+0x342/0x760 kernel/fork.c:3322 _dosysunshare kernel/fork.c:3393 [inline] _sesysunshare kernel/fork.c:3391 [inline] _x64sysunshare+0x1f/0x30 kernel/fork.c:3391 dosyscallx64 arch/x86/entry/common.c:52 [inline] dosyscall64+0x4f/0x110 arch/x86/entry/common.c:83 entrySYSCALL64afterhwframe+0x46/0x4e RIP: 0033:0x7f30d0febe5d Code: ff 
c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 73 9f 1b 00 f7 d8 64 89 01 48 RSP: 002b:00007f30cfba7cc8 EFLAGS: 00000246 ORIG_RAX: 0000000000000110 RAX: ffffffffffffffda RBX: 00000000004bbf80 RCX: 00007f30d0febe5d RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000006c020600 RBP: 00000000004bbf80 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000002 R13: 000000000000000b R14: 00007f30d104c530 R15: 0000000000000000 </TASK>(CVE-2024-36939)

In the Linux kernel, the following vulnerability has been resolved:

scsi: qedf: Ensure the copied buf is NUL terminated

Currently, we allocate a count-sized kernel buffer and copy count bytes from userspace to that buffer. Later, we use kstrtouint on this buffer, but we don't ensure that the string is terminated inside the buffer; this can lead to an OOB read when using kstrtouint. Fix this issue by using memdup_user_nul instead of memdup_user.(CVE-2024-38559)
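A user-space analogue of the idea may help. The sketch below is not the driver code; dup_nul() and parse_uint() are hypothetical names, and strtoul() stands in for kstrtouint. It shows why the copy needs one extra byte for the terminator before a string parser touches it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Copy count bytes into a buffer of count + 1 bytes and terminate it,
 * so that a string parser cannot run past the end of the copy.
 */
static char *dup_nul(const char *src, size_t count)
{
	char *buf;

	assert(src != NULL);
	if (count == SIZE_MAX)		/* count + 1 would wrap around */
		return NULL;
	buf = malloc(count + 1);
	if (!buf)
		return NULL;
	memcpy(buf, src, count);
	buf[count] = '\0';
	return buf;
}

/* Parse an unsigned value from a length-delimited, unterminated input. */
static int parse_uint(const char *src, size_t count, unsigned long *out)
{
	char *end;
	char *buf = dup_nul(src, count);
	int ok;

	if (!buf)
		return -1;
	*out = strtoul(buf, &end, 10);
	/* Reject empty input and trailing garbage. */
	ok = (end != buf && *end == '\0');
	free(buf);
	return ok ? 0 : -1;
}
```

Without the terminator, the parser would keep reading whatever bytes follow the copied data.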

In the Linux kernel, the following vulnerability has been resolved:

drivers/perf: hisi: hns3: Fix out-of-bound access when valid event group

The perf tool allows users to create event groups through the following cmd [1], but the driver does not check whether the array index is out of bounds when writing data to the event_group array. If the number of events in an event_group is greater than HNS3_PMU_MAX_HW_EVENTS, the memory write overflow of event_group array occurs.

Add an array index check to fix the possible out-of-bounds violation, and return directly when new events would be written beyond the array bounds.

There are 9 different events in an event_group.
[1] perf stat -e '{pmu/event1/, ... ,pmu/event9/}'(CVE-2024-38568)
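The index check can be sketched as follows. This is an illustration only, not the driver's code: struct event_group, group_add_event(), and the limit of 8 are assumptions standing in for the driver's own structures and for HNS3_PMU_MAX_HW_EVENTS.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed limit for illustration; stands in for HNS3_PMU_MAX_HW_EVENTS. */
#define MAX_HW_EVENTS 8

struct event_group {
	int events[MAX_HW_EVENTS];
	size_t num;
};

/* Append an event, refusing to write once the array is full. */
static int group_add_event(struct event_group *g, int event)
{
	assert(g != NULL);
	if (g->num >= MAX_HW_EVENTS)
		return -1;	/* the write would land past the array */
	g->events[g->num++] = event;
	return 0;
}
```

With this check, a ninth event in a group is rejected instead of being stored one slot past the end of the array.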

In the Linux kernel, the following vulnerability has been resolved:

ecryptfs: Fix buffer size for tag 66 packet

The 'TAG 66 Packet Format' description is missing the cipher code and checksum fields that are packed into the message packet. As a result, the buffer allocated for the packet is 3 bytes too small and write_tag_66_packet() will write up to 3 bytes past the end of the buffer.

Fix this by increasing the size of the allocation so the whole packet will always fit in the buffer.

This fixes the below kasan slab-out-of-bounds bug:

BUG: KASAN: slab-out-of-bounds in ecryptfsgeneratekeypacketset+0x7d6/0xde0 Write of size 1 at addr ffff88800afbb2a5 by task touch/181

CPU: 0 PID: 181 Comm: touch Not tainted 6.6.13-gnu #1 4c9534092be820851bb687b82d1f92a426598dc6 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2/GNU Guix 04/01/2014 Call Trace: <TASK> dumpstacklvl+0x4c/0x70 printreport+0xc5/0x610 ? ecryptfsgeneratekeypacketset+0x7d6/0xde0 ? kasancompletemodereportinfo+0x44/0x210 ? ecryptfsgeneratekeypacketset+0x7d6/0xde0 kasanreport+0xc2/0x110 ? ecryptfsgeneratekeypacketset+0x7d6/0xde0 asanstore1+0x62/0x80 ecryptfsgeneratekeypacketset+0x7d6/0xde0 ? _pfxecryptfsgeneratekeypacketset+0x10/0x10 ? _allocpages+0x2e2/0x540 ? _pfxovlopen+0x10/0x10 [overlay 30837f11141636a8e1793533a02e6e2e885dad1d] ? dentryopen+0x8f/0xd0 ecryptfswritemetadata+0x30a/0x550 ? _pfxecryptfswritemetadata+0x10/0x10 ? ecryptfsgetlowerfile+0x6b/0x190 ecryptfsinitializefile+0x77/0x150 ecryptfscreate+0x1c2/0x2f0 pathopenat+0x17cf/0x1ba0 ? _pfxpathopenat+0x10/0x10 dofilpopen+0x15e/0x290 ? _pfxdofilpopen+0x10/0x10 ? _kasancheckwrite+0x18/0x30 ? rawspinlock+0x86/0xf0 ? _pfxrawspinlock+0x10/0x10 ? kasancheckwrite+0x18/0x30 ? allocfd+0xf4/0x330 dosysopenat2+0x122/0x160 ? _pfxdosysopenat2+0x10/0x10 _x64sysopenat+0xef/0x170 ? _pfxx64sysopenat+0x10/0x10 dosyscall64+0x60/0xd0 entrySYSCALL64afterhwframe+0x6e/0xd8 RIP: 0033:0x7f00a703fd67 Code: 25 00 00 41 00 3d 00 00 41 00 74 37 64 8b 04 25 18 00 00 00 85 c0 75 5b 44 89 e2 48 89 ee bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 85 00 00 00 48 83 c4 68 5d 41 5c c3 0f 1f RSP: 002b:00007ffc088e30b0 EFLAGS: 00000246 ORIG_RAX: 0000000000000101 RAX: ffffffffffffffda RBX: 00007ffc088e3368 RCX: 00007f00a703fd67 RDX: 0000000000000941 RSI: 00007ffc088e48d7 RDI: 00000000ffffff9c RBP: 00007ffc088e48d7 R08: 0000000000000001 R09: 0000000000000000 R10: 00000000000001b6 R11: 0000000000000246 R12: 0000000000000941 R13: 0000000000000000 R14: 00007ffc088e48d7 R15: 00007f00a7180040 </TASK>

Allocated by task 181: kasansavestack+0x2f/0x60 kasansettrack+0x29/0x40 kasansaveallocinfo+0x25/0x40 _kasankmalloc+0xc5/0xd0 _kmalloc+0x66/0x160 ecryptfsgeneratekeypacketset+0x6d2/0xde0 ecryptfswritemetadata+0x30a/0x550 ecryptfsinitializefile+0x77/0x150 ecryptfscreate+0x1c2/0x2f0 pathopenat+0x17cf/0x1ba0 dofilpopen+0x15e/0x290 dosysopenat2+0x122/0x160 _x64sysopenat+0xef/0x170 dosyscall64+0x60/0xd0 entrySYSCALL64after_hwframe+0x6e/0xd8(CVE-2024-38578)

In the Linux kernel, the following vulnerability has been resolved:

netrom: fix possible dead-lock in nr_rt_ioctl()

syzbot loves netrom, and found a possible deadlock in nr_rt_ioctl [1]

Make sure we always acquire nr_node_list_lock before nr_node_lock(nr_node)

[1] WARNING: possible circular locking dependency detected

6.9.0-rc7-syzkaller-02147-g654de42f3fc6 #0 Not tainted

syz-executor350/5129 is trying to acquire lock: ffff8880186e2070 (&nrnode->nodelock){+...}-{2:2}, at: spinlockbh include/linux/spinlock.h:356 [inline] ffff8880186e2070 (&nrnode->nodelock){+...}-{2:2}, at: nrnodelock include/net/netrom.h:152 [inline] ffff8880186e2070 (&nrnode->nodelock){+...}-{2:2}, at: nrdecobs net/netrom/nrroute.c:464 [inline] ffff8880186e2070 (&nrnode->nodelock){+...}-{2:2}, at: nrrtioctl+0x1bb/0x1090 net/netrom/nrroute.c:697

but task is already holding lock: ffffffff8f7053b8 (nrnodelistlock){+...}-{2:2}, at: spinlockbh include/linux/spinlock.h:356 [inline] ffffffff8f7053b8 (nrnodelistlock){+...}-{2:2}, at: nrdecobs net/netrom/nrroute.c:462 [inline] ffffffff8f7053b8 (nrnodelistlock){+...}-{2:2}, at: nrrtioctl+0x10a/0x1090 net/netrom/nr_route.c:697

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (nrnodelistlock){+...}-{2:2}: lockacquire+0x1ed/0x550 kernel/locking/lockdep.c:5754 _rawspinlockbh include/linux/spinlockapismp.h:126 [inline] rawspinlockbh+0x35/0x50 kernel/locking/spinlock.c:178 spinlockbh include/linux/spinlock.h:356 [inline] nrremovenode net/netrom/nrroute.c:299 [inline] nrdelnode+0x4b4/0x820 net/netrom/nrroute.c:355 nrrtioctl+0xa95/0x1090 net/netrom/nrroute.c:683 sockdoioctl+0x158/0x460 net/socket.c:1222 sockioctl+0x629/0x8e0 net/socket.c:1341 vfsioctl fs/ioctl.c:51 [inline] _dosysioctl fs/ioctl.c:904 [inline] _sesysioctl+0xfc/0x170 fs/ioctl.c:890 dosyscallx64 arch/x86/entry/common.c:52 [inline] dosyscall64+0xf5/0x240 arch/x86/entry/common.c:83 entrySYSCALL64after_hwframe+0x77/0x7f

-> #0 (&nrnode->nodelock){+...}-{2:2}: checkprevadd kernel/locking/lockdep.c:3134 [inline] checkprevsadd kernel/locking/lockdep.c:3253 [inline] validatechain+0x18cb/0x58e0 kernel/locking/lockdep.c:3869 _lockacquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137 lockacquire+0x1ed/0x550 kernel/locking/lockdep.c:5754 _rawspinlockbh include/linux/spinlockapismp.h:126 [inline] rawspinlockbh+0x35/0x50 kernel/locking/spinlock.c:178 spinlockbh include/linux/spinlock.h:356 [inline] nrnodelock include/net/netrom.h:152 [inline] nrdecobs net/netrom/nrroute.c:464 [inline] nrrtioctl+0x1bb/0x1090 net/netrom/nrroute.c:697 sockdoioctl+0x158/0x460 net/socket.c:1222 sockioctl+0x629/0x8e0 net/socket.c:1341 vfsioctl fs/ioctl.c:51 [inline] _dosysioctl fs/ioctl.c:904 [inline] _sesysioctl+0xfc/0x170 fs/ioctl.c:890 dosyscallx64 arch/x86/entry/common.c:52 [inline] dosyscall64+0xf5/0x240 arch/x86/entry/common.c:83 entrySYSCALL64afterhwframe+0x77/0x7f

other info that might help us debug this:

Possible unsafe locking scenario:

   CPU0                    CPU1
   ----                    ----
   lock(nr_node_list_lock);
                           lock(&nr_node->node_lock);
                           lock(nr_node_list_lock);
   lock(&nr_node->node_lock);

*** DEADLOCK ***

1 lock held by syz-executor350/5129: #0: ffffffff8f7053b8 (nrnodelistlock){+...}-{2:2}, at: spinlockbh include/linux/spinlock.h:356 [inline] #0: ffffffff8f7053b8 (nrnodelistlock){+...}-{2:2}, at: nrdecobs net/netrom/nr_route.c:462 [inline] #0: ffffffff8f70 ---truncated---(CVE-2024-38589)

In the Linux kernel, the following vulnerability has been resolved:

ALSA: timer: Set lower bound of start tick time

Currently the ALSA timer doesn't have a lower limit on the start tick time, and it allows a very small size, e.g. 1 tick with 1ns resolution for hrtimer. Such a situation may lead to an unexpected RCU stall, in which the callback repeatedly queues the expire update, as reported by a fuzzer.

This patch introduces a sanity check of the timer start tick time, so that the system returns an error when a too small start size is set. As of this patch, the lower limit is hard-coded to 100us, which is small enough but can still work somehow.(CVE-2024-38618)
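The sanity check can be sketched in a few lines. The sketch assumes the start time is the product of the timer resolution (in nanoseconds) and the tick count; check_start_time() and its parameters are hypothetical names, and treating a zero resolution as an error is an additional assumption.

```c
#include <errno.h>

/* Lower limit named in the text: 100us, expressed in nanoseconds. */
#define MIN_START_NS 100000ULL

/*
 * Reject a start request whose total period (resolution * ticks) is
 * below the limit.
 */
static int check_start_time(unsigned long long resolution_ns,
			    unsigned long long ticks)
{
	unsigned long long min_ticks;

	if (resolution_ns == 0)
		return -EINVAL;

	/* Round up MIN_START_NS / resolution_ns; dividing avoids overflow. */
	min_ticks = MIN_START_NS / resolution_ns +
		    (MIN_START_NS % resolution_ns != 0);

	return ticks < min_ticks ? -EINVAL : 0;
}
```

Under these assumptions, the example from the text (1 tick at 1ns resolution) is rejected, while 100 ticks at 1000ns resolution is accepted.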

In the Linux kernel, the following vulnerability has been resolved:

usb-storage: alauda: Check whether the media is initialized

The member "uzonesize" of struct alauda_info will remain 0 if alauda_init_media() fails, potentially causing divide errors in alauda_read_data() and alauda_write_lba().

  • Add a member "media_initialized" to struct alauda_info.

  • Change a condition in alauda_check_media() to ensure the first initialization.

  • Add an error check for the return value of alauda_init_media().(CVE-2024-38619)
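The divide error and its guard can be shown with a reduced sketch. struct media_info and lba_to_zone() are hypothetical stand-ins, not the driver's definitions; only the field names uzonesize and media_initialized come from the text.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Reduced, hypothetical stand-in for the driver's per-device info. */
struct media_info {
	unsigned int uzonesize;		/* stays 0 until initialization succeeds */
	bool media_initialized;
};

/* Compute the zone of an LBA, refusing to divide before initialization. */
static int lba_to_zone(const struct media_info *info, unsigned int lba,
		       unsigned int *zone)
{
	assert(info != NULL && zone != NULL);
	if (!info->media_initialized || info->uzonesize == 0)
		return -EIO;
	*zone = lba / info->uzonesize;
	return 0;
}
```

Tracking initialization explicitly means a failed setup is reported as an error rather than surfacing later as a division by zero.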

In the Linux kernel, the following vulnerability has been resolved:

nilfs2: fix nilfs_empty_dir() misjudgment and long loop on I/O errors

The error handling in nilfs_empty_dir() when a directory folio/page read fails is incorrect, as in the old ext2 implementation, and if the folio/page cannot be read or nilfs_check_folio() fails, it will falsely determine the directory as empty and corrupt the file system.

In addition, since nilfs_empty_dir() does not immediately return on a failed folio/page read, but continues to loop, this can cause a long loop with I/O if i_size of the directory's inode is also corrupted, causing the log writer thread to wait and hang, as reported by syzbot.

Fix these issues by making nilfs_empty_dir() immediately return a false value (0) if it fails to get a directory folio/page.(CVE-2024-39469)

In the Linux kernel, the following vulnerability has been resolved:

xfs: fix log recovery buffer allocation for the legacy h_size fixup

Commit a70f9fe52daa ("xfs: detect and handle invalid iclog size set by mkfs") added a fixup for incorrect h_size values used for the initial umount record in old xfsprogs versions. Later commit 0c771b99d6c9 ("xfs: clean up calculation of LR header blocks") cleaned up the log recover buffer calculation, but stopped using the fixed up h_size value to size the log recovery buffer, which can lead to an out of bounds access when the incorrect h_size does not come from the old mkfs tool, but a fuzzer.

Fix this by open coding xlog_logrec_hblks and taking the fixed h_size into account for this calculation.(CVE-2024-39472)

In the Linux kernel, the following vulnerability has been resolved:

ima: Fix use-after-free on a dentry's d_name.name

->d_name.name can change on rename and the earlier value can be freed; there are conditions sufficient to stabilize it (->d_lock on dentry, ->d_lock on its parent, ->i_rwsem exclusive on the parent's inode, rename_lock), but none of those are met at any of the sites. Take a stable snapshot of the name instead.(CVE-2024-39494)

In the Linux kernel, the following vulnerability has been resolved:

vmci: prevent speculation leaks by sanitizing event in event_deliver()

Coverity spotted that event_msg is controlled by user-space, event_msg->event_data.event is passed to event_deliver() and used as an index without sanitization.

This change ensures that the event index is sanitized to mitigate any possibility of speculative information leaks.

This bug was discovered and resolved using Coverity Static Analysis Security Testing (SAST) by Synopsys, Inc.

Only compile tested, no access to HW.(CVE-2024-39499)

In the Linux kernel, the following vulnerability has been resolved:

drm/komeda: check for error-valued pointer

komeda_pipeline_get_state() may return an error-valued pointer, thus check the pointer for negative or null value before dereferencing.(CVE-2024-39505)
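For readers unfamiliar with error-valued pointers, the following user-space sketch imitates the kernel convention in which the top 4095 addresses encode negative errno values. The helpers err_ptr(), ptr_err(), is_err_or_null(), get_state(), and read_value() are hypothetical imitations written for this example, not the kernel's macros or the komeda driver's functions.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* The top MAX_ERRNO addresses are reserved for encoded error codes. */
#define MAX_ERRNO 4095

static inline void *err_ptr(long err) { return (void *)(intptr_t)err; }
static inline long ptr_err(const void *p) { return (long)(intptr_t)p; }
static inline int is_err_or_null(const void *p)
{
	return !p || (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

struct state { int value; };

/* Hypothetical getter: returns NULL, an error pointer, or a valid state. */
static struct state *get_state(struct state *s, int fail)
{
	return fail ? err_ptr(-ENOMEM) : s;
}

/* Check the returned pointer before dereferencing it. */
static int read_value(struct state *s, int fail, int *out)
{
	struct state *st = get_state(s, fail);

	if (is_err_or_null(st))
		return st ? (int)ptr_err(st) : -EINVAL;
	*out = st->value;
	return 0;
}
```

A plain NULL check would let the error pointer through, and the dereference would then touch an invalid address near the top of the address space.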

In the Linux kernel, the following vulnerability has been resolved:

USB: class: cdc-wdm: Fix CPU lockup caused by excessive log messages

The syzbot fuzzer found that the interrupt-URB completion callback in the cdc-wdm driver was taking too long, and the driver's immediate resubmission of interrupt URBs with -EPROTO status combined with the dummy-hcd emulation to cause a CPU lockup:

cdcwdm 1-1:1.0: nonzero urb status received: -71 cdcwdm 1-1:1.0: wdmintcallback - 0 bytes watchdog: BUG: soft lockup - CPU#0 stuck for 26s! [syz-executor782:6625] CPU#0 Utilization every 4s during lockup: #1: 98% system, 0% softirq, 3% hardirq, 0% idle #2: 98% system, 0% softirq, 3% hardirq, 0% idle #3: 98% system, 0% softirq, 3% hardirq, 0% idle #4: 98% system, 0% softirq, 3% hardirq, 0% idle #5: 98% system, 1% softirq, 3% hardirq, 0% idle Modules linked in: irq event stamp: 73096 hardirqs last enabled at (73095): [<ffff80008037bc00>] consoleemitnextrecord kernel/printk/printk.c:2935 [inline] hardirqs last enabled at (73095): [<ffff80008037bc00>] consoleflushall+0x650/0xb74 kernel/printk/printk.c:2994 hardirqs last disabled at (73096): [<ffff80008af10b00>] _el1irq arch/arm64/kernel/entry-common.c:533 [inline] hardirqs last disabled at (73096): [<ffff80008af10b00>] el1interrupt+0x24/0x68 arch/arm64/kernel/entry-common.c:551 softirqs last enabled at (73048): [<ffff8000801ea530>] softirqhandleend kernel/softirq.c:400 [inline] softirqs last enabled at (73048): [<ffff8000801ea530>] handlesoftirqs+0xa60/0xc34 kernel/softirq.c:582 softirqs last disabled at (73043): [<ffff800080020de8>] _do_softirq+0x14/0x20 kernel/softirq.c:588 CPU: 0 PID: 6625 Comm: syz-executor782 Tainted: G W 6.10.0-rc2-syzkaller-g8867bbd4a056 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/02/2024

Testing showed that the problem did not occur if the two error messages -- the first two lines above -- were removed; apparently adding material to the kernel log takes a surprisingly large amount of time.

In any case, the best approach for preventing these lockups and avoiding spamming the log with thousands of error messages per second is to ratelimit the two dev_err() calls. Therefore we replace them with dev_err_ratelimited().(CVE-2024-40904)

In the Linux kernel, the following vulnerability has been resolved:

ipv6: fix possible race in __fib6_drop_pcpu_from()

syzbot found a race in __fib6_drop_pcpu_from() [1]

If the compiler reads (*ppcpu_rt) more than once, the second read could return NULL if another CPU clears the value in rt6_get_pcpu_route().

Add a READ_ONCE() to prevent this race.

Also add rcu_read_lock()/rcu_read_unlock() because we rely on RCU protection while dereferencing pcpu_rt.

[1]

Oops: general protection fault, probably for non-canonical address 0xdffffc0000000012: 0000 [#1] PREEMPT SMP KASAN PTI KASAN: null-ptr-deref in range [0x0000000000000090-0x0000000000000097] CPU: 0 PID: 7543 Comm: kworker/u8:17 Not tainted 6.10.0-rc1-syzkaller-00013-g2bfcfd584ff5 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/02/2024 Workqueue: netns cleanupnet RIP: 0010:fib6droppcpufrom.part.0+0x10a/0x370 net/ipv6/ip6fib.c:984 Code: f8 48 c1 e8 03 80 3c 28 00 0f 85 16 02 00 00 4d 8b 3f 4d 85 ff 74 31 e8 74 a7 fa f7 49 8d bf 90 00 00 00 48 89 f8 48 c1 e8 03 <80> 3c 28 00 0f 85 1e 02 00 00 49 8b 87 90 00 00 00 48 8b 0c 24 48 RSP: 0018:ffffc900040df070 EFLAGS: 00010206 RAX: 0000000000000012 RBX: 0000000000000001 RCX: ffffffff89932e16 RDX: ffff888049dd1e00 RSI: ffffffff89932d7c RDI: 0000000000000091 RBP: dffffc0000000000 R08: 0000000000000005 R09: 0000000000000007 R10: 0000000000000001 R11: 0000000000000006 R12: ffff88807fa080b8 R13: fffffbfff1a9a07d R14: ffffed100ff41022 R15: 0000000000000001 FS: 0000000000000000(0000) GS:ffff8880b9200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000001b32c26000 CR3: 000000005d56e000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> _fib6droppcpufrom net/ipv6/ip6fib.c:966 [inline] fib6droppcpufrom net/ipv6/ip6fib.c:1027 [inline] fib6purgert+0x7f2/0x9f0 net/ipv6/ip6fib.c:1038 fib6delroute net/ipv6/ip6fib.c:1998 [inline] fib6del+0xa70/0x17b0 net/ipv6/ip6fib.c:2043 fib6cleannode+0x426/0x5b0 net/ipv6/ip6fib.c:2205 fib6walkcontinue+0x44f/0x8d0 net/ipv6/ip6fib.c:2127 fib6walk+0x182/0x370 net/ipv6/ip6fib.c:2175 fib6cleantree+0xd7/0x120 net/ipv6/ip6fib.c:2255 _fib6cleanall+0x100/0x2d0 net/ipv6/ip6fib.c:2271 rt6syncdowndev net/ipv6/route.c:4906 [inline] rt6disableip+0x7ed/0xa00 net/ipv6/route.c:4911 addrconfifdown.isra.0+0x117/0x1b40 
net/ipv6/addrconf.c:3855 addrconfnotify+0x223/0x19e0 net/ipv6/addrconf.c:3778 notifiercallchain+0xb9/0x410 kernel/notifier.c:93 callnetdevicenotifiersinfo+0xbe/0x140 net/core/dev.c:1992 callnetdevicenotifiersextack net/core/dev.c:2030 [inline] callnetdevicenotifiers net/core/dev.c:2044 [inline] devclosemany+0x333/0x6a0 net/core/dev.c:1585 unregisternetdevicemanynotify+0x46d/0x19f0 net/core/dev.c:11193 unregisternetdevicemany net/core/dev.c:11276 [inline] defaultdeviceexitbatch+0x85b/0xae0 net/core/dev.c:11759 opsexitlist+0x128/0x180 net/core/netnamespace.c:178 cleanupnet+0x5b7/0xbf0 net/core/netnamespace.c:640 processonework+0x9fb/0x1b60 kernel/workqueue.c:3231 processscheduledworks kernel/workqueue.c:3312 [inline] workerthread+0x6c8/0xf70 kernel/workqueue.c:3393 kthread+0x2c1/0x3a0 kernel/kthread.c:389 retfromfork+0x45/0x80 arch/x86/kernel/process.c:147 retfromforkasm+0x1a/0x30 arch/x86/entry/entry64.S:244(CVE-2024-40905)

In the Linux kernel, the following vulnerability has been resolved:

wifi: mac80211: Fix deadlock in ieee80211_sta_ps_deliver_wakeup()

The ieee80211_sta_ps_deliver_wakeup() function takes sta->ps_lock to synchronize with ieee80211_tx_h_unicast_ps_buf(), which is called from softirq context. However, using only spin_lock() to get sta->ps_lock in ieee80211_sta_ps_deliver_wakeup() does not prevent a softirq from executing on the same CPU, running ieee80211_tx_h_unicast_ps_buf() and trying to take the same lock, ending in deadlock. Below is an example of the RCU stall that arises in such a situation.

rcu: INFO: rcusched self-detected stall on CPU rcu: 2-....: (42413413 ticks this GP) idle=b154/1/0x4000000000000000 softirq=1763/1765 fqs=21206996 rcu: (t=42586894 jiffies g=2057 q=362405 ncpus=4) CPU: 2 PID: 719 Comm: wpasupplicant Tainted: G W 6.4.0-02158-g1b062f552873 #742 Hardware name: RPT (r1) (DT) pstate: 00000005 (nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : queuedspinlockslowpath+0x58/0x2d0 lr : invoketxhandlersearly+0x5b4/0x5c0 sp : ffff00001ef64660 x29: ffff00001ef64660 x28: ffff000009bc1070 x27: ffff000009bc0ad8 x26: ffff000009bc0900 x25: ffff00001ef647a8 x24: 0000000000000000 x23: ffff000009bc0900 x22: ffff000009bc0900 x21: ffff00000ac0e000 x20: ffff00000a279e00 x19: ffff00001ef646e8 x18: 0000000000000000 x17: ffff800016468000 x16: ffff00001ef608c0 x15: 0010533c93f64f80 x14: 0010395c9faa3946 x13: 0000000000000000 x12: 00000000fa83b2da x11: 000000012edeceea x10: ffff0000010fbe00 x9 : 0000000000895440 x8 : 000000000010533c x7 : ffff00000ad8b740 x6 : ffff00000c350880 x5 : 0000000000000007 x4 : 0000000000000001 x3 : 0000000000000000 x2 : 0000000000000000 x1 : 0000000000000001 x0 : ffff00000ac0e0e8 Call trace: queuedspinlockslowpath+0x58/0x2d0 ieee80211tx+0x80/0x12c ieee80211txpending+0x110/0x278 taskletactioncommon.constprop.0+0x10c/0x144 taskletaction+0x20/0x28 _stext+0x11c/0x284 dosoftirq+0xc/0x14 callonirqstack+0x24/0x34 dosoftirqownstack+0x18/0x20 dosoftirq+0x74/0x7c _localbhenableip+0xa0/0xa4 _ieee80211waketxqs+0x3b0/0x4b8 _ieee80211wakequeue+0x12c/0x168 ieee80211addpendingskbs+0xec/0x138 ieee80211stapsdeliverwakeup+0x2a4/0x480 ieee80211mpsstastatusupdate.part.0+0xd8/0x11c ieee80211mpsstastatusupdate+0x18/0x24 staapplyparameters+0x3bc/0x4c0 ieee80211changestation+0x1b8/0x2dc nl80211setstation+0x444/0x49c genlfamilyrcvmsgdoit.isra.0+0xa4/0xfc genlrcvmsg+0x1b0/0x244 netlinkrcvskb+0x38/0x10c genlrcv+0x34/0x48 netlinkunicast+0x254/0x2bc netlinksendmsg+0x190/0x3b4 syssendmsg+0x1e8/0x218 _syssendmsg+0x68/0x8c _syssendmsg+0x44/0x84 
_arm64syssendmsg+0x20/0x28 doel0svc+0x6c/0xe8 el0svc+0x14/0x48 el0t64synchandler+0xb0/0xb4 el0t64_sync+0x14c/0x150

Using spin_lock_bh()/spin_unlock_bh() instead prevents softirqs from being raised on the same CPU that is holding the lock.(CVE-2024-40912)

In the Linux kernel, the following vulnerability has been resolved:

wifi: iwlwifi: mvm: check n_ssids before accessing the ssids

In some versions of cfg80211, the ssids pointer might be a valid one even though n_ssids is 0. Accessing the pointer in this case will cause an out-of-bound access. Fix this by checking n_ssids first.(CVE-2024-40929)
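The ordering of the checks can be sketched as below. The structures are reduced, hypothetical stand-ins (only the field names ssids and n_ssids come from the text), and first_ssid_len() is an invented helper.

```c
#include <assert.h>
#include <stddef.h>

struct ssid { unsigned char len; };

/* Hypothetical request layout: ssids may be non-NULL even when n_ssids is 0. */
struct scan_request {
	const struct ssid *ssids;
	int n_ssids;
};

/* Return the first SSID length, or 0 when the request carries no SSIDs. */
static unsigned char first_ssid_len(const struct scan_request *req)
{
	assert(req != NULL);
	/* Check the count first; a non-NULL pointer does not prove validity. */
	if (req->n_ssids < 1 || !req->ssids)
		return 0;
	return req->ssids[0].len;
}
```

The count, not the pointer, decides whether element 0 exists.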

In the Linux kernel, the following vulnerability has been resolved:

drm/exynos/vidi: fix memory leak in .get_modes()

The duplicated EDID is never freed. Fix it.(CVE-2024-40932)

In the Linux kernel, the following vulnerability has been resolved:

wifi: iwlwifi: mvm: don't read past the mfuart notification

In case the firmware sends a notification that claims it has more data than it has, we will read past what was allocated for the notification. Remove the print of the buffer; we won't see it by default. If needed, we can see the content with tracing.

This was reported by KFENCE.(CVE-2024-40941)

In the Linux kernel, the following vulnerability has been resolved:

ocfs2: fix races between hole punching and AIO+DIO

After commit "ocfs2: return real error code in ocfs2_dio_wr_get_block", fstests/generic/300 went from always failing to sometimes failing:

========================================================================
[ 473.293420 ] run fstests generic/300

[ 475.296983 ] JBD2: Ignoring recovery information on journal
[ 475.302473 ] ocfs2: Mounting device (253,1) on (node local, slot 0) with ordered data mode.
[ 494.290998 ] OCFS2: ERROR (device dm-1): ocfs2_change_extent_flag: Owner 5668 has an extent at cpos 78723 which can no longer be found
[ 494.291609 ] On-disk corruption discovered. Please run fsck.ocfs2 once the filesystem is unmounted.
[ 494.292018 ] OCFS2: File system is now read-only.
[ 494.292224 ] (kworker/19:11,2628,19):ocfs2_mark_extent_written:5272 ERROR: status = -30
[ 494.292602 ] (kworker/19:11,2628,19):ocfs2_dio_end_io_write:2374 ERROR: status = -3

fio: io_u error on file /mnt/scratch/racer: Read-only file system: write offset=460849152, buflen=131072

In __blockdev_direct_IO, ocfs2_dio_wr_get_block is called to add unwritten extents to a list. Extents are also inserted into the extent tree in ocfs2_write_begin_nolock. Then another thread calls fallocate to punch a hole at one of the unwritten extents. The extent at cpos was removed by ocfs2_remove_extent(). In the end-io worker thread, ocfs2_search_extent_list found there is no such extent at the cpos.

T1                        T2                T3
                          inode lock
                            ...
                            insert extents
                            ...
                          inode unlock
ocfs2_fallocate
 __ocfs2_change_file_space
  inode lock
  lock ip_alloc_sem
  ocfs2_remove_inode_range inode
   ocfs2_remove_btree_range
    ocfs2_remove_extent
    ^---remove the extent at cpos 78723
  ...
  unlock ip_alloc_sem
  inode unlock
                                            ocfs2_dio_end_io
                                             ocfs2_dio_end_io_write
                                              lock ip_alloc_sem
                                              ocfs2_mark_extent_written
                                               ocfs2_change_extent_flag
                                                ocfs2_search_extent_list
                                                ^---failed to find extent
                                               ...
                                               unlock ip_alloc_sem

Most filesystems do not support fallocate racing with AIO+DIO, so fix this by waiting for all outstanding DIO to complete before fallocate/punch_hole, as ext4 does.(CVE-2024-40943)

In the Linux kernel, the following vulnerability has been resolved:

MIPS: Octeon: Add PCIe link status check

If the standard PCIe configuration read-write interface is used to access the configuration space of peripheral PCIe devices on a MIPS processor after a surprise PCIe link-down, it can cause a kernel panic due to a "Data bus error". A PCIe link status check is therefore needed to protect the system. When the PCIe link is down or in training, assigning a value of 0 to the configuration address prevents reads and writes to the configuration space of peripheral PCIe devices, thereby preventing the kernel panic.(CVE-2024-40968)

In the Linux kernel, the following vulnerability has been resolved:

powerpc/pseries: Enforce hcall result buffer validity and size

plpar_hcall(), plpar_hcall9(), and related functions expect callers to provide valid result buffers of certain minimum size. Currently this is communicated only through comments in the code and the compiler has no idea.

For example, if I write a bug like this:

long retbuf[PLPAR_HCALL_BUFSIZE]; // should be PLPAR_HCALL9_BUFSIZE
plpar_hcall9(H_ALLOCATE_VAS_WINDOW, retbuf, ...);

This compiles with no diagnostics emitted, but likely results in stack corruption at runtime when plpar_hcall9() stores results past the end of the array. (To be clear this is a contrived example and I have not found a real instance yet.)

To make this class of error less likely, we can use explicitly-sized array parameters instead of pointers in the declarations for the hcall APIs. When compiled with -Warray-bounds[1], the code above now provokes a diagnostic like this:

error: array argument is too small; is of size 32, callee requires at least 72 [-Werror,-Warray-bounds]
   60 |                 plpar_hcall9(H_ALLOCATE_VAS_WINDOW, retbuf,
      |                 ^                                   ~~~~~~

[1] Enabled for LLVM builds but not GCC for now. See commit 0da6e5fd6c37 ("gcc: disable '-Warray-bounds' for gcc-13 too") and related changes.(CVE-2024-40974)
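The idea of an explicitly-sized array parameter can be sketched in plain C. This is only an illustration: it assumes the C99 `static` array-parameter syntax as one way to express a minimum size, and `fake_hcall9`, `HCALL_BUFSIZE`, and `HCALL9_BUFSIZE` are hypothetical stand-ins rather than the kernel's definitions. The sizes 4 and 9 are chosen to match the 32 and 72 bytes in the diagnostic above, assuming an 8-byte `unsigned long`.

```c
#include <assert.h>
#include <stddef.h>

#define HCALL_BUFSIZE  4   /* hypothetical: 4 * 8 bytes = 32 */
#define HCALL9_BUFSIZE 9   /* hypothetical: 9 * 8 bytes = 72 */

/*
 * Hypothetical stand-in for an hcall wrapper. The [static N] form tells
 * the compiler that callers must pass at least N elements; the parameter
 * itself is still an ordinary pointer at run time.
 */
long fake_hcall9(unsigned long opcode,
                 unsigned long retbuf[static HCALL9_BUFSIZE])
{
    for (size_t i = 0; i < HCALL9_BUFSIZE; i++)
        retbuf[i] = opcode + i;   /* fills all nine result slots */
    return 0;
}

/* Correct caller: the result buffer has the size the callee requires. */
unsigned long call_with_correct_buffer(void)
{
    unsigned long retbuf[HCALL9_BUFSIZE];
    long rc = fake_hcall9(100, retbuf);

    assert(rc == 0);
    return retbuf[HCALL9_BUFSIZE - 1];
}
```

Declaring the caller's buffer as `unsigned long retbuf[HCALL_BUFSIZE]` instead reproduces the mistake from the commit message, and a compiler that implements the check can then report it at the call site instead of leaving it to corrupt the stack at run time.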

In the Linux kernel, the following vulnerability has been resolved:

tipc: force a dst refcount before doing decryption

As it says in commit 3bc07321ccc2 ("xfrm: Force a dst refcount before entering the xfrm type handlers"):

"Crypto requests might return asynchronous. In this case we leave the rcu protected region, so force a refcount on the skb's destination entry before we enter the xfrm type input/output handlers."

The TIPC decryption path has the same problem, so skb_dst_force() should be called before decryption to avoid a possible crash.

Shuang reported this issue when this warning is triggered:

[] WARNING: include/net/dst.h:337 tipc_sk_rcv+0x1055/0x1ea0 [tipc]
[] Kdump: loaded Tainted: G W --------- - - 4.18.0-496.el8.x86_64+debug
[] Workqueue: crypto cryptd_queue_worker
[] RIP: 0010:tipc_sk_rcv+0x1055/0x1ea0 [tipc]
[] Call Trace:
[] tipc_sk_mcast_rcv+0x548/0xea0 [tipc]
[] tipc_rcv+0xcf5/0x1060 [tipc]
[] tipc_aead_decrypt_done+0x215/0x2e0 [tipc]
[] cryptd_aead_crypt+0xdb/0x190
[] cryptd_queue_worker+0xed/0x190
[] process_one_work+0x93d/0x17e0(CVE-2024-40983)

In the Linux kernel, the following vulnerability has been resolved:

ACPICA: Revert "ACPICA: avoid Info: mapping multiple BARs. Your kernel is fine."

Undo the modifications made in commit d410ee5109a1 ("ACPICA: avoid "Info: mapping multiple BARs. Your kernel is fine.""). The initial purpose of this commit was to stop memory mappings for operation regions from overlapping page boundaries, as it can trigger warnings if different page attributes are present.

However, it was found that when this situation arises, mapping continues only until the end of the page boundary, but there is still an attempt to read/write the entire length of the map, leading to a NULL pointer dereference. For example, if a four-byte mapping request is made but only one byte is mapped because it hits the end of the current page, a four-byte read/write attempt is still made, resulting in a NULL pointer dereference.

Instead, map the entire length, as the ACPI specification does not mandate that it must be within the same page boundary. It is permissible for it to be mapped across different regions.(CVE-2024-40984)
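The four-byte example can be worked through with a small model. This is a sketch of the arithmetic only, assuming a 4096-byte page; `FAKE_PAGE_SIZE` and the function names are hypothetical and do not come from ACPICA.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FAKE_PAGE_SIZE 4096u   /* assumed page size for the illustration */

/* Bytes from addr up to the end of the page that contains addr. */
size_t bytes_to_page_end(uint64_t addr)
{
    size_t room = (size_t)(FAKE_PAGE_SIZE - (addr % FAKE_PAGE_SIZE));

    assert(room >= 1 && room <= FAKE_PAGE_SIZE);
    return room;
}

/* Model of the reverted behavior: the mapping is clipped at the page end. */
size_t clipped_map_len(uint64_t addr, size_t requested)
{
    size_t room = bytes_to_page_end(addr);

    return requested < room ? requested : room;
}

/* True when an access of `requested` bytes runs past what was mapped. */
bool access_exceeds_mapping(uint64_t addr, size_t requested)
{
    return requested > clipped_map_len(addr, requested);
}
```

With an address whose page offset is 0xfff, a four-byte request gets a one-byte mapping, so the four-byte access runs past it. Mapping the entire requested length, as the revert does, removes that mismatch.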

In the Linux kernel, the following vulnerability has been resolved:

drm/amdgpu: fix UBSAN warning in kv_dpm.c

Add a bounds check for sumo_vid_mapping_entry.(CVE-2024-40987)

In the Linux kernel, the following vulnerability has been resolved:

tracing: Build event generation tests only as modules

The kprobes and synth event generation test modules add events and lock (take a reference on) the corresponding event files in their module init functions, then unlock and delete them in their module exit functions. This is because they are designed to run as modules.

If these modules are built in, the events are left locked in the kernel and are never removed. This causes the kprobe event self-test to fail as shown below.

[ 97.349708] ------------[ cut here ]------------
[ 97.353453] WARNING: CPU: 3 PID: 1 at kernel/trace/trace_kprobe.c:2133 kprobe_trace_self_tests_init+0x3f1/0x480
[ 97.357106] Modules linked in:
[ 97.358488] CPU: 3 PID: 1 Comm: swapper/0 Not tainted 6.9.0-g699646734ab5-dirty #14
[ 97.361556] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
[ 97.363880] RIP: 0010:kprobe_trace_self_tests_init+0x3f1/0x480
[ 97.365538] Code: a8 24 08 82 e9 ae fd ff ff 90 0f 0b 90 48 c7 c7 e5 aa 0b 82 e9 ee fc ff ff 90 0f 0b 90 48 c7 c7 2d 61 06 82 e9 8e fd ff ff 90 <0f> 0b 90 48 c7 c7 33 0b 0c 82 89 c6 e8 6e 03 1f ff 41 ff c7 e9 90
[ 97.370429] RSP: 0000:ffffc90000013b50 EFLAGS: 00010286
[ 97.371852] RAX: 00000000fffffff0 RBX: ffff888005919c00 RCX: 0000000000000000
[ 97.373829] RDX: ffff888003f40000 RSI: ffffffff8236a598 RDI: ffff888003f40a68
[ 97.375715] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
[ 97.377675] R10: ffffffff811c9ae5 R11: ffffffff8120c4e0 R12: 0000000000000000
[ 97.379591] R13: 0000000000000001 R14: 0000000000000015 R15: 0000000000000000
[ 97.381536] FS: 0000000000000000(0000) GS:ffff88807dcc0000(0000) knlGS:0000000000000000
[ 97.383813] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 97.385449] CR2: 0000000000000000 CR3: 0000000002244000 CR4: 00000000000006b0
[ 97.387347] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 97.389277] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 97.391196] Call Trace:
[ 97.391967] <TASK>
[ 97.392647] ? __warn+0xcc/0x180
[ 97.393640] ? kprobe_trace_self_tests_init+0x3f1/0x480
[ 97.395181] ? report_bug+0xbd/0x150
[ 97.396234] ? handle_bug+0x3e/0x60
[ 97.397311] ? exc_invalid_op+0x1a/0x50
[ 97.398434] ? asm_exc_invalid_op+0x1a/0x20
[ 97.399652] ? trace_kprobe_is_busy+0x20/0x20
[ 97.400904] ? tracing_reset_all_online_cpus+0x15/0x90
[ 97.402304] ? kprobe_trace_self_tests_init+0x3f1/0x480
[ 97.403773] ? init_kprobe_trace+0x50/0x50
[ 97.404972] do_one_initcall+0x112/0x240
[ 97.406113] do_initcall_level+0x95/0xb0
[ 97.407286] ? kernel_init+0x1a/0x1a0
[ 97.408401] do_initcalls+0x3f/0x70
[ 97.409452] kernel_init_freeable+0x16f/0x1e0
[ 97.410662] ? rest_init+0x1f0/0x1f0
[ 97.411738] kernel_init+0x1a/0x1a0
[ 97.412788] ret_from_fork+0x39/0x50
[ 97.413817] ? rest_init+0x1f0/0x1f0
[ 97.414844] ret_from_fork_asm+0x11/0x20
[ 97.416285] </TASK>
[ 97.417134] irq event stamp: 13437323
[ 97.418376] hardirqs last enabled at (13437337): [<ffffffff8110bc0c>] console_unlock+0x11c/0x150
[ 97.421285] hardirqs last disabled at (13437370): [<ffffffff8110bbf1>] console_unlock+0x101/0x150
[ 97.423838] softirqs last enabled at (13437366): [<ffffffff8108e17f>] handle_softirqs+0x23f/0x2a0
[ 97.426450] softirqs last disabled at (13437393): [<ffffffff8108e346>] __irq_exit_rcu+0x66/0xd0
[ 97.428850] ---[ end trace 0000000000000000 ]---

Also, since the dynamic_event file cannot be cleaned up, ftracetest fails too.

To avoid these issues, build these tests only as modules.(CVE-2024-41004)

In the Linux kernel, the following vulnerability has been resolved:

netpoll: Fix race condition in netpoll_owner_active

KCSAN detected a race condition in netpoll:

BUG: KCSAN: data-race in net_rx_action / netpoll_send_skb
write (marked) to 0xffff8881164168b0 of 4 bytes by interrupt on cpu 10:
net_rx_action (./include/linux/netpoll.h:90 net/core/dev.c:6712 net/core/dev.c:6822)

<snip>
read to 0xffff8881164168b0 of 4 bytes by task 1 on cpu 2:
netpoll_send_skb (net/core/netpoll.c:319 net/core/netpoll.c:345 net/core/netpoll.c:393)
netpoll_send_udp (net/core/netpoll.c:?)
<snip>
value changed: 0x0000000a -> 0xffffffff

This happens because netpoll_owner_active() needs to check if the current CPU is the owner of the lock, touching napi->poll_owner non-atomically. The ->poll_owner field contains the current CPU holding the lock.

Use an atomic read to check if the poll owner is the current CPU.(CVE-2024-41005)
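A user-space analogue of the owner check can be sketched with C11 atomics. This is an illustration under stated assumptions, not the kernel code: `struct fake_napi` and the `fake_*` functions are hypothetical, and a relaxed atomic load stands in for whatever marked read the kernel fix uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of the field from the report: owner CPU id, or -1. */
struct fake_napi {
    atomic_int poll_owner;
};

/* Writer side: record which CPU currently holds the poll lock. */
void fake_poll_lock(struct fake_napi *n, int cpu)
{
    assert(cpu >= 0);
    atomic_store_explicit(&n->poll_owner, cpu, memory_order_relaxed);
}

/* Writer side: mark the lock as unowned again. */
void fake_poll_unlock(struct fake_napi *n)
{
    atomic_store_explicit(&n->poll_owner, -1, memory_order_relaxed);
}

/* Reader side: a single atomic load, so the value cannot be torn or re-read. */
bool fake_owner_active(struct fake_napi *n, int this_cpu)
{
    return atomic_load_explicit(&n->poll_owner, memory_order_relaxed) == this_cpu;
}
```

The value change in the KCSAN report (0x0000000a to 0xffffffff) corresponds in this model to CPU 10 releasing ownership, that is, the field going from 10 to -1 while another CPU reads it.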

In the Linux kernel, the following vulnerability has been resolved:

tcp: avoid too many retransmit packets

If a TCP socket is using TCP_USER_TIMEOUT, and the other peer retracted its window to zero, tcp_retransmit_timer() can retransmit a packet every two jiffies (2 ms for HZ=1000), for about 4 minutes after TCP_USER_TIMEOUT has 'expired'.

The fix is to make sure tcp_rtx_probe0_timed_out() takes icsk->icsk_user_timeout into account.

Before the blamed commit, the socket would not time out after icsk->icsk_user_timeout, but would use standard exponential backoff for the retransmits.

Also worth noting that before commit e89688e3e978 ("net: tcp: fix unexcepted socket die when snd_wnd is 0"), the issue would last 2 minutes instead of 4.(CVE-2024-41007)
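The effect of taking the user timeout into account can be sketched with a simplified model. Everything here is an assumption made for illustration: times are in milliseconds rather than jiffies, `FAKE_RTO_MAX_MS` is set to 120 seconds so that twice its value matches the "about 4 minutes" above, and `fake_probe0_timed_out` is a hypothetical function, not the kernel's tcp_rtx_probe0_timed_out().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FAKE_RTO_MAX_MS 120000u   /* assumed maximum RTO: 120 seconds */

static_assert(FAKE_RTO_MAX_MS * 2 == 240000u,
              "two maximum RTO periods are 4 minutes");

/*
 * rtx_delta_ms:    time since the retransmitted data was first sent
 * rcv_delta_ms:    time since anything was last received from the peer
 * user_timeout_ms: user-configured timeout, 0 when not set
 */
bool fake_probe0_timed_out(uint32_t rtx_delta_ms, uint32_t rcv_delta_ms,
                           uint32_t user_timeout_ms)
{
    uint32_t timeout = FAKE_RTO_MAX_MS * 2;

    if (user_timeout_ms) {
        /* Zero-window replies must not keep the socket alive forever. */
        if (rtx_delta_ms > user_timeout_ms)
            return true;
        if (user_timeout_ms < timeout)
            timeout = user_timeout_ms;
    }

    /* The peer was heard from recently enough: keep probing. */
    if (rcv_delta_ms <= timeout)
        return false;

    return rtx_delta_ms > timeout;
}
```

In this model a socket with a 10 second user timeout gives up once retransmissions have made no progress for more than 10 seconds, even if the peer keeps answering with a zero window; with no user timeout set, the 4 minute limit applies.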

In the Linux kernel, the following vulnerability has been resolved:

bpf: Fix overrunning reservations in ringbuf

The BPF ring buffer internally is implemented as a power-of-2 sized circular buffer, with two logical and ever-increasing counters: consumer_pos is the consumer counter showing up to which logical position the consumer has consumed the data, and producer_pos is the producer counter denoting the amount of data reserved by all producers.

Each time a record is reserved, the producer that "owns" the record will successfully advance the producer counter. In user space, each time a record is read, the consumer of the data advances the consumer counter once it has finished processing. Both counters are stored in separate pages so that from user space, the producer counter is read-only and the consumer counter is read-write.

One aspect that simplifies and thus speeds up the implementation of both producers and consumers is how the data area is mapped twice contiguously back-to-back in virtual memory. No special measures are then needed for samples that wrap around at the end of the circular buffer data area, because the next page after the last data page is the first data page again, and thus the sample will still appear completely contiguous in virtual memory.

Each record has a struct bpf_ringbuf_hdr { u32 len; u32 pg_off; } header for book-keeping the length and offset, and this header is inaccessible to the BPF program. Helpers like bpf_ringbuf_reserve() return (void *)hdr + BPF_RINGBUF_HDR_SZ for the BPF program to use. Bing-Jhong and Muhammad reported that it is however possible to make a second allocated memory chunk overlap with the first chunk, and as a result the BPF program is able to edit the first chunk's header.

For example, consider the creation of a BPF_MAP_TYPE_RINGBUF map with size of 0x4000. Next, the consumer_pos is modified to 0x3000 /before/ a call to bpf_ringbuf_reserve() is made. This will allocate a chunk A, which is in [0x0,0x3008], and the BPF program is able to edit [0x8,0x3008]. Now, let's allocate a chunk B with size 0x3000. This will succeed because consumer_pos was edited ahead of time to pass the new_prod_pos - cons_pos > rb->mask check. Chunk B will be in range [0x3008,0x6010], and the BPF program is able to edit [0x3010,0x6010]. Due to the ring buffer memory layout mentioned earlier, the ranges [0x0,0x4000] and [0x4000,0x8000] point to the same data pages. This means that chunk B at [0x4000,0x4008] is chunk A's header. bpf_ringbuf_submit() / bpf_ringbuf_discard() use the header's pg_off to then locate the bpf_ringbuf itself via bpf_ringbuf_restore_from_rec(). Once chunk B has modified chunk A's header, bpf_ringbuf_commit() refers to the wrong page and could cause a crash.

Fix it by calculating the oldest pending_pos and checking whether the range from the oldest outstanding record to the newest would span beyond the ring buffer size. If that is the case, then reject the request. We've tested with the ring buffer benchmark in BPF selftests (./benchs/run_bench_ringbufs.sh) before/after the fix and while it seems a bit slower on some benchmarks, it is still not significantly enough to matter.(CVE-2024-41009)
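The arithmetic of the example and of the added check can be replayed with a small model. This is a sketch under assumptions rather than the kernel implementation: `struct fake_ringbuf`, `struct fake_hdr`, and `fake_reserve` are hypothetical, record lengths are assumed to be rounded up to 8 bytes, and `pending_pos` stays at 0 because no record is ever committed in this model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mirror of the two-u32 record header described above. */
struct fake_hdr {
    uint32_t len;
    uint32_t pg_off;
};

static_assert(sizeof(struct fake_hdr) == 8, "header is 8 bytes");

struct fake_ringbuf {
    uint64_t mask;          /* data size - 1; the size is a power of two */
    uint64_t consumer_pos;  /* writable from user space */
    uint64_t producer_pos;  /* total bytes reserved so far */
    uint64_t pending_pos;   /* start of the oldest uncommitted record */
};

/* Round up to a multiple of 8 (assumed record alignment). */
uint64_t round_up8(uint64_t v)
{
    return (v + 7) & ~(uint64_t)7;
}

/* Try to reserve `size` payload bytes; optionally apply the pending check. */
bool fake_reserve(struct fake_ringbuf *rb, uint64_t size, bool check_pending)
{
    uint64_t len = round_up8(size + sizeof(struct fake_hdr));
    uint64_t new_prod_pos = rb->producer_pos + len;

    /* Original check: relies only on the user-writable consumer_pos. */
    if (new_prod_pos - rb->consumer_pos > rb->mask)
        return false;

    /* Added check: span from the oldest outstanding record to the newest. */
    if (check_pending && new_prod_pos - rb->pending_pos > rb->mask)
        return false;

    rb->producer_pos = new_prod_pos;
    return true;
}
```

With a 0x4000-byte buffer and `consumer_pos` preset to 0x3000, two reservations of 0x3000 bytes both pass the original check and leave `producer_pos` at 0x6010, which is the overlap described above. With the pending check enabled, the second reservation is rejected because 0x6010 - 0x0 exceeds the mask 0x3fff.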

Database specific
{
    "severity": "High"
}
References

Affected packages

openEuler:22.03-LTS-SP1 / kernel

Package

Name
kernel
Purl
pkg:rpm/openEuler/kernel&distro=openEuler-22.03-LTS-SP1

Affected ranges

Type
ECOSYSTEM
Events
Introduced
0 (Unknown introduced version / All previous versions are affected)
Fixed
5.10.0-136.86.0.167.oe2203sp1

Ecosystem specific

{
    "aarch64": [
        "kernel-5.10.0-136.86.0.167.oe2203sp1.aarch64.rpm",
        "kernel-debuginfo-5.10.0-136.86.0.167.oe2203sp1.aarch64.rpm",
        "kernel-debugsource-5.10.0-136.86.0.167.oe2203sp1.aarch64.rpm",
        "kernel-devel-5.10.0-136.86.0.167.oe2203sp1.aarch64.rpm",
        "kernel-headers-5.10.0-136.86.0.167.oe2203sp1.aarch64.rpm",
        "kernel-source-5.10.0-136.86.0.167.oe2203sp1.aarch64.rpm",
        "kernel-tools-5.10.0-136.86.0.167.oe2203sp1.aarch64.rpm",
        "kernel-tools-debuginfo-5.10.0-136.86.0.167.oe2203sp1.aarch64.rpm",
        "kernel-tools-devel-5.10.0-136.86.0.167.oe2203sp1.aarch64.rpm",
        "perf-5.10.0-136.86.0.167.oe2203sp1.aarch64.rpm",
        "perf-debuginfo-5.10.0-136.86.0.167.oe2203sp1.aarch64.rpm",
        "python3-perf-5.10.0-136.86.0.167.oe2203sp1.aarch64.rpm",
        "python3-perf-debuginfo-5.10.0-136.86.0.167.oe2203sp1.aarch64.rpm"
    ],
    "x86_64": [
        "kernel-5.10.0-136.86.0.167.oe2203sp1.x86_64.rpm",
        "kernel-debuginfo-5.10.0-136.86.0.167.oe2203sp1.x86_64.rpm",
        "kernel-debugsource-5.10.0-136.86.0.167.oe2203sp1.x86_64.rpm",
        "kernel-devel-5.10.0-136.86.0.167.oe2203sp1.x86_64.rpm",
        "kernel-headers-5.10.0-136.86.0.167.oe2203sp1.x86_64.rpm",
        "kernel-source-5.10.0-136.86.0.167.oe2203sp1.x86_64.rpm",
        "kernel-tools-5.10.0-136.86.0.167.oe2203sp1.x86_64.rpm",
        "kernel-tools-debuginfo-5.10.0-136.86.0.167.oe2203sp1.x86_64.rpm",
        "kernel-tools-devel-5.10.0-136.86.0.167.oe2203sp1.x86_64.rpm",
        "perf-5.10.0-136.86.0.167.oe2203sp1.x86_64.rpm",
        "perf-debuginfo-5.10.0-136.86.0.167.oe2203sp1.x86_64.rpm",
        "python3-perf-5.10.0-136.86.0.167.oe2203sp1.x86_64.rpm",
        "python3-perf-debuginfo-5.10.0-136.86.0.167.oe2203sp1.x86_64.rpm"
    ],
    "src": [
        "kernel-5.10.0-136.86.0.167.oe2203sp1.src.rpm"
    ]
}