The Linux Kernel, the operating system core itself.
Security Fix(es):
In the Linux kernel, the following vulnerability has been resolved:
ipvlan: Dont Use skb->sk in ipvlan_process_v{4,6}_outbound
A raw packet sent from a PF_PACKET socket on top of an IPv6-backed ipvlan device will hit WARN_ON_ONCE() in sk_mc_loop() through the sch_direct_xmit() path.
WARNING: CPU: 2 PID: 0 at net/core/sock.c:775 skmcloop+0x2d/0x70 Modules linked in: schnetem ipvlan rfkill cirrus drmshmemhelper sg drmkmshelper CPU: 2 PID: 0 Comm: swapper/2 Kdump: loaded Not tainted 6.9.0+ #279 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 RIP: 0010:skmcloop+0x2d/0x70 Code: fa 0f 1f 44 00 00 65 0f b7 15 f7 96 a3 4f 31 c0 66 85 d2 75 26 48 85 ff 74 1c RSP: 0018:ffffa9584015cd78 EFLAGS: 00010212 RAX: 0000000000000011 RBX: ffff91e585793e00 RCX: 0000000002c6a001 RDX: 0000000000000000 RSI: 0000000000000040 RDI: ffff91e589c0f000 RBP: ffff91e5855bd100 R08: 0000000000000000 R09: 3d00545216f43d00 R10: ffff91e584fdcc50 R11: 00000060dd8616f4 R12: ffff91e58132d000 R13: ffff91e584fdcc68 R14: ffff91e5869ce800 R15: ffff91e589c0f000 FS: 0000000000000000(0000) GS:ffff91e898100000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f788f7c44c0 CR3: 0000000008e1a000 CR4: 00000000000006f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <IRQ> ? _warn (kernel/panic.c:693) ? skmcloop (net/core/sock.c:760) ? reportbug (lib/bug.c:201 lib/bug.c:219) ? handlebug (arch/x86/kernel/traps.c:239) ? excinvalidop (arch/x86/kernel/traps.c:260 (discriminator 1)) ? asmexcinvalidop (./arch/x86/include/asm/idtentry.h:621) ? skmcloop (net/core/sock.c:760) ip6finishoutput2 (net/ipv6/ip6output.c:83 (discriminator 1)) ? nfhookslow (net/netfilter/core.c:626) ip6finishoutput (net/ipv6/ip6output.c:222) ? 
_pfxip6finishoutput (net/ipv6/ip6output.c:215) ipvlanxmitmodel3 (drivers/net/ipvlan/ipvlancore.c:602) ipvlan ipvlanstartxmit (drivers/net/ipvlan/ipvlanmain.c:226) ipvlan devhardstartxmit (net/core/dev.c:3594) schdirectxmit (net/sched/schgeneric.c:343) _qdiscrun (net/sched/schgeneric.c:416) nettxaction (net/core/dev.c:5286) handlesoftirqs (kernel/softirq.c:555) _irqexitrcu (kernel/softirq.c:589) sysvecapictimer_interrupt (arch/x86/kernel/apic/apic.c:1043)
The warning triggers as this:
packet_sendmsg
  packet_snd //skb->sk is packet sk
    __dev_queue_xmit
      __dev_xmit_skb //q->enqueue is not NULL
        __qdisc_run
          sch_direct_xmit
            dev_hard_start_xmit
              ipvlan_start_xmit
                ipvlan_xmit_mode_l3 //l3 mode
                  ipvlan_process_outbound //vepa flag
                    ipvlan_process_v6_outbound
                      ip6_local_out
                        __ip6_finish_output
                          ip6_finish_output2 //multicast packet
                            sk_mc_loop //sk->sk_family is AF_PACKET
Call ip{6}_local_out() with a NULL sk in ipvlan, as other tunnels do, to fix this.(CVE-2024-33621)
In the Linux kernel, the following vulnerability has been resolved:
f2fs: compress: don't allow unaligned truncation on released compress inode
f2fs image may be corrupted after below testcase:
- mkfs.f2fs -O extra_attr,compression -f /dev/vdb
- mount /dev/vdb /mnt/f2fs
- touch /mnt/f2fs/file
- f2fs_io setflags compression /mnt/f2fs/file
- dd if=/dev/zero of=/mnt/f2fs/file bs=4k count=4
- f2fs_io release_cblocks /mnt/f2fs/file
- truncate -s 8192 /mnt/f2fs/file
- umount /mnt/f2fs
- fsck.f2fs /dev/vdb
[ASSERT] (fsck_chk_inode_blk:1256) --> ino: 0x5 has i_blocks: 0x00000002, but has 0x3 blocks
[FSCK] valid_block_count matching with CP [Fail] [0x4, 0x5]
[FSCK] other corrupted bugs [Fail]
The reason is that partial truncation assumes the compressed inode has reserved blocks. After partial truncation, the valid block count may change without .i_blocks and .total_valid_block_count being updated, resulting in corruption.
This patch fixes the issue by allowing only cluster-size-aligned truncation on a released compress inode.(CVE-2024-33847)
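The alignment rule described above can be sketched in user-space C. This is a minimal illustration, not the f2fs patch itself: the function name check_truncate_size, the 4096-byte block size, and the 4-block cluster used in the usage note are assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed block size for this sketch; not taken from the advisory. */
#define SKETCH_BLOCK_BYTES 4096u

/*
 * Hypothetical helper: return 0 if truncating to new_size is acceptable,
 * -EINVAL otherwise. cluster_blocks is the compression cluster size in
 * blocks; released tells whether the inode's compressed blocks were released.
 */
static int check_truncate_size(uint64_t new_size, uint32_t cluster_blocks,
                               bool released)
{
    assert(cluster_blocks > 0);
    uint64_t cluster_bytes = (uint64_t)cluster_blocks * SKETCH_BLOCK_BYTES;

    /* Only a released inode is restricted to cluster-aligned sizes. */
    if (released && (new_size % cluster_bytes) != 0)
        return -EINVAL;
    return 0;
}
```

With a 4-block cluster (16384 bytes), the `truncate -s 8192` step from the test case would be rejected on a released inode, while truncating to 16384 or to 0 would be accepted.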
In the Linux kernel, the following vulnerability has been resolved:
dma-mapping: benchmark: fix node id validation
While validating node ids in map_benchmark_ioctl(), node_possible() may be provided with an invalid argument outside of the [0,MAX_NUMNODES-1] range, leading to:
BUG: KASAN: wild-memory-access in mapbenchmarkioctl (kernel/dma/mapbenchmark.c:214) Read of size 8 at addr 1fffffff8ccb6398 by task dmamapbenchma/971 CPU: 7 PID: 971 Comm: dmamapbenchma Not tainted 6.9.0-rc6 #37 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) Call Trace: <TASK> dumpstacklvl (lib/dumpstack.c:117) kasanreport (mm/kasan/report.c:603) kasancheckrange (mm/kasan/generic.c:189) variabletestbit (arch/x86/include/asm/bitops.h:227) [inline] archtestbit (arch/x86/include/asm/bitops.h:239) [inline] _testbit at (include/asm-generic/bitops/instrumented-non-atomic.h:142) [inline] nodestate (include/linux/nodemask.h:423) [inline] mapbenchmarkioctl (kernel/dma/mapbenchmark.c:214) fullproxyunlockedioctl (fs/debugfs/file.c:333) _x64sysioctl (fs/ioctl.c:890) dosyscall64 (arch/x86/entry/common.c:83) entrySYSCALL64afterhwframe (arch/x86/entry/entry_64.S:130)
Compare node ids with sane bounds first. NUMA_NO_NODE is considered a special valid case meaning that benchmarking kthreads won't be bound to a cpuset of a given node.
Found by Linux Verification Center (linuxtesting.org).(CVE-2024-34777)
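The "bounds first, bitmap second" ordering can be sketched as follows. This is a user-space illustration under stated assumptions: SKETCH_MAX_NODES and SKETCH_NO_NODE are hypothetical stand-ins for MAX_NUMNODES and NUMA_NO_NODE, and a single 64-bit word stands in for the kernel's node mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAX_NODES 64   /* hypothetical stand-in for MAX_NUMNODES */
#define SKETCH_NO_NODE  (-1)  /* hypothetical stand-in for NUMA_NO_NODE */

static_assert(SKETCH_MAX_NODES <= 64, "the mask is a single 64-bit word");

/* Return true if node may be used: either "no node" or an in-range set bit. */
static bool node_id_ok(int node, uint64_t possible_mask)
{
    if (node == SKETCH_NO_NODE)
        return true;                  /* special valid case: no binding */
    if (node < 0 || node >= SKETCH_MAX_NODES)
        return false;                 /* reject before touching the mask */
    return (possible_mask >> node) & 1u;
}
```

The point of the ordering is that the mask is indexed only after the range check has passed, so an out-of-range id never reaches the bit test.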
In the Linux kernel, the following vulnerability has been resolved:
netfilter: tproxy: bail out if IP has been disabled on the device
syzbot reports:
general protection fault, probably for non-canonical address 0xdffffc0000000003: 0000 [#1] PREEMPT SMP KASAN PTI
KASAN: null-ptr-deref in range [0x0000000000000018-0x000000000000001f]
[..]
RIP: 0010:nf_tproxy_laddr4+0xb7/0x340 net/ipv4/netfilter/nf_tproxy_ipv4.c:62
Call Trace:
nft_tproxy_eval_v4 net/netfilter/nft_tproxy.c:56 [inline]
nft_tproxy_eval+0xa9a/0x1a00 net/netfilter/nft_tproxy.c:168
__in_dev_get_rcu() can return NULL, so check for this.(CVE-2024-36270)
In the Linux kernel, the following vulnerability has been resolved:
net/mlx5: Use mlx5_ipsec_rx_status_destroy to correctly delete status rules
rx_create no longer allocates a modify_hdr instance that needs to be cleaned up. The mlx5_modify_header_dealloc call will lead to a NULL pointer dereference. A leak in the rules also previously occurred since there are now two rules populated related to status.
BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor read access in kernel mode #PF: errorcode(0x0000) - not-present page PGD 109907067 P4D 109907067 PUD 116890067 PMD 0 Oops: 0000 [#1] SMP CPU: 1 PID: 484 Comm: ip Not tainted 6.9.0-rc2-rrameshbabu+ #254 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Arch Linux 1.16.3-1-1 04/01/2014 RIP: 0010:mlx5modifyheaderdealloc+0xd/0x70 <snip> Call Trace: <TASK> ? showregs+0x60/0x70 ? _die+0x24/0x70 ? pagefaultoops+0x15f/0x430 ? freetopartiallist.constprop.0+0x79/0x150 ? douseraddrfault+0x2c9/0x5c0 ? excpagefault+0x63/0x110 ? asmexcpagefault+0x27/0x30 ? mlx5modifyheaderdealloc+0xd/0x70 rxcreate+0x374/0x590 rxaddrule+0x3ad/0x500 ? rxaddrule+0x3ad/0x500 ? mlx5cmdexec+0x2c/0x40 ? mlx5createipsecobj+0xd6/0x200 mlx5eaccelipsecfsaddrule+0x31/0xf0 mlx5exfrmaddstate+0x426/0xc00 <snip>(CVE-2024-36281)
In the Linux kernel, the following vulnerability has been resolved:
netfilter: nfnetlink_queue: acquire rcu_read_lock() in instance_destroy_rcu()
syzbot reported that nf_reinject() could be called without rcu_read_lock():
WARNING: suspicious RCU usage 6.9.0-rc7-syzkaller-02060-g5c1672705a1a #0 Not tainted
net/netfilter/nfnetlink_queue.c:263 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
rcuscheduleractive = 2, debuglocks = 1 2 locks held by syz-executor.4/13427: #0: ffffffff8e334f60 (rcucallback){....}-{0:0}, at: rculockacquire include/linux/rcupdate.h:329 [inline] #0: ffffffff8e334f60 (rcucallback){....}-{0:0}, at: rcudobatch kernel/rcu/tree.c:2190 [inline] #0: ffffffff8e334f60 (rcucallback){....}-{0:0}, at: rcucore+0xa86/0x1830 kernel/rcu/tree.c:2471 #1: ffff88801ca92958 (&inst->lock){+.-.}-{2:2}, at: spinlockbh include/linux/spinlock.h:356 [inline] #1: ffff88801ca92958 (&inst->lock){+.-.}-{2:2}, at: nfqnlflush net/netfilter/nfnetlinkqueue.c:405 [inline] #1: ffff88801ca92958 (&inst->lock){+.-.}-{2:2}, at: instancedestroyrcu+0x30/0x220 net/netfilter/nfnetlinkqueue.c:172
stack backtrace: CPU: 0 PID: 13427 Comm: syz-executor.4 Not tainted 6.9.0-rc7-syzkaller-02060-g5c1672705a1a #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/02/2024 Call Trace: <IRQ> _dumpstack lib/dumpstack.c:88 [inline] dumpstacklvl+0x241/0x360 lib/dumpstack.c:114 lockdeprcususpicious+0x221/0x340 kernel/locking/lockdep.c:6712 nfreinject net/netfilter/nfnetlinkqueue.c:323 [inline] nfqnlreinject+0x6ec/0x1120 net/netfilter/nfnetlinkqueue.c:397 nfqnlflush net/netfilter/nfnetlinkqueue.c:410 [inline] instancedestroyrcu+0x1ae/0x220 net/netfilter/nfnetlinkqueue.c:172 rcudobatch kernel/rcu/tree.c:2196 [inline] rcucore+0xafd/0x1830 kernel/rcu/tree.c:2471 handlesoftirqs+0x2d6/0x990 kernel/softirq.c:554 _dosoftirq kernel/softirq.c:588 [inline] invokesoftirq kernel/softirq.c:428 [inline] _irqexitrcu+0xf4/0x1c0 kernel/softirq.c:637 irqexitrcu+0x9/0x30 kernel/softirq.c:649 instrsysvecapictimerinterrupt arch/x86/kernel/apic/apic.c:1043 [inline] sysvecapictimerinterrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1043 </IRQ> <TASK>(CVE-2024-36286)
In the Linux kernel, the following vulnerability has been resolved:
net: relax socket state check at accept time.
Christoph reported the following splat:
WARNING: CPU: 1 PID: 772 at net/ipv4/afinet.c:761 inetaccept+0x1f4/0x4a0 Modules linked in: CPU: 1 PID: 772 Comm: syz-executor510 Not tainted 6.9.0-rc7-g7da7119fe22b #56 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014 RIP: 0010:inetaccept+0x1f4/0x4a0 net/ipv4/afinet.c:759 Code: 04 38 84 c0 0f 85 87 00 00 00 41 c7 04 24 03 00 00 00 48 83 c4 10 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc e8 ec b7 da fd <0f> 0b e9 7f fe ff ff e8 e0 b7 da fd 0f 0b e9 fe fe ff ff 89 d9 80 RSP: 0018:ffffc90000c2fc58 EFLAGS: 00010293 RAX: ffffffff836bdd14 RBX: 0000000000000000 RCX: ffff888104668000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: dffffc0000000000 R08: ffffffff836bdb89 R09: fffff52000185f64 R10: dffffc0000000000 R11: fffff52000185f64 R12: dffffc0000000000 R13: 1ffff92000185f98 R14: ffff88810754d880 R15: ffff8881007b7800 FS: 000000001c772880(0000) GS:ffff88811b280000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fb9fcf2e178 CR3: 00000001045d2002 CR4: 0000000000770ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <TASK> inetaccept+0x138/0x1d0 net/ipv4/afinet.c:786 doaccept+0x435/0x620 net/socket.c:1929 _sysaccept4file net/socket.c:1969 [inline] _sysaccept4+0x9b/0x110 net/socket.c:1999 _dosysaccept net/socket.c:2016 [inline] _sesysaccept net/socket.c:2013 [inline] _x64sysaccept+0x7d/0x90 net/socket.c:2013 dosyscallx64 arch/x86/entry/common.c:52 [inline] dosyscall64+0x58/0x100 arch/x86/entry/common.c:83 entrySYSCALL64afterhwframe+0x76/0x7e RIP: 0033:0x4315f9 Code: fd ff 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 ab b4 fd ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007ffdb26d9c78 EFLAGS: 00000246 ORIGRAX: 000000000000002b RAX: ffffffffffffffda RBX: 0000000000400300 
RCX: 00000000004315f9 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000004 RBP: 00000000006e1018 R08: 0000000000400300 R09: 0000000000400300 R10: 0000000000400300 R11: 0000000000000246 R12: 0000000000000000 R13: 000000000040cdf0 R14: 000000000040ce80 R15: 0000000000000055 </TASK>
The reproducer invokes shutdown() before entering the listener status. After commit 94062790aedb ("tcp: defer shutdown(SEND_SHUTDOWN) for TCP_SYN_RECV sockets"), the above causes the child to reach the accept syscall in FIN_WAIT1 status.
Eric noted we can relax the existing assertion in __inet_accept().(CVE-2024-36484)
In the Linux kernel, the following vulnerability has been resolved:
blk-iocost: do not WARN if iocg was already offlined
In iocg_pay_debt(), a warning is triggered if 'active_list' is empty, which is intended to confirm that the iocg is active when it has debt. However, the warning can be triggered during a blkcg or disk removal if iocg_waitq_timer_fn() runs at that time:
WARNING: CPU: 0 PID: 2344971 at block/blk-iocost.c:1402 iocg_pay_debt+0x14c/0x190
Call trace:
iocg_pay_debt+0x14c/0x190
iocg_kick_waitq+0x438/0x4c0
iocg_waitq_timer_fn+0xd8/0x130
__run_hrtimer+0x144/0x45c
__hrtimer_run_queues+0x16c/0x244
hrtimer_interrupt+0x2cc/0x7b0
The warning in this situation is meaningless. Since this iocg is being removed, the state of the 'active_list' is irrelevant, and 'waitq_timer' is canceled after removing 'active_list' in ioc_pd_free(), which ensures the iocg is freed after iocg_waitq_timer_fn() returns.
Therefore, add a check for whether the iocg was already offlined to avoid the warning when removing a blkcg or disk.(CVE-2024-36908)
In the Linux kernel, the following vulnerability has been resolved:
drm/amd/display: Skip on writeback when it's not applicable
[WHY] dynamic memory safety error detector (KASAN) catches and generates error messages "BUG: KASAN: slab-out-of-bounds" as writeback connector does not support certain features which are not initialized.
[HOW] Skip them when connector type is DRM_MODE_CONNECTOR_WRITEBACK.(CVE-2024-36914)
In the Linux kernel, the following vulnerability has been resolved:
nsh: Restore skb->{protocol,data,mac_header} for outer header in nsh_gso_segment().
syzbot triggered various splats (see [0] and links) by a crafted GSO packet of VIRTIO_NET_HDR_GSO_UDP layering the following protocols:
ETH_P_8021AD + ETH_P_NSH + ETH_P_IPV6 + IPPROTO_UDP
NSH can encapsulate IPv4, IPv6, Ethernet, NSH, and MPLS. As the inner protocol can be Ethernet, NSH GSO handler, nsh_gso_segment(), calls skb_mac_gso_segment() to invoke inner protocol GSO handlers.
nsh_gso_segment() does the following for the original skb before calling skb_mac_gso_segment()
and does the following for the segmented skb
There are two problems in 6-7 and 8-9.
(a) After 6 & 7, skb->data points to the NSH header, so the outer header (ETH_P_8021AD in this case) is stripped when skb is sent out of netdev.
Also, if NSH is encapsulated by NSH + Ethernet (so NSH-Ethernet-NSH), skb_pull() in the first nsh_gso_segment() will make skb->data point to the middle of the outer NSH or Ethernet header because the Ethernet header is not pulled by the second nsh_gso_segment().
(b) While restoring skb->{mac_header,network_header} in 8 & 9, nsh_gso_segment() does not assume that the data in the linear buffer is shifted.
However, udp6_ufo_fragment() could shift the data and change skb->mac_header accordingly as demonstrated by syzbot.
If this happens, even the restored skb->mac_header points to the middle of the outer header.
It seems nsh_gso_segment() has never worked with outer headers so far.
At the end of nsh_gso_segment(), the outer header must be restored for the segmented skb, instead of the NSH header.
To do that, let's calculate the outer header position relatively from the inner header and set skb->{data,mac_header,protocol} properly.
BUG: KMSAN: uninit-value in ipvlanxmitmodel3 drivers/net/ipvlan/ipvlancore.c:602 [inline] BUG: KMSAN: uninit-value in ipvlanqueuexmit+0xf44/0x16b0 drivers/net/ipvlan/ipvlancore.c:668 ipvlanprocessoutbound drivers/net/ipvlan/ipvlancore.c:524 [inline] ipvlanxmitmodel3 drivers/net/ipvlan/ipvlancore.c:602 [inline] ipvlanqueuexmit+0xf44/0x16b0 drivers/net/ipvlan/ipvlancore.c:668 ipvlanstartxmit+0x5c/0x1a0 drivers/net/ipvlan/ipvlanmain.c:222 _netdevstartxmit include/linux/netdevice.h:4989 [inline] netdevstartxmit include/linux/netdevice.h:5003 [inline] xmitone net/core/dev.c:3547 [inline] devhardstartxmit+0x244/0xa10 net/core/dev.c:3563 _devqueuexmit+0x33ed/0x51c0 net/core/dev.c:4351 devqueuexmit include/linux/netdevice.h:3171 [inline] packetxmit+0x9c/0x6b0 net/packet/afpacket.c:276 packetsnd net/packet/afpacket.c:3081 [inline] packetsendmsg+0x8aef/0x9f10 net/packet/afpacket.c:3113 socksendmsgnosec net/socket.c:730 [inline] _socksendmsg net/socket.c:745 [inline] _syssendto+0x735/0xa10 net/socket.c:2191 _dosyssendto net/socket.c:2203 [inline] _sesyssendto net/socket.c:2199 [inline] _x64syssendto+0x125/0x1c0 net/socket.c:2199 dosyscallx64 arch/x86/entry/common.c:52 [inline] dosyscall64+0xcf/0x1e0 arch/x86/entry/common.c:83 entrySYSCALL64after_hwframe+0x63/0x6b
Uninit was created at: slabpostallochook mm/slub.c:3819 [inline] slaballocnode mm/slub.c:3860 [inline] _dokmallocnode mm/slub.c:3980 [inline] _kmallocnodetrackcaller+0x705/0x1000 mm/slub.c:4001 kmallocreserve+0x249/0x4a0 net/core/skbuff.c:582 _ ---truncated---(CVE-2024-36933)
In the Linux kernel, the following vulnerability has been resolved:
bpf, skmsg: Fix NULL pointer dereference in sk_psock_skb_ingress_enqueue
Fix NULL pointer data-races in sk_psock_skb_ingress_enqueue() which syzbot reported [1].
[1] BUG: KCSAN: data-race in sk_psock_drop / sk_psock_skb_ingress_enqueue
write to 0xffff88814b3278b8 of 8 bytes by task 10724 on cpu 1: skpsockstopverdict net/core/skmsg.c:1257 [inline] skpsockdrop+0x13e/0x1f0 net/core/skmsg.c:843 skpsockput include/linux/skmsg.h:459 [inline] sockmapclose+0x1a7/0x260 net/core/sockmap.c:1648 unixrelease+0x4b/0x80 net/unix/afunix.c:1048 _sockrelease net/socket.c:659 [inline] sockclose+0x68/0x150 net/socket.c:1421 _fput+0x2c1/0x660 fs/filetable.c:422 _fputsync+0x44/0x60 fs/filetable.c:507 _dosysclose fs/open.c:1556 [inline] _sesysclose+0x101/0x1b0 fs/open.c:1541 _x64sysclose+0x1f/0x30 fs/open.c:1541 dosyscall64+0xd3/0x1d0 entrySYSCALL64after_hwframe+0x6d/0x75
read to 0xffff88814b3278b8 of 8 bytes by task 10713 on cpu 0: skpsockdataready include/linux/skmsg.h:464 [inline] skpsockskbingressenqueue+0x32d/0x390 net/core/skmsg.c:555 skpsockskbingressself+0x185/0x1e0 net/core/skmsg.c:606 skpsockverdictapply net/core/skmsg.c:1008 [inline] skpsockverdictrecv+0x3e4/0x4a0 net/core/skmsg.c:1202 unixreadskb net/unix/afunix.c:2546 [inline] unixstreamreadskb+0x9e/0xf0 net/unix/afunix.c:2682 skpsockverdictdataready+0x77/0x220 net/core/skmsg.c:1223 unixstreamsendmsg+0x527/0x860 net/unix/afunix.c:2339 socksendmsgnosec net/socket.c:730 [inline] socksendmsg+0x140/0x180 net/socket.c:745 syssendmsg+0x312/0x410 net/socket.c:2584 _syssendmsg net/socket.c:2638 [inline] _syssendmsg+0x1e9/0x280 net/socket.c:2667 _dosyssendmsg net/socket.c:2676 [inline] _sesyssendmsg net/socket.c:2674 [inline] _x64syssendmsg+0x46/0x50 net/socket.c:2674 dosyscall64+0xd3/0x1d0 entrySYSCALL64after_hwframe+0x6d/0x75
value changed: 0xffffffff83d7feb0 -> 0x0000000000000000
Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 10713 Comm: syz-executor.4 Tainted: G W 6.8.0-syzkaller-08951-gfe46a7dd189e #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/29/2024
Prior to this, commit 4cd12c6065df ("bpf, sockmap: Fix NULL pointer dereference in sk_psock_verdict_data_ready()") fixed a similar NULL pointer dereference caused by saved_data_ready being unprotected. This is a different caller hitting the same issue for the same reason. It should therefore be protected with the sk_callback_lock read lock, because the writer side in sk_psock_drop() uses "write_lock_bh(&sk->sk_callback_lock);".
To avoid errors that could happen in the future, those two lock/unlock pairs are moved into sk_psock_data_ready(), as suggested by John Fastabend.(CVE-2024-36938)
In the Linux kernel, the following vulnerability has been resolved:
qibfs: fix dentry leak
simple_recursive_removal() drops the pinning references to all positives in subtree. For the cases when its argument has been kept alive by the pinning alone that's exactly the right thing to do, but here the argument comes from dcache lookup, that needs to be balanced by explicit dput().
Fucked-up-by: Al Viro <viro@zeniv.linux.org.uk>(CVE-2024-36947)
In the Linux kernel, the following vulnerability has been resolved:
pinctrl: devicetree: fix refcount leak in pinctrl_dt_to_map()
If we fail to allocate the propname buffer, we need to drop the reference count we just took. Because pinctrl_dt_free_maps() includes the dropping operation, we call it directly here.(CVE-2024-36959)
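The error-path rule (drop the reference you just took when a later allocation fails) can be sketched in user-space C. This is an analogy rather than the pinctrl code: every name here (sketch_node, sketch_build_map, sketch_failing_alloc) is hypothetical, and the allocator is passed in as a function pointer only so that the failure path can be exercised.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical reference-counted object standing in for a device node. */
struct sketch_node { int refcount; };

typedef void *(*sketch_alloc_fn)(size_t);

static void sketch_node_get(struct sketch_node *n) { n->refcount++; }

static void sketch_node_put(struct sketch_node *n)
{
    assert(n->refcount > 0);
    n->refcount--;
}

/* Allocator that always fails, used to exercise the error path. */
static void *sketch_failing_alloc(size_t size) { (void)size; return NULL; }

/* Take a reference, allocate a scratch buffer, and release both on every path. */
static int sketch_build_map(struct sketch_node *n, sketch_alloc_fn alloc)
{
    sketch_node_get(n);

    char *propname = alloc(32);
    if (!propname) {
        sketch_node_put(n);   /* without this put, the reference leaks */
        return -ENOMEM;
    }

    /* ... the buffer would be used here ... */
    free(propname);
    sketch_node_put(n);
    return 0;
}
```

Whether the allocation succeeds or fails, the caller observes the same reference count before and after the call.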
In the Linux kernel, the following vulnerability has been resolved:
drm/msm/a6xx: Avoid a nullptr dereference when speedbin setting fails
Calling a6xx_destroy() before adreno_gpu_init() leads to a null pointer dereference on:
msm_gpu_cleanup() : platform_set_drvdata(gpu->pdev, NULL);
as gpu->pdev is only assigned in:
a6xx_gpu_init()
|_ adreno_gpu_init
    |_ msm_gpu_init()
Instead of relying on handwavy null checks down the cleanup chain, explicitly de-allocate the LLC data and free a6xx_gpu instead.
Patchwork: https://patchwork.freedesktop.org/patch/588919/ (CVE-2024-38390)
In the Linux kernel, the following vulnerability has been resolved:
RDMA/cma: Fix kmemleak in rdma_core observed during blktests nvme/rdma use siw
When running blktests nvme/rdma, the following kmemleak issue will appear.
kmemleak: Kernel memory leak detector initialized (mempool available:36041) kmemleak: Automatic memory scanning thread started kmemleak: 2 new suspected memory leaks (see /sys/kernel/debug/kmemleak) kmemleak: 8 new suspected memory leaks (see /sys/kernel/debug/kmemleak) kmemleak: 17 new suspected memory leaks (see /sys/kernel/debug/kmemleak) kmemleak: 4 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
unreferenced object 0xffff88855da53400 (size 192): comm "rdma", pid 10630, jiffies 4296575922 hex dump (first 32 bytes): 37 00 00 00 00 00 00 00 c0 ff ff ff 1f 00 00 00 7............... 10 34 a5 5d 85 88 ff ff 10 34 a5 5d 85 88 ff ff .4.].....4.].... backtrace (crc 47f66721): [<ffffffff911251bd>] kmalloctrace+0x30d/0x3b0 [<ffffffffc2640ff7>] allocgidentry+0x47/0x380 [ibcore] [<ffffffffc2642206>] addmodifygid+0x166/0x930 [ibcore] [<ffffffffc2643468>] ibcacheupdate.part.0+0x6d8/0x910 [ibcore] [<ffffffffc2644e1a>] ibcachesetupone+0x24a/0x350 [ibcore] [<ffffffffc263949e>] ibregisterdevice+0x9e/0x3a0 [ibcore] [<ffffffffc2a3d389>] 0xffffffffc2a3d389 [<ffffffffc2688cd8>] nldevnewlink+0x2b8/0x520 [ibcore] [<ffffffffc2645fe3>] rdmanlrcvmsg+0x2c3/0x520 [ibcore] [<ffffffffc264648c>] rdmanlrcvskb.constprop.0.isra.0+0x23c/0x3a0 [ibcore] [<ffffffff9270e7b5>] netlinkunicast+0x445/0x710 [<ffffffff9270f1f1>] netlinksendmsg+0x761/0xc40 [<ffffffff9249db29>] _syssendto+0x3a9/0x420 [<ffffffff9249dc8c>] _x64syssendto+0xdc/0x1b0 [<ffffffff92db0ad3>] dosyscall64+0x93/0x180 [<ffffffff92e00126>] entrySYSCALL64afterhwframe+0x71/0x79
The root cause: rdma_put_gid_attr is not called when sgid_attr is set to ERR_PTR(-ENODEV).(CVE-2024-38539)
In the Linux kernel, the following vulnerability has been resolved:
lib/test_hmm.c: handle src_pfns and dst_pfns allocation failure
The kcalloc() in dmirror_device_evict_chunk() will return NULL if physical memory has run out. As a result, if src_pfns or dst_pfns is dereferenced, a NULL pointer dereference will occur.
Moreover, the device is going away. If the kcalloc() fails, the pages mapping a chunk cannot be evicted. So add a __GFP_NOFAIL flag in kcalloc().
Finally, as there is no need to have physically contiguous memory, switch kcalloc() to kvcalloc() in order to avoid failing allocations.(CVE-2024-38543)
In the Linux kernel, the following vulnerability has been resolved:
RDMA/rxe: Fix seg fault in rxe_comp_queue_pkt
In rxe_comp_queue_pkt() an incoming response packet skb is enqueued to the resp_pkts queue and then a decision is made whether to run the completer task inline or schedule it. Finally the skb is dereferenced to bump a 'hw' performance counter. This is wrong because if the completer task is already running in a separate thread it may have already processed the skb and freed it, which can cause a seg fault. This has been observed infrequently in testing at high scale.
This patch fixes this by deferring the enqueue of the packet until after the counter is accessed.(CVE-2024-38544)
In the Linux kernel, the following vulnerability has been resolved:
drm: vc4: Fix possible null pointer dereference
In vc4_hdmi_audio_init() of_get_address() may return NULL which is later dereferenced. Fix this bug by adding NULL check.
Found by Linux Verification Center (linuxtesting.org) with SVACE.(CVE-2024-38546)
In the Linux kernel, the following vulnerability has been resolved:
ASoC: kirkwood: Fix potential NULL dereference
In kirkwood_dma_hw_params() mv_mbus_dram_info() returns NULL if CONFIG_PLAT_ORION macro is not defined. Fix this bug by adding NULL check.
Found by Linux Verification Center (linuxtesting.org) with SVACE.(CVE-2024-38550)
In the Linux kernel, the following vulnerability has been resolved:
net/mlx5: Reload only IB representors upon lag disable/enable
On lag disable, the bond IB device along with all of its representors are destroyed, and then the slaves' representors get reloaded.
In case the slave IB representor load fails, the eswitch error flow unloads all representors, including ethernet representors, where the netdevs get detached and removed from lag bond. Such flow is inaccurate as the lag driver is not responsible for loading/unloading ethernet representors. Furthermore, the flow described above begins by holding lag lock to prevent bond changes during disable flow. However, when reaching the ethernet representors detachment from lag, the lag lock is required again, triggering the following deadlock:
Call trace: _switchto+0xf4/0x148 _schedule+0x2c8/0x7d0 schedule+0x50/0xe0 schedulepreemptdisabled+0x18/0x28 _mutexlock.isra.13+0x2b8/0x570 _mutexlockslowpath+0x1c/0x28 mutexlock+0x4c/0x68 mlx5lagremovenetdev+0x3c/0x1a0 [mlx5core] mlx5euplinkrepdisable+0x70/0xa0 [mlx5core] mlx5edetachnetdev+0x6c/0xb0 [mlx5core] mlx5enetdevchangeprofile+0x44/0x138 [mlx5core] mlx5enetdevattachnicprofile+0x28/0x38 [mlx5core] mlx5evportrepunload+0x184/0x1b8 [mlx5core] mlx5eswoffloadsrepload+0xd8/0xe0 [mlx5core] mlx5eswitchreloadreps+0x74/0xd0 [mlx5core] mlx5disablelag+0x130/0x138 [mlx5core] mlx5lagdisablechange+0x6c/0x70 [mlx5core] // hold ldev->lock mlx5devlinkeswitchmodeset+0xc0/0x410 [mlx5core] devlinknlcmdeswitchsetdoit+0xdc/0x180 genlfamilyrcvmsgdoit.isra.17+0xe8/0x138 genlrcvmsg+0xe4/0x220 netlinkrcvskb+0x44/0x108 genlrcv+0x40/0x58 netlinkunicast+0x198/0x268 netlinksendmsg+0x1d4/0x418 socksendmsg+0x54/0x60 _syssendto+0xf4/0x120 _arm64syssendto+0x30/0x40 el0svccommon+0x8c/0x120 doel0svc+0x30/0xa0 el0svc+0x20/0x30 el0synchandler+0x90/0xb8 el0sync+0x160/0x180
Thus, upon lag enable/disable, load and unload only the IB representors of the slaves preventing the deadlock mentioned above.
While at it, refactor the mlx5_esw_offloads_rep_load() function to have a static helper method for its internal logic, in symmetry with the representor unload design.(CVE-2024-38557)
In the Linux kernel, the following vulnerability has been resolved:
scsi: bfa: Ensure the copied buf is NUL terminated
Currently, we allocate an nbytes-sized kernel buffer and copy nbytes from userspace to that buffer. Later, we use sscanf on this buffer, but we don't ensure that the string is terminated inside the buffer, which can lead to an OOB read when using sscanf. Fix this issue by using memdup_user_nul instead of memdup_user.(CVE-2024-38560)
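A user-space analogue of the fix is sketched below: copy exactly n bytes into a buffer of n + 1 bytes and append a NUL before handing the buffer to sscanf(). The helper names dup_with_nul and parse_int_field are hypothetical and only mirror the idea behind memdup_user_nul; they are not kernel APIs.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Copy n bytes from src and append a terminating NUL. Caller frees. */
static char *dup_with_nul(const char *src, size_t n)
{
    assert(src != NULL);
    if (n == SIZE_MAX)
        return NULL;              /* n + 1 would overflow */

    char *buf = malloc(n + 1);
    if (!buf)
        return NULL;
    memcpy(buf, src, n);
    buf[n] = '\0';                /* sscanf() can no longer run past the end */
    return buf;
}

/* Parse a decimal integer from the first n bytes of src; 0 on success. */
static int parse_int_field(const char *src, size_t n, int *out)
{
    char *buf = dup_with_nul(src, n);
    if (!buf)
        return -1;

    int matched = sscanf(buf, "%d", out);
    free(buf);
    return matched == 1 ? 0 : -1;
}
```

The input does not have to be NUL-terminated: only the first n bytes are read from it, and parsing stops at the terminator added by the copy.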
In the Linux kernel, the following vulnerability has been resolved:
kunit: Fix kthread reference
There is a race condition when a kthread finishes after the deadline and before the call to kthread_stop(), which may lead to use after free.(CVE-2024-38561)
In the Linux kernel, the following vulnerability has been resolved:
wifi: ar5523: enable proper endpoint verification
Syzkaller reports [1] hitting a warning about an endpoint in use not having an expected type to it.
Fix the issue by checking for the existence of all proper endpoints with their according types intact.
Sadly, this patch has not been tested on real hardware.
[1] Syzkaller report: ------------[ cut here ]------------ usb 1-1: BOGUS urb xfer, pipe 3 != type 1 WARNING: CPU: 0 PID: 3643 at drivers/usb/core/urb.c:504 usbsubmiturb+0xed6/0x1880 drivers/usb/core/urb.c:504 ... Call Trace: <TASK> ar5523cmd+0x41b/0x780 drivers/net/wireless/ath/ar5523/ar5523.c:275 ar5523cmdread drivers/net/wireless/ath/ar5523/ar5523.c:302 [inline] ar5523hostavailable drivers/net/wireless/ath/ar5523/ar5523.c:1376 [inline] ar5523probe+0x14b0/0x1d10 drivers/net/wireless/ath/ar5523/ar5523.c:1655 usbprobeinterface+0x30f/0x7f0 drivers/usb/core/driver.c:396 calldriverprobe drivers/base/dd.c:560 [inline] reallyprobe+0x249/0xb90 drivers/base/dd.c:639 _driverprobedevice+0x1df/0x4d0 drivers/base/dd.c:778 driverprobedevice+0x4c/0x1a0 drivers/base/dd.c:808 _deviceattachdriver+0x1d4/0x2e0 drivers/base/dd.c:936 busforeachdrv+0x163/0x1e0 drivers/base/bus.c:427 _deviceattach+0x1e4/0x530 drivers/base/dd.c:1008 busprobedevice+0x1e8/0x2a0 drivers/base/bus.c:487 deviceadd+0xbd9/0x1e90 drivers/base/core.c:3517 usbsetconfiguration+0x101d/0x1900 drivers/usb/core/message.c:2170 usbgenericdriverprobe+0xbe/0x100 drivers/usb/core/generic.c:238 usbprobedevice+0xd8/0x2c0 drivers/usb/core/driver.c:293 calldriverprobe drivers/base/dd.c:560 [inline] reallyprobe+0x249/0xb90 drivers/base/dd.c:639 _driverprobedevice+0x1df/0x4d0 drivers/base/dd.c:778 driverprobedevice+0x4c/0x1a0 drivers/base/dd.c:808 _deviceattachdriver+0x1d4/0x2e0 drivers/base/dd.c:936 busforeachdrv+0x163/0x1e0 drivers/base/bus.c:427 _deviceattach+0x1e4/0x530 drivers/base/dd.c:1008 busprobedevice+0x1e8/0x2a0 drivers/base/bus.c:487 deviceadd+0xbd9/0x1e90 drivers/base/core.c:3517 usbnewdevice.cold+0x685/0x10ad drivers/usb/core/hub.c:2573 hubportconnect drivers/usb/core/hub.c:5353 [inline] hubportconnectchange drivers/usb/core/hub.c:5497 [inline] portevent drivers/usb/core/hub.c:5653 [inline] hubevent+0x26cb/0x45d0 drivers/usb/core/hub.c:5735 processonework+0x9bf/0x1710 kernel/workqueue.c:2289 workerthread+0x669/0x1090 
kernel/workqueue.c:2436 kthread+0x2e8/0x3a0 kernel/kthread.c:376 retfromfork+0x1f/0x30 arch/x86/entry/entry64.S:306 </TASK>(CVE-2024-38565)
In the Linux kernel, the following vulnerability has been resolved:
bpf: Fix verifier assumptions about socket->sk
The verifier assumes that 'sk' field in 'struct socket' is valid and non-NULL when 'socket' pointer itself is trusted and non-NULL. That may not be the case when socket was just created and passed to LSM socket_accept hook. Fix this verifier assumption and adjust tests.(CVE-2024-38566)
In the Linux kernel, the following vulnerability has been resolved:
epoll: be better about file lifetimes
epoll can call out to vfs_poll() with a file pointer that may race with the last 'fput()'. That would make f_count go down to zero, and while the ep->mtx locking means that the resulting file pointer tear-down will be blocked until the poll returns, it means that f_count is already dead, and any use of it won't actually get a reference to the file any more: it's dead regardless.
Make sure we have a valid ref on the file pointer before we call down to vfs_poll() from the epoll routines.(CVE-2024-38580)
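The idea of taking a reference only while the count is still non-zero can be sketched with C11 atomics. This is a generic illustration, not the epoll code: ref_get_unless_zero is a hypothetical name, and the counter stands in for a file's reference count.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Try to take a reference. Succeeds only if the count is still non-zero,
 * so an object whose last reference has already been dropped is never
 * "revived".
 */
static bool ref_get_unless_zero(atomic_long *count)
{
    assert(count != NULL);

    long old = atomic_load(count);
    while (old != 0) {
        /* On failure, old is reloaded with the current value and we retry. */
        if (atomic_compare_exchange_weak(count, &old, old + 1))
            return true;
    }
    return false;   /* the object is already dead; the caller must skip it */
}
```

A caller would poll the object only when this returns true, and drop the extra reference afterwards.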
In the Linux kernel, the following vulnerability has been resolved:
net: micrel: Fix receiving the timestamp in the frame for lan8841
The blamed commit started to use the ptp workqueue to get the second part of the timestamp, and this workqueue is stopped when the port is set down. However, if the config option NETWORK_PHY_TIMESTAMPING is not enabled, the ptp_clock is not initialized, so the driver crashes when it tries to access the delayed work. In effect, bringing the port up and then down triggers the crash. The fix checks whether the ptp_clock is initialized and only then cancels the delayed work.(CVE-2024-38593)
In the Linux kernel, the following vulnerability has been resolved:
eth: sungem: remove .ndo_poll_controller to avoid deadlocks
Erhard reports netpoll warnings from sungem:
netpoll_send_skb_on_dev(): eth0 enabled interrupts in poll (gem_start_xmit+0x0/0x398)
WARNING: CPU: 1 PID: 1 at net/core/netpoll.c:370 netpoll_send_skb+0x1fc/0x20c
gem_poll_controller() disables interrupts, which may sleep. We can't sleep in netpoll, it has interrupts disabled completely. Strangely, gem_poll_controller() doesn't even poll the completions, and instead acts as if an interrupt has fired so it just schedules NAPI and exits. None of this has been necessary for years, since netpoll invokes NAPI directly.(CVE-2024-38597)
In the Linux kernel, the following vulnerability has been resolved:
media: i2c: et8ek8: Don't strip remove function when driver is builtin
Using __exit for the remove function results in the remove callback being discarded with CONFIG_VIDEO_ET8EK8=y. When such a device gets unbound (e.g. using sysfs or hotplug), the driver is just removed without the cleanup being performed. This results in resource leaks. Fix it by compiling in the remove callback unconditionally.
This also fixes a W=1 modpost warning:
WARNING: modpost: drivers/media/i2c/et8ek8/et8ek8: section mismatch in reference: et8ek8_i2c_driver+0x10 (section: .data) -> et8ek8_remove (section: .exit.text)(CVE-2024-38611)
In the Linux kernel, the following vulnerability has been resolved:
m68k: Fix spinlock race in kernel thread creation
Context switching does take care to retain the correct lock owner across the switch from 'prev' to 'next' tasks. This does rely on interrupts remaining disabled for the entire duration of the switch.
This condition is guaranteed for normal process creation and context switching between already running processes, because both 'prev' and 'next' already have interrupts disabled in their saved copies of the status register.
The situation is different for newly created kernel threads. The status register is set to PS_S in copy_thread(), which does leave the IPL at 0. Upon restoring the 'next' thread's status register in switch_to() aka resume(), interrupts then become enabled prematurely. resume() then returns via ret_from_kernel_thread() and schedule_tail() where run queue lock is released (see finish_task_switch() and finish_lock_switch()).
A timer interrupt calling scheduler_tick() before the lock is released in finish_task_switch() will find the lock already taken, with the current task as lock owner. This causes a spinlock recursion warning as reported by Guenter Roeck.
As far as I can ascertain, this race has been opened in commit 533e6903bea0 ("m68k: split ret_from_fork(), simplify kernel_thread()") but I haven't done a detailed study of kernel history so it may well predate that commit.
Interrupts cannot be disabled in the saved status register copy for kernel threads (init will complain about interrupts disabled when finally starting user space). Disable interrupts temporarily when switching the tasks' register sets in resume().
Note that a simple oriw 0x700,%sr after restoring sr is not enough here - this leaves enough of a race for the 'spinlock recursion' warning to still be observed.
Tested on ARAnyM and qemu (Quadra 800 emulation).(CVE-2024-38613)
In the Linux kernel, the following vulnerability has been resolved:
wifi: carl9170: re-fix fortified-memset warning
The carl9170_tx_release() function sometimes triggers a fortified-memset warning in my randconfig builds:
In file included from include/linux/string.h:254, from drivers/net/wireless/ath/carl9170/tx.c:40: In function 'fortify_memset_chk', inlined from 'carl9170_tx_release' at drivers/net/wireless/ath/carl9170/tx.c:283:2, inlined from 'kref_put' at include/linux/kref.h:65:3, inlined from 'carl9170_tx_put_skb' at drivers/net/wireless/ath/carl9170/tx.c:342:9: include/linux/fortify-string.h:493:25: error: call to '__write_overflow_field' declared with attribute warning: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Werror=attribute-warning] 493 | __write_overflow_field(p_size_field, size);
Kees previously tried to avoid this by using memset_after(), but it seems this does not fully address the problem. I noticed that the memset_after() here is done on a different part of the union (status) than the original cast was from (rate_driver_data), which may confuse the compiler.
Unfortunately, the memset_after() trick does not work on driver_rates[] because that is part of an anonymous struct, and I could not get struct_group() to do this either. Using two separate memset() calls on the two members does address the warning though.(CVE-2024-38616)
In the Linux kernel, the following vulnerability has been resolved:
stm class: Fix a double free in stm_register_device()
The put_device(&stm->dev) call will trigger stm_device_release() which frees "stm" so the vfree(stm) on the next line is a double free.(CVE-2024-38627)
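Why the extra free is wrong can be seen in a small userspace model where dropping the last reference already frees the object through a release callback. This is a sketch under simplifying assumptions (malloc/free instead of vmalloc/vfree, a plain integer refcount); all names with the _model suffix are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

struct dev_model {
    int refcount;
    void (*release)(struct dev_model *dev);
};

int release_calls_model; /* counts how often the release callback ran */

/* Release callback: this is the only place that frees the object. */
void stm_release_model(struct dev_model *dev)
{
    release_calls_model++;
    free(dev);
}

/* Drop one reference; the last put invokes the release callback. */
void dev_put_model(struct dev_model *dev)
{
    if (--dev->refcount == 0)
        dev->release(dev);
}

/* Registration with a simulated failure after the device was initialised. */
int stm_register_model(int fail)
{
    struct dev_model *stm = malloc(sizeof(*stm));

    if (!stm)
        return -1;
    stm->refcount = 1;
    stm->release = stm_release_model;

    if (fail) {
        dev_put_model(stm); /* frees stm via the release callback */
        /* No free(stm) here: that would free the same memory twice. */
        return -1;
    }

    dev_put_model(stm); /* the model drops the device at once so it does not leak */
    return 0;
}
```

In this model the error path frees the object exactly once, which is the property the fix restores.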
In the Linux kernel, the following vulnerability has been resolved:
soundwire: cadence: fix invalid PDI offset
For some reason, we add an offset to the PDI, presumably to skip the PDI0 and PDI1 which are reserved for BPT.
This code is however completely wrong and leads to an out-of-bounds access. We were just lucky so far since we used only a couple of PDIs and remained within the PDI array bounds.
A Fixes: tag is not provided since there are no known platforms where the out-of-bounds would be accessed, and the initial code had problems as well.
A follow-up patch completely removes this useless offset.(CVE-2024-38635)
In the Linux kernel, the following vulnerability has been resolved:
ext4: fix mb_cache_entry's e_refcnt leak in ext4_xattr_block_cache_find()
Syzbot reports a warning as follows:
============================================ WARNING: CPU: 0 PID: 5075 at fs/mbcache.c:419 mb_cache_destroy+0x224/0x290 Modules linked in: CPU: 0 PID: 5075 Comm: syz-executor199 Not tainted 6.9.0-rc6-gb947cc5bf6d7 RIP: 0010:mb_cache_destroy+0x224/0x290 fs/mbcache.c:419 Call Trace: <TASK> ext4_put_super+0x6d4/0xcd0 fs/ext4/super.c:1375 generic_shutdown_super+0x136/0x2d0 fs/super.c:641 kill_block_super+0x44/0x90 fs/super.c:1675 ext4_kill_sb+0x68/0xa0 fs/ext4/super.c:7327
This is because when finding an entry in ext4_xattr_block_cache_find(), if ext4_sb_bread() returns -ENOMEM, the ce's e_refcnt, which was already incremented in __entry_find(), is never dropped, which eventually triggers the above warning in mb_cache_destroy() due to the leaked reference.
So call mb_cache_entry_put() on the -ENOMEM error branch as a quick fix.(CVE-2024-39276)
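The pattern behind the fix, dropping the lookup reference on every exit path, can be sketched in userspace as follows. The types and functions are hypothetical models, and the read error is passed in as a parameter rather than produced by a real block read.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical model of a cache entry; only the refcount matters here. */
struct cache_entry_model {
    int e_refcnt;
};

/* Lookup takes a reference on the entry it returns. */
struct cache_entry_model *entry_find_model(struct cache_entry_model *ce)
{
    ce->e_refcnt++;
    return ce;
}

void entry_put_model(struct cache_entry_model *ce)
{
    ce->e_refcnt--;
}

/* read_err simulates the result of reading the block (0 or -ENOMEM). */
int cache_find_model(struct cache_entry_model *cache, int read_err)
{
    struct cache_entry_model *ce = entry_find_model(cache);

    if (read_err) {
        entry_put_model(ce); /* the fix: drop the reference on the error branch */
        return read_err;
    }

    /* ... the block would be compared here ... */
    entry_put_model(ce);
    return 0;
}
```

Without the put on the error branch, e_refcnt would stay elevated after a failed read, which is the leak that the destroy-time warning reports.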
In the Linux kernel, the following vulnerability has been resolved:
mm/memory-failure: fix handling of dissolved but not taken off from buddy pages
When I did memory failure tests recently, the panic below occurred:
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8cee00 flags: 0x6fffe0000000000(node=1|zone=2|lastcpupid=0x7fff) raw: 06fffe0000000000 dead000000000100 dead000000000122 0000000000000000 raw: 0000000000000000 0000000000000009 00000000ffffffff 0000000000000000 page dumped because: VMBUGONPAGE(!PageBuddy(page)) ------------[ cut here ]------------ kernel BUG at include/linux/page-flags.h:1009! invalid opcode: 0000 [#1] PREEMPT SMP NOPTI RIP: 0010:delpagefromfreelist+0x151/0x180 RSP: 0018:ffffa49c90437998 EFLAGS: 00000046 RAX: 0000000000000035 RBX: 0000000000000009 RCX: ffff8dd8dfd1c9c8 RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff8dd8dfd1c9c0 RBP: ffffd901233b8000 R08: ffffffffab5511f8 R09: 0000000000008c69 R10: 0000000000003c15 R11: ffffffffab5511f8 R12: ffff8dd8fffc0c80 R13: 0000000000000001 R14: ffff8dd8fffc0c80 R15: 0000000000000009 FS: 00007ff916304740(0000) GS:ffff8dd8dfd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055eae50124c8 CR3: 00000008479e0000 CR4: 00000000000006f0 Call Trace: <TASK> _rmqueuepcplist+0x23b/0x520 getpagefromfreelist+0x26b/0xe40 _allocpagesnoprof+0x113/0x1120 _folioallocnoprof+0x11/0xb0 allocbuddyhugetlbfolio.isra.0+0x5a/0x130 _allocfreshhugetlbfolio+0xe7/0x140 allocpoolhugefolio+0x68/0x100 setmaxhugepages+0x13d/0x340 hugetlbsysctlhandlercommon+0xe8/0x110 procsyscallhandler+0x194/0x280 vfswrite+0x387/0x550 ksyswrite+0x64/0xe0 dosyscall64+0xc2/0x1d0 entrySYSCALL64afterhwframe+0x77/0x7f RIP: 0033:0x7ff916114887 RSP: 002b:00007ffec8a2fd78 EFLAGS: 00000246 ORIGRAX: 0000000000000001 RAX: ffffffffffffffda RBX: 000055eae500e350 RCX: 00007ff916114887 RDX: 0000000000000004 RSI: 000055eae500e390 RDI: 0000000000000003 RBP: 000055eae50104c0 R08: 0000000000000000 R09: 000055eae50104c0 R10: 0000000000000077 R11: 0000000000000246 R12: 0000000000000004 R13: 0000000000000004 R14: 00007ff916216b80 R15: 00007ff916216a00 </TASK> Modules linked in: mceinject hwpoisoninject ---[ end trace 
0000000000000000 ]---
And before the panic, there was a warning about bad page state:
BUG: Bad page state in process page-types pfn:8cee00 page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8cee00 flags: 0x6fffe0000000000(node=1|zone=2|lastcpupid=0x7fff) pagetype: 0xffffff7f(buddy) raw: 06fffe0000000000 ffffd901241c0008 ffffd901240f8008 0000000000000000 raw: 0000000000000000 0000000000000009 00000000ffffff7f 0000000000000000 page dumped because: nonzero mapcount Modules linked in: mceinject hwpoisoninject CPU: 8 PID: 154211 Comm: page-types Not tainted 6.9.0-rc4-00499-g5544ec3178e2-dirty #22 Call Trace: <TASK> dumpstacklvl+0x83/0xa0 badpage+0x63/0xf0 freeunrefpage+0x36e/0x5c0 unpoisonmemory+0x50b/0x630 simpleattrwritexsigned.constprop.0.isra.0+0xb3/0x110 debugfsattrwrite+0x42/0x60 fullproxywrite+0x5b/0x80 vfswrite+0xcd/0x550 ksyswrite+0x64/0xe0 dosyscall64+0xc2/0x1d0 entrySYSCALL64afterhwframe+0x77/0x7f RIP: 0033:0x7f189a514887 RSP: 002b:00007ffdcd899718 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f189a514887 RDX: 0000000000000009 RSI: 00007ffdcd899730 RDI: 0000000000000003 RBP: 00007ffdcd8997a0 R08: 0000000000000000 R09: 00007ffdcd8994b2 R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffdcda199a8 R13: 0000000000404af1 R14: 000000000040ad78 R15: 00007f189a7a5040 </TASK>
The root cause should be the below race:
memory_failure
 try_memory_failure_hugetlb
  me_huge_page
   __page_handle_poison
    dissolve_free_hugetlb_folio
    drain_all_pages -- Buddy page can be isolated e.g. for compaction.
    take_page_off_buddy -- Failed as page is not in the ---truncated---(CVE-2024-39298)
In the Linux kernel, the following vulnerability has been resolved:
md/raid5: fix deadlock that raid5d() wait for itself to clear MD_SB_CHANGE_PENDING
Xiao reported that lvm2 test lvconvert-raid-takeover.sh can hang with small possibility, the root cause is exactly the same as commit bed9e27baf52 ("Revert "md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d"")
However, Dan reported another hang after that, and junxiao investigated the problem and found out that this is caused by a plugged bio that can't be issued from raid5d().
Current implementation in raid5d() has a weird dependence:
1) md_check_recovery() from raid5d() must hold 'reconfig_mutex' to clear MD_SB_CHANGE_PENDING;
2) raid5d() handles IO in a deadloop, until all IO are issued;
3) IO from raid5d() must wait for MD_SB_CHANGE_PENDING to be cleared;
This behaviour was introduced before v2.6, and as a consequence, if another context holds 'reconfig_mutex' and md_check_recovery() can't update the superblock, then raid5d() will waste one CPU at 100% in the deadloop until 'reconfig_mutex' is released.
Following the implementation in raid1 and raid10, fix this problem by skipping IO issue if MD_SB_CHANGE_PENDING is still set after md_check_recovery(); the daemon thread will be woken up when 'reconfig_mutex' is released. Meanwhile, the hang problem will be fixed as well.(CVE-2024-39476)
In the Linux kernel, the following vulnerability has been resolved:
ipv6: sr: fix missing sk_buff release in seg6_input_core
The seg6_input() function is responsible for adding the SRH into a packet, delegating the operation to the seg6_input_core(). This function uses the skb_cow_head() to ensure that there is sufficient headroom in the sk_buff for accommodating the link-layer header. In the event that the skb_cow_head() function fails, the seg6_input_core() catches the error but it does not release the sk_buff, which will result in a memory leak.
This issue was introduced in commit af3b5158b89d ("ipv6: sr: fix BUG due to headroom too small after SRH push") and persists even after commit 7a3f5b0de364 ("netfilter: add netfilter hooks to SRv6 data plane"), where the entire seg6_input() code was refactored to deal with netfilter hooks.
The proposed patch addresses the identified memory leak by requiring the seg6_input_core() function to release the sk_buff in the event that skb_cow_head() fails.(CVE-2024-39490)
In the Linux kernel, the following vulnerability has been resolved:
ALSA: hda: cs35l56: Fix lifetime of cs_dsp instance
The cs_dsp instance is initialized in the driver probe() so it should be freed in the driver remove(). Also fix a missing call to cs_dsp_remove() in the error path of cs35l56_hda_common_probe().
The call to cs_dsp_remove() was being done in the component unbind callback cs35l56_hda_unbind(). This meant that if the driver was unbound and then re-bound it would be using an uninitialized cs_dsp instance.
It is best to initialize the cs_dsp instance in probe() so that it can return an error if it fails. The component binding API doesn't have any error handling so there's no way to handle a failure if cs_dsp was initialized in the bind.(CVE-2024-39491)
In the Linux kernel, the following vulnerability has been resolved:
drivers: core: synchronize really_probe() and dev_uevent()
Synchronize the dev->driver usage in really_probe() and dev_uevent(). These can run in different threads, which can result in the following race condition for dev->driver uninitialization:
really_probe()
{
  ...
probe_failed:
  ...
  device_unbind_cleanup(dev) {
    ...
    dev->driver = NULL;   // <= Failed probe sets dev->driver to NULL
    ...
  }
  ...
}
dev_uevent()
{
  ...
  if (dev->driver)
    // If dev->driver is NULLed from really_probe() from here on,
    // after above check, the system crashes
    add_uevent_var(env, "DRIVER=%s", dev->driver->name);
  ...
}
really_probe() holds the lock, already. So nothing needs to be done there. dev_uevent() is called with lock held, often, too. But not always. This implies that we can't add any locking in dev_uevent() itself. So fix this race by adding the lock to the non-protected path. This is the path where above race is observed:
dev_uevent+0x235/0x380
uevent_show+0x10c/0x1f0  <= Add lock here
dev_attr_show+0x3a/0xa0
sysfs_kf_seq_show+0x17c/0x250
kernfs_seq_show+0x7c/0x90
seq_read_iter+0x2d7/0x940
kernfs_fop_read_iter+0xc6/0x310
vfs_read+0x5bc/0x6b0
ksys_read+0xeb/0x1b0
__x64_sys_read+0x42/0x50
x64_sys_call+0x27ad/0x2d30
do_syscall_64+0xcd/0x1d0
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Similar cases are reported by syzkaller in
https://syzkaller.appspot.com/bug?extid=ffa8143439596313a85a
But these are regarding the initialization of dev->driver
dev->driver = drv;
As this switches dev->driver to non-NULL these reports can be considered to be false-positives (which should be "fixed" by this commit, as well, though).
The same issue was reported and tried to be fixed back in 2015 in
https://lore.kernel.org/lkml/1421259054-2574-1-git-send-email-a.sangwan@samsung.com/
already.(CVE-2024-39501)
In the Linux kernel, the following vulnerability has been resolved:
netfilter: nft_inner: validate mandatory meta and payload
Check for mandatory netlink attributes in payload and meta expression when used embedded from the inner expression, otherwise NULL pointer dereference is possible from userspace.(CVE-2024-39504)
In the Linux kernel, the following vulnerability has been resolved:
scsi: mpt3sas: Avoid test/set_bit() operating in non-allocated memory
There is a potential out-of-bounds access when using test_bit() on a single word. The test_bit() and set_bit() functions operate on long values, and when testing or setting a single word, they can exceed the word boundary. KASAN detects this issue and produces a dump:
BUG: KASAN: slab-out-of-bounds in _scsih_add_device.constprop.0 (./arch/x86/include/asm/bitops.h:60 ./include/asm-generic/bitops/instrumented-atomic.h:29 drivers/scsi/mpt3sas/mpt3sas_scsih.c:7331) mpt3sas
Write of size 8 at addr ffff8881d26e3c60 by task kworker/u1536:2/2965
For full log, please look at [1].
Make the allocation at least the size of sizeof(unsigned long) so that set_bit() and test_bit() have sufficient room for read/write operations without overwriting unallocated memory.
[1] Link: https://lore.kernel.org/all/ZkNcALr3W3KGYYJG@gmail.com/(CVE-2024-40901)
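The sizing rule can be illustrated with a small helper that rounds a bit count up to whole unsigned long words. This is a userspace sketch of the idea, not the driver's code; bitmap_bytes_model() and BITS_PER_LONG_MODEL are hypothetical names.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG_MODEL (sizeof(unsigned long) * CHAR_BIT)

/* Bytes to allocate for a bitmap of nbits bits that will be accessed in
 * unsigned long units: always a whole number of words, at least one. */
size_t bitmap_bytes_model(size_t nbits)
{
    size_t words = (nbits + BITS_PER_LONG_MODEL - 1) / BITS_PER_LONG_MODEL;

    if (words == 0)
        words = 1;
    return words * sizeof(unsigned long);
}
```

With this rule a bitmap of 8 bits still occupies a full word, so a word-sized read or write at its start stays inside the allocation.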
In the Linux kernel, the following vulnerability has been resolved:
ax25: Fix refcount imbalance on inbound connections
When releasing a socket in ax25_release(), we call netdev_put() to decrease the refcount on the associated ax.25 device. However, the execution path for accepting an incoming connection never calls netdev_hold(). This imbalance leads to refcount errors, and ultimately to kernel crashes.
A typical call trace for the above situation will start with one of the following errors:
refcount_t: decrement hit 0; leaking memory.
refcount_t: underflow; use-after-free.
And will then have a trace like:
Call Trace:
<TASK>
? show_regs+0x64/0x70
? __warn+0x83/0x120
? refcount_warn_saturate+0xb2/0x100
? report_bug+0x158/0x190
? prb_read_valid+0x20/0x30
? handle_bug+0x3e/0x70
? exc_invalid_op+0x1c/0x70
? asm_exc_invalid_op+0x1f/0x30
? refcount_warn_saturate+0xb2/0x100
? refcount_warn_saturate+0xb2/0x100
ax25_release+0x2ad/0x360
__sock_release+0x35/0xa0
sock_close+0x19/0x20
[...]
On reboot (or any attempt to remove the interface), the kernel gets stuck in an infinite loop:
unregister_netdevice: waiting for ax0 to become free. Usage count = 0
This patch corrects these issues by ensuring that we call netdev_hold() and ax25_dev_hold() for new connections in ax25_accept(). This makes the logic leading to ax25_accept() match the logic for ax25_bind(): in both cases we increment the refcount, which is ultimately decremented in ax25_release().(CVE-2024-40910)
In the Linux kernel, the following vulnerability has been resolved:
wifi: cfg80211: Lock wiphy in cfg80211_get_station
Wiphy should be locked before calling rdev_get_station() (see lockdep assert in ieee80211_get_station()).
This fixes the following kernel NULL dereference:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000050 Mem abort info: ESR = 0x0000000096000006 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 FSC = 0x06: level 2 translation fault Data abort info: ISV = 0, ISS = 0x00000006 CM = 0, WnR = 0 user pgtable: 4k pages, 48-bit VAs, pgdp=0000000003001000 [0000000000000050] pgd=0800000002dca003, p4d=0800000002dca003, pud=08000000028e9003, pmd=0000000000000000 Internal error: Oops: 0000000096000006 [#1] SMP Modules linked in: netconsole dwc3mesong12a dwc3ofsimple dwc3 ipgre gre ath10kpci ath10kcore ath9k ath9kcommon ath9khw ath CPU: 0 PID: 1091 Comm: kworker/u8:0 Not tainted 6.4.0-02144-g565f9a3a7911-dirty #705 Hardware name: RPT (r1) (DT) Workqueue: batevents batadvvelpthroughputmetricupdate pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : ath10kstastatistics+0x10/0x2dc [ath10kcore] lr : stasetsinfo+0xcc/0xbd4 sp : ffff000007b43ad0 x29: ffff000007b43ad0 x28: ffff0000071fa900 x27: ffff00000294ca98 x26: ffff000006830880 x25: ffff000006830880 x24: ffff00000294c000 x23: 0000000000000001 x22: ffff000007b43c90 x21: ffff800008898acc x20: ffff00000294c6e8 x19: ffff000007b43c90 x18: 0000000000000000 x17: 445946354d552d78 x16: 62661f7200000000 x15: 57464f445946354d x14: 0000000000000000 x13: 00000000000000e3 x12: d5f0acbcebea978e x11: 00000000000000e3 x10: 000000010048fe41 x9 : 0000000000000000 x8 : ffff000007b43d90 x7 : 000000007a1e2125 x6 : 0000000000000000 x5 : ffff0000024e0900 x4 : ffff800000a0250c x3 : ffff000007b43c90 x2 : ffff00000294ca98 x1 : ffff000006831920 x0 : 0000000000000000 Call trace: ath10kstastatistics+0x10/0x2dc [ath10kcore] stasetsinfo+0xcc/0xbd4 ieee80211getstation+0x2c/0x44 cfg80211getstation+0x80/0x154 batadvvelpgetthroughput+0x138/0x1fc batadvvelpthroughputmetricupdate+0x1c/0xa4 processonework+0x1ec/0x414 workerthread+0x70/0x46c kthread+0xdc/0xe0 retfrom_fork+0x10/0x20 Code: a9bb7bfd 910003fd a90153f3 f9411c40 (f9402814)
This happens because STA has time to disconnect and reconnect before batadv_v_elp_throughput_metric_update() delayed work gets scheduled. In this situation, ath10k_sta_state() can be in the middle of resetting arsta data when the work queue gets a chance to be scheduled and ends up accessing it. Locking wiphy prevents that.(CVE-2024-40911)
In the Linux kernel, the following vulnerability has been resolved:
mm/huge_memory: don't unpoison huge_zero_folio
When I did memory failure tests recently, the panic below occurred:
kernel BUG at include/linux/mm.h:1135! invalid opcode: 0000 [#1] PREEMPT SMP NOPTI CPU: 9 PID: 137 Comm: kswapd1 Not tainted 6.9.0-rc4-00491-gd5ce28f156fe-dirty #14 RIP: 0010:shrinkhugezeropagescan+0x168/0x1a0 RSP: 0018:ffff9933c6c57bd0 EFLAGS: 00000246 RAX: 000000000000003e RBX: 0000000000000000 RCX: ffff88f61fc5c9c8 RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff88f61fc5c9c0 RBP: ffffcd7c446b0000 R08: ffffffff9a9405f0 R09: 0000000000005492 R10: 00000000000030ea R11: ffffffff9a9405f0 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: ffff88e703c4ac00 FS: 0000000000000000(0000) GS:ffff88f61fc40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055f4da6e9878 CR3: 0000000c71048000 CR4: 00000000000006f0 Call Trace: <TASK> doshrinkslab+0x14f/0x6a0 shrinkslab+0xca/0x8c0 shrinknode+0x2d0/0x7d0 balancepgdat+0x33a/0x720 kswapd+0x1f3/0x410 kthread+0xd5/0x100 retfromfork+0x2f/0x50 retfromforkasm+0x1a/0x30 </TASK> Modules linked in: mceinject hwpoisoninject ---[ end trace 0000000000000000 ]--- RIP: 0010:shrinkhugezeropagescan+0x168/0x1a0 RSP: 0018:ffff9933c6c57bd0 EFLAGS: 00000246 RAX: 000000000000003e RBX: 0000000000000000 RCX: ffff88f61fc5c9c8 RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff88f61fc5c9c0 RBP: ffffcd7c446b0000 R08: ffffffff9a9405f0 R09: 0000000000005492 R10: 00000000000030ea R11: ffffffff9a9405f0 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: ffff88e703c4ac00 FS: 0000000000000000(0000) GS:ffff88f61fc40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055f4da6e9878 CR3: 0000000c71048000 CR4: 00000000000006f0
The root cause is that HWPoison flag will be set for huge_zero_folio without increasing the folio refcnt. But then unpoison_memory() will decrease the folio refcnt unexpectedly as it appears like a successfully hwpoisoned folio leading to VM_BUG_ON_PAGE(page_ref_count(page) == 0) when releasing huge_zero_folio.
Skip unpoisoning huge_zero_folio in unpoison_memory() to fix this issue. We're not prepared to unpoison huge_zero_folio yet.(CVE-2024-40914)
In the Linux kernel, the following vulnerability has been resolved:
bnxt_en: Adjust logging of firmware messages in case of released token in __hwrm_send()
If the token is released because token->state == BNXT_HWRM_DEFERRED, the released token (set to NULL) is used in log messages. This issue is expected to be prevented by the HWRM_ERR_CODE_PF_UNAVAILABLE error code. But this error code is returned only by recent firmware, so some firmware may not return it. This may lead to a NULL pointer dereference. Fix this by adding a token pointer check.
Found by Linux Verification Center (linuxtesting.org) with SVACE.(CVE-2024-40919)
In the Linux kernel, the following vulnerability has been resolved:
block: fix request.queuelist usage in flush
Friedrich Weber reported a kernel crash problem and bisected to commit 81ada09cc25e ("blk-flush: reuse rq queuelist in flush state machine").
The root cause is that we use "list_move_tail(&rq->queuelist, pending)" in the PREFLUSH/POSTFLUSH sequences. But rq->queuelist.next == xxx since it's popped out from plug->cached_rq in __blk_mq_alloc_requests_batch(). We don't initialize its queuelist just for this first request, although the queuelist of all later popped requests will be initialized.
Fix it by changing to use "list_add_tail(&rq->queuelist, pending)" so rq->queuelist doesn't need to be initialized. It should be ok since rq can't be on any list when PREFLUSH or POSTFLUSH, so there is no actual move.
Please note the commit 81ada09cc25e ("blk-flush: reuse rq queuelist in flush state machine") also has another requirement that no drivers would touch rq->queuelist after blk_mq_end_request() since we will reuse it to add rq to the post-flush pending list in POSTFLUSH. If this is not true, we will have to revert that commit IMHO.
This updated version adds "list_del_init(&rq->queuelist)" in flush rq callback since the dm layer may submit request of a weird invalid format (REQ_FSEQ_PREFLUSH | REQ_FSEQ_POSTFLUSH), which causes double list_add if without this "list_del_init(&rq->queuelist)". The weird invalid format problem should be fixed in dm layer.(CVE-2024-40925)
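The difference between adding and moving a list entry can be seen in a minimal userspace doubly linked list. This is a sketch, not the kernel's list implementation; struct list_node and the list_model_* helpers are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

struct list_node {
    struct list_node *next, *prev;
};

void list_model_init(struct list_node *head)
{
    head->next = head;
    head->prev = head;
}

/* Insert 'entry' at the tail of 'head'. Only writes entry's links and
 * never reads them, so stale values in 'entry' are harmless. */
void list_model_add_tail(struct list_node *entry, struct list_node *head)
{
    struct list_node *last = head->prev;

    entry->next = head;
    entry->prev = last;
    last->next = entry;
    head->prev = entry;
}

/* Unlink 'entry' and re-add it at the tail. This reads entry->prev and
 * entry->next, so both must point at valid neighbours beforehand. */
void list_model_move_tail(struct list_node *entry, struct list_node *head)
{
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
    list_model_add_tail(entry, head);
}
```

An entry whose links are stale can safely go through list_model_add_tail(), whereas list_model_move_tail() would first write through those stale links.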
In the Linux kernel, the following vulnerability has been resolved:
net: ethtool: fix the error condition in ethtool_get_phy_stats_ethtool()
Clang static checker (scan-build) warning: net/ethtool/ioctl.c:line 2233, column 2 Called function pointer is null (null dereference).
Return '-EOPNOTSUPP' when 'ops->get_ethtool_phy_stats' is NULL to fix this typo error.(CVE-2024-40928)
In the Linux kernel, the following vulnerability has been resolved:
landlock: Fix d_parent walk
The WARN_ON_ONCE() in collect_domain_accesses() can be triggered when trying to link a root mount point. This cannot work in practice because this directory is mounted, but the VFS check is done after the call to security_path_link().
Do not use source directory's d_parent when the source directory is the mount point.
In the Linux kernel, the following vulnerability has been resolved:
net: wwan: iosm: Fix tainted pointer delete is case of region creation fail
If region creation fails in ipc_devlink_create_region(), deletion of the previously created regions starts from a tainted pointer that actually holds an error code value. Fix this bug by decreasing the region index before deleting.
Found by Linux Verification Center (linuxtesting.org) with SVACE.(CVE-2024-40939)
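The unwind rule (step back one index before destroying anything) can be sketched in userspace like this. The model uses NULL instead of an error pointer for the failed slot, and every name with the _model suffix is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

struct region_model {
    int index;
};

int live_regions_model; /* number of regions currently allocated */

/* Create region 'index'; fails (returns NULL) when index == fail_at. */
struct region_model *region_create_model(int index, int fail_at)
{
    struct region_model *r;

    if (index == fail_at)
        return NULL;
    r = malloc(sizeof(*r));
    if (!r)
        return NULL;
    r->index = index;
    live_regions_model++;
    return r;
}

void region_destroy_model(struct region_model *r)
{
    free(r);
    live_regions_model--;
}

int regions_create_model(struct region_model **regions, int count, int fail_at)
{
    int i;

    for (i = 0; i < count; i++) {
        regions[i] = region_create_model(i, fail_at);
        if (!regions[i])
            goto fail;
    }
    return 0;

fail:
    /* regions[i] holds no valid region, so unwinding starts at i - 1. */
    for (i--; i >= 0; i--) {
        region_destroy_model(regions[i]);
        regions[i] = NULL;
    }
    return -1;
}
```

Starting the loop at i instead of i - 1 would hand the failed slot to the destroy function, which is the bug class described above.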
In the Linux kernel, the following vulnerability has been resolved:
net/mlx5: Fix tainted pointer delete is case of flow rules creation fail
If flow rule creation fails in mlx5_lag_create_port_sel_table(), the tainted pointer is deleted several times instead of the previously created rules. Fix this bug by using the correct flow rule pointers.
Found by Linux Verification Center (linuxtesting.org) with SVACE.(CVE-2024-40940)
In the Linux kernel, the following vulnerability has been resolved:
x86/kexec: Fix bug with call depth tracking
The call to cc_platform_has() triggers a fault and system crash if call depth tracking is active because the GS segment has been reset by load_segments() and GS_BASE is now 0 but call depth tracking uses per-CPU variables to operate.
Call cc_platform_has() earlier in the function when GS is still valid.
In the Linux kernel, the following vulnerability has been resolved:
iommu: Return right value in iommu_sva_bind_device()
iommu_sva_bind_device() should return either a sva bond handle or an ERR_PTR value in error cases. Existing drivers (idxd and uacce) only check the return value with IS_ERR(). This could potentially lead to a kernel NULL pointer dereference issue if the function returns NULL instead of an error pointer.
In reality, this doesn't cause any problems because iommu_sva_bind_device() only returns NULL when the kernel is not configured with CONFIG_IOMMU_SVA. In this case, iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_SVA) will return an error, and the device drivers won't call iommu_sva_bind_device() at all.(CVE-2024-40945)
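Why a NULL return slips past an IS_ERR()-only check can be shown with a userspace imitation of the error-pointer convention. This is a sketch: the helpers mimic the idea of ERR_PTR()/IS_ERR() but are hypothetical, and the choice of -ENODEV for the disabled case is an assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO_MODEL 4095

/* Encode a negative errno value as a pointer. */
void *err_ptr_model(long error)
{
    return (void *)(intptr_t)error;
}

/* True only for pointers in the top MAX_ERRNO_MODEL addresses.
 * NULL is NOT in that range, so it is not reported as an error. */
int is_err_model(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO_MODEL;
}

struct bond_model {
    int id;
};

struct bond_model the_bond_model;

/* Stub bind: returns an error pointer, never NULL, when the feature is off. */
void *sva_bind_model(int sva_configured)
{
    if (!sva_configured)
        return err_ptr_model(-ENODEV);
    return &the_bond_model;
}
```

A caller that checks only is_err_model() handles both outcomes of sva_bind_model() correctly, which would not hold if the disabled case returned NULL.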
In the Linux kernel, the following vulnerability has been resolved:
mm/page_table_check: fix crash on ZONE_DEVICE
Not all pages may apply to pgtable check. One example is ZONE_DEVICE pages: they map PFNs directly, and they don't allocate page_ext at all even if there's struct page around. One may reference devm_memremap_pages().
When both ZONE_DEVICE and page-table-check are enabled, trying to map some dax memory can reliably trigger a kernel bug when the kernel tries to inject some pfn maps on the dax device:
kernel BUG at mm/page_table_check.c:55!
While it's pretty legal to use set_pxx_at() for ZONE_DEVICE pages for page fault resolutions, skip all the checks if page_ext doesn't even exist in pgtable checker, which applies to ZONE_DEVICE but maybe more.(CVE-2024-40948)
In the Linux kernel, the following vulnerability has been resolved:
mm: huge_memory: fix misused mapping_large_folio_support() for anon folios
When I did a large folios split test, a WARNING "[ 5059.122759][ T166] Cannot split file folio to non-0 order" was triggered. But the test cases are only for anonymous folios, while mapping_large_folio_support() is only reasonable for page cache folios.
In split_huge_page_to_list_to_order(), the folio passed to mapping_large_folio_support() may be an anonymous folio. The folio_test_anon() check is missing, so the split of the anonymous THP fails. The same applies to shmem_mapping(), so we'd better add a check for both. But the shmem_mapping() in __split_huge_page() is not involved: for anonymous folios the end parameter is set to -1, so (head[i].index >= end) is always false and shmem_mapping() is not called.
Also add a VM_WARN_ON_ONCE() in mapping_large_folio_support() for anon mapping, so we can detect the wrong use more easily.
THP folios may exist in the pagecache even if the file system doesn't support large folios, because when CONFIG_TRANSPARENT_HUGEPAGE is enabled, khugepaged will try to collapse read-only file-backed pages to THP. But the mapping does not actually support multi order large folios properly.
Using /sys/kernel/debug/split_huge_pages to verify this: with this patch, large anon THP is successfully split and the warning no longer appears.(CVE-2024-40950)
In the Linux kernel, the following vulnerability has been resolved:
ext4: fix slab-out-of-bounds in ext4_mb_find_good_group_avg_frag_lists()
We can trigger a slab-out-of-bounds with the following commands:
mkfs.ext4 -F /dev/$disk 10G
mount /dev/$disk /tmp/test
echo 2147483647 > /sys/fs/ext4/$disk/mb_group_prealloc
echo test > /tmp/test/file && sync
================================================================== BUG: KASAN: slab-out-of-bounds in ext4_mb_find_good_group_avg_frag_lists+0x8a/0x200 [ext4] Read of size 8 at addr ffff888121b9d0f0 by task kworker/u2:0/11 CPU: 0 PID: 11 Comm: kworker/u2:0 Tainted: GL 6.7.0-next-20240118 #521 Call Trace: dump_stack_lvl+0x2c/0x50 kasan_report+0xb6/0xf0 ext4_mb_find_good_group_avg_frag_lists+0x8a/0x200 [ext4] ext4_mb_regular_allocator+0x19e9/0x2370 [ext4] ext4_mb_new_blocks+0x88a/0x1370 [ext4] ext4_ext_map_blocks+0x14f7/0x2390 [ext4] ext4_map_blocks+0x569/0xea0 [ext4] ext4_do_writepages+0x10f6/0x1bc0 [ext4]
The flow of issue triggering is as follows:
// Set s_mb_group_prealloc to 2147483647 via sysfs
ext4_mb_new_blocks
  ext4_mb_normalize_request
    ext4_mb_normalize_group_request
      ac->ac_g_ex.fe_len = EXT4_SB(sb)->s_mb_group_prealloc
  ext4_mb_regular_allocator
    ext4_mb_choose_next_group
      ext4_mb_choose_next_group_best_avail
        mb_avg_fragment_size_order
          order = fls(len) - 2 = 29
        ext4_mb_find_good_group_avg_frag_lists
          frag_list = &sbi->s_mb_avg_fragment_size[order]
          if (list_empty(frag_list)) // Trigger SOOB!
At 4k block size, the length of the s_mb_avg_fragment_size list is 14, but an oversized s_mb_group_prealloc is set, causing slab-out-of-bounds to be triggered by an attempt to access an element at index 29.
Add a new attr_id attr_clusters_in_group with values in the range [0, sbi->s_clusters_per_group] and declare mb_group_prealloc as that type to fix the issue. In addition avoid returning an order from mb_avg_fragment_size_order() greater than MB_NUM_ORDERS(sb) and reduce some useless loops.(CVE-2024-40955)
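The arithmetic behind index 29 can be checked with a small userspace model of fls() and a clamped order computation. This is a sketch of the idea only; fls_model() and avg_frag_order_model() are hypothetical names, and the clamp to the last valid index is one simple way to keep the result in range.

```c
#include <assert.h>

/* Position of the most significant set bit, counting from 1; 0 for x == 0. */
int fls_model(unsigned int x)
{
    int bit = 0;

    while (x) {
        bit++;
        x >>= 1;
    }
    return bit;
}

/* Order used to index a list array of num_orders entries, clamped so it
 * can never point past the last element. */
int avg_frag_order_model(unsigned int len, int num_orders)
{
    int order = fls_model(len) - 2;

    if (order < 0)
        return 0;
    if (order >= num_orders)
        order = num_orders - 1;
    return order;
}
```

For len = 2147483647 the unclamped value is fls(len) - 2 = 31 - 2 = 29, while an array of 14 lists only has valid indices 0 to 13.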
In the Linux kernel, the following vulnerability has been resolved:
tty: add the option to have a tty reject a new ldisc
... and use it to limit the virtual terminals to just N_TTY. They are kind of special, and in particular, the "con_write()" routine violates the "writes cannot sleep" rule that some ldiscs rely on.
This avoids the
BUG: sleeping function called from invalid context at kernel/printk/printk.c:2659
when N_GSM has been attached to a virtual console, and gsmld_write() calls con_write() while holding a spinlock, and con_write() then tries to get the console lock.(CVE-2024-40966)
In the Linux kernel, the following vulnerability has been resolved:
f2fs: don't set RO when shutting down f2fs
Shutdown does not check the error of thaw_super due to readonly, which causes a deadlock like below.
f2fs_ioc_shutdown(F2FS_GOING_DOWN_FULLSYNC)        issue_discard_thread
 - bdev_freeze
  - freeze_super
 - f2fs_stop_checkpoint()
  - f2fs_handle_critical_error                     - sb_start_write
    - set RO                                         - waiting
 - bdev_thaw
  - thaw_super_locked
    - return -EINVAL, if sb_rdonly()
 - f2fs_stop_discard_thread
  -> wait for kthread_stop(discard_thread);(CVE-2024-40969)
In the Linux kernel, the following vulnerability has been resolved:
Avoid hw_desc array overrun in dw-axi-dmac
I have a use case where nr_buffers = 3 and in which each descriptor is composed of 3 segments, resulting in the DMA channel descs_allocated being 9. Since axi_desc_put() handles the hw_desc array based on descs_allocated, this scenario would result in a kernel panic (the hw_desc array will be overrun).
To fix this, the proposal is to add a new member to the axi_dma_desc structure, where we keep the number of allocated hw_descs (set in axi_desc_alloc()) and use it in axi_desc_put() to handle the hw_desc array correctly.
Additionally I propose to remove the axi_chan_start_first_queued() call after completing the transfer, since it was identified that an imbalance can occur (started descriptors can be interrupted and the transfer ignored because the DMA channel is not enabled).(CVE-2024-40970)
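The idea of recording the allocation count in the descriptor itself can be sketched as follows. This is a simplified user-space illustration under stated assumptions: struct dma_desc, desc_alloc() and desc_put() are hypothetical names, not the driver's actual structures or functions.

```c
#include <stdlib.h>

struct hw_desc {
    unsigned long lli;   /* placeholder payload */
};

struct dma_desc {
    struct hw_desc *hw_desc;
    unsigned int nr_hw_descs;   /* entries actually allocated for this descriptor */
};

/* Allocate a descriptor with num hardware entries (num is expected to be > 0). */
struct dma_desc *desc_alloc(unsigned int num)
{
    struct dma_desc *desc = calloc(1, sizeof(*desc));

    if (!desc)
        return NULL;
    desc->hw_desc = calloc(num, sizeof(*desc->hw_desc));
    if (!desc->hw_desc) {
        free(desc);
        return NULL;
    }
    desc->nr_hw_descs = num;
    return desc;
}

/* Release the descriptor; returns how many hw_desc entries were walked. */
unsigned int desc_put(struct dma_desc *desc)
{
    unsigned int walked = 0;

    /* Bounded by the per-descriptor count, not by a channel-wide total. */
    for (unsigned int i = 0; i < desc->nr_hw_descs; i++) {
        desc->hw_desc[i].lli = 0;   /* stand-in for returning the entry to a pool */
        walked++;
    }
    free(desc->hw_desc);
    free(desc);
    return walked;
}
```

In the scenario above, each descriptor would walk 3 entries on release, even though 9 are allocated on the channel overall.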
In the Linux kernel, the following vulnerability has been resolved:
drm/radeon: fix UBSAN warning in kv_dpm.c
Adds bounds check for sumo_vid_mapping_entry.(CVE-2024-40988)
In the Linux kernel, the following vulnerability has been resolved:
RDMA/rxe: Fix responder length checking for UD request packets
According to the IBA specification: If a UD request packet is detected with an invalid length, the request shall be an invalid request and it shall be silently dropped by the responder. The responder then waits for a new request packet.
Commit 689c5421bfe0 ("RDMA/rxe: Fix incorrect responder length checking") defers the responder length check for UD QPs to the function copy_data. But it introduces a regression for UD QPs.
When the packet size is too large to fit in the receive buffer, copy_data will return the error code -EINVAL. Then send_data_in will return RESPST_ERR_MALFORMED_WQE, and the UD QP will transition into the ERROR state.(CVE-2024-40992)
In the Linux kernel, the following vulnerability has been resolved:
ptp: fix integer overflow in max_vclocks_store
On 32-bit systems, the "4 * max" multiply can overflow. Use kcalloc() to do the allocation to prevent this.(CVE-2024-40994)
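The wrap-around and the checked alternative can be illustrated in user space. This is a sketch only: mul_overflows_u32() and alloc_array_checked() are hypothetical helpers, and calloc() stands in for the kernel's kcalloc().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* True when count * elem_size does not fit in 32 bits, as on a 32-bit system. */
bool mul_overflows_u32(uint32_t count, uint32_t elem_size)
{
    return elem_size != 0 && count > UINT32_MAX / elem_size;
}

/* Hypothetical allocator: refuses the request instead of letting the product wrap. */
void *alloc_array_checked(size_t count, size_t elem_size)
{
    if (elem_size != 0 && count > SIZE_MAX / elem_size)
        return NULL;
    return calloc(count, elem_size);
}
```

For example, with 32-bit arithmetic a count of 0x40000000 multiplied by 4 wraps to 0, so an unchecked allocation would be far smaller than the caller expects.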
In the Linux kernel, the following vulnerability has been resolved:
bpf: Avoid splat in pskb_pull_reason
syzkaller builds (CONFIG_DEBUG_NET=y) frequently trigger a debug hint in pskb_may_pull.
We'd like to retain this debug check because it might hint at integer overflows and other issues (kernel code should pull headers, not huge value).
In bpf case, this splat isn't interesting at all: such (nonsensical) bpf programs are typically generated by a fuzzer anyway.
Do what Eric suggested and suppress such warning.
For CONFIG_DEBUG_NET=n we don't need the extra check because pskb_may_pull will do the right thing: return an error without the WARN() backtrace.(CVE-2024-40996)
In the Linux kernel, the following vulnerability has been resolved:
ocfs2: add bounds checking to ocfs2_check_dir_entry()
This adds sanity checks for ocfs2_dir_entry to make sure that no member of ocfs2_dir_entry strays beyond the valid memory region.(CVE-2024-41015)
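The kind of check described above can be sketched in user space as follows. The struct is a simplified stand-in, not the real ocfs2 on-disk layout, and dir_entry_is_valid() is a hypothetical helper rather than the function from the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for a directory entry header; not the real ocfs2 layout. */
struct dir_entry_hdr {
    uint64_t inode;
    uint16_t rec_len;    /* total length of this record, header included */
    uint8_t  name_len;   /* length of the name that follows the header */
    uint8_t  file_type;
};

/* Validate the entry at offset inside a buffer of buf_size bytes. */
bool dir_entry_is_valid(const uint8_t *buf, size_t buf_size, size_t offset)
{
    struct dir_entry_hdr hdr;

    /* The fixed header must fit before any of its members is read. */
    if (offset > buf_size || buf_size - offset < sizeof(hdr))
        return false;
    memcpy(&hdr, buf + offset, sizeof(hdr));

    /* The record must be large enough for the header plus the name. */
    if (hdr.rec_len < sizeof(hdr) + hdr.name_len)
        return false;

    /* The record must end inside the buffer. */
    if (hdr.rec_len > buf_size - offset)
        return false;

    return true;
}
```

The ordering matters: the header is only copied out once it is known to lie entirely inside the buffer.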
In the Linux kernel, the following vulnerability has been resolved:
misc: fastrpc: Fix memory leak in audio daemon attach operation
The audio PD daemon sends the name as part of the init IOCTL call. This name needs to be copied to the kernel, for which memory is allocated. This memory is never freed, which might result in a memory leak. Free the memory when it is no longer needed.(CVE-2024-41025)
In the Linux kernel, the following vulnerability has been resolved:
platform/x86: toshiba_acpi: Fix array out-of-bounds access
In order to use toshiba_dmi_quirks[] together with the standard DMI matching functions, it must be terminated by an empty entry.
Since this entry is missing, an array out-of-bounds access occurs every time the quirk list is processed.
Fix this by adding the terminating empty entry.(CVE-2024-41028)
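The role of the terminating entry can be shown with a small user-space table walk. This is an illustrative sketch: struct quirk_entry, quirks[] and find_quirk() are hypothetical and do not reproduce the kernel's DMI structures.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical quirk entry; a NULL ident marks the end of the table. */
struct quirk_entry {
    const char *ident;
    int flags;
};

const struct quirk_entry quirks[] = {
    { "Model A", 1 },
    { "Model B", 2 },
    { NULL, 0 },   /* terminating empty entry; without it the walk runs off the array */
};

/* Walk the table until the empty entry is reached. */
const struct quirk_entry *find_quirk(const struct quirk_entry *table,
                                     const char *ident)
{
    for (; table->ident != NULL; table++) {
        if (strcmp(table->ident, ident) == 0)
            return table;
    }
    return NULL;
}
```

The lookup function has no length parameter, so the sentinel is the only thing that stops the loop when no entry matches.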
In the Linux kernel, the following vulnerability has been resolved:
ksmbd: discard write access to the directory open
may_open() does not allow a directory to be opened with the write access. However, some writing flags set by client result in adding write access on server, making ksmbd incompatible with FUSE file system. Simply, let's discard the write access when opening a directory.
list_add corruption. next is NULL.
------------[ cut here ]------------
kernel BUG at lib/list_debug.c:26!
pc : __list_add_valid+0x88/0xbc
lr : __list_add_valid+0x88/0xbc
Call trace:
__list_add_valid+0x88/0xbc
fuse_finish_open+0x11c/0x170
fuse_open_common+0x284/0x5e8
fuse_dir_open+0x14/0x24
do_dentry_open+0x2a4/0x4e0
dentry_open+0x50/0x80
smb2_open+0xbe4/0x15a4
handle_ksmbd_work+0x478/0x5ec
process_one_work+0x1b4/0x448
worker_thread+0x25c/0x430
kthread+0x104/0x1d4
ret_from_fork+0x10/0x20(CVE-2024-41030)
In the Linux kernel, the following vulnerability has been resolved:
mm/filemap: skip to create PMD-sized page cache if needed
On ARM64, HPAGE_PMD_ORDER is 13 when the base page size is 64KB. The PMD-sized page cache can't be supported by xarray as the following error messages indicate.
------------[ cut here ]------------ WARNING: CPU: 35 PID: 7484 at lib/xarray.c:1025 xassplitalloc+0xf8/0x128 Modules linked in: nftfibinet nftfibipv4 nftfibipv6 nftfib \ nftrejectinet nfrejectipv4 nfrejectipv6 nftreject nftct \ nftchainnat nfnat nfconntrack nfdefragipv6 nfdefragipv4 \ ipset rfkill nftables nfnetlink vfat fat virtioballoon drm \ fuse xfs libcrc32c crct10difce ghashce sha2ce sha256arm64 \ sha1ce virtionet netfailover virtioconsole virtioblk failover \ dimlib virtiommio CPU: 35 PID: 7484 Comm: test Kdump: loaded Tainted: G W 6.10.0-rc5-gavin+ #9 Hardware name: QEMU KVM Virtual Machine, BIOS edk2-20240524-1.el9 05/24/2024 pstate: 83400005 (Nzcv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--) pc : xassplitalloc+0xf8/0x128 lr : splithugepagetolisttoorder+0x1c4/0x720 sp : ffff800087a4f6c0 x29: ffff800087a4f6c0 x28: ffff800087a4f720 x27: 000000001fffffff x26: 0000000000000c40 x25: 000000000000000d x24: ffff00010625b858 x23: ffff800087a4f720 x22: ffffffdfc0780000 x21: 0000000000000000 x20: 0000000000000000 x19: ffffffdfc0780000 x18: 000000001ff40000 x17: 00000000ffffffff x16: 0000018000000000 x15: 51ec004000000000 x14: 0000e00000000000 x13: 0000000000002000 x12: 0000000000000020 x11: 51ec000000000000 x10: 51ece1c0ffff8000 x9 : ffffbeb961a44d28 x8 : 0000000000000003 x7 : ffffffdfc0456420 x6 : ffff0000e1aa6eb8 x5 : 20bf08b4fe778fca x4 : ffffffdfc0456420 x3 : 0000000000000c40 x2 : 000000000000000d x1 : 000000000000000c x0 : 0000000000000000 Call trace: xassplitalloc+0xf8/0x128 splithugepagetolisttoorder+0x1c4/0x720 truncateinodepartialfolio+0xdc/0x160 truncateinodepagesrange+0x1b4/0x4a8 truncatepagecacherange+0x84/0xa0 xfsflushunmaprange+0x70/0x90 [xfs] xfsfilefallocate+0xfc/0x4d8 [xfs] vfsfallocate+0x124/0x2e8 ksysfallocate+0x4c/0xa0 _arm64sysfallocate+0x24/0x38 invokesyscall.constprop.0+0x7c/0xd8 doel0svc+0xb4/0xd0 el0svc+0x44/0x1d8 el0t64synchandler+0x134/0x150 el0t64_sync+0x17c/0x180
Fix it by skipping the allocation of the PMD-sized page cache when its order is larger than MAX_PAGECACHE_ORDER. For this specific case, we fall back to the regular path, where the readahead window is determined by the BDI's sysfs file (read_ahead_kb).(CVE-2024-41031)
In the Linux kernel, the following vulnerability has been resolved:
net: ks8851: Fix deadlock with the SPI chip variant
When SMP is enabled and spinlocks are actually functional then there is a deadlock with the 'statelock' spinlock between ks8851_start_xmit_spi and ks8851_irq:
watchdog: BUG: soft lockup - CPU#0 stuck for 27s!
call trace:
queued_spin_lock_slowpath+0x100/0x284
do_raw_spin_lock+0x34/0x44
ks8851_start_xmit_spi+0x30/0xb8
ks8851_start_xmit+0x14/0x20
netdev_start_xmit+0x40/0x6c
dev_hard_start_xmit+0x6c/0xbc
sch_direct_xmit+0xa4/0x22c
__qdisc_run+0x138/0x3fc
qdisc_run+0x24/0x3c
net_tx_action+0xf8/0x130
handle_softirqs+0x1ac/0x1f0
__do_softirq+0x14/0x20
____do_softirq+0x10/0x1c
call_on_irq_stack+0x3c/0x58
do_softirq_own_stack+0x1c/0x28
__irq_exit_rcu+0x54/0x9c
irq_exit_rcu+0x10/0x1c
el1_interrupt+0x38/0x50
el1h_64_irq_handler+0x18/0x24
el1h_64_irq+0x64/0x68
__netif_schedule+0x6c/0x80
netif_tx_wake_queue+0x38/0x48
ks8851_irq+0xb8/0x2c8
irq_thread_fn+0x2c/0x74
irq_thread+0x10c/0x1b0
kthread+0xc8/0xd8
ret_from_fork+0x10/0x20
This issue has not been identified earlier because tests were done on a device with SMP disabled and so spinlocks were actually NOPs.
Now use spin_(un)lock_bh for TX queue related locking to avoid execution of softirq work synchronously that would lead to a deadlock.(CVE-2024-41036)
In the Linux kernel, the following vulnerability has been resolved:
firmware: cs_dsp: Prevent buffer overrun when processing V2 alg headers
Check that all fields of a V2 algorithm header fit into the available firmware data buffer.
The wmfw V2 format introduced variable-length strings in the algorithm block header. This means the overall header length is variable, and the position of most fields varies depending on the length of the string fields. Each field must be checked to ensure that it does not overflow the firmware data buffer.
As this is a bugfix patch, the fixes avoid making any significant change to the existing code. This makes it easier to review and less likely to introduce new bugs.(CVE-2024-41038)
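Checking each variable-length field against the remaining buffer can be sketched as below. This is a simplified illustration, not the wmfw V2 layout: it assumes a one-byte length prefix per string, and struct blob_cursor, read_len_prefixed() and read_u32_le() are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Cursor over an untrusted firmware blob; pos never exceeds len. */
struct blob_cursor {
    const uint8_t *data;
    size_t len;
    size_t pos;
};

/* Read a 1-byte length followed by that many bytes; fail rather than overrun. */
bool read_len_prefixed(struct blob_cursor *c, const uint8_t **str, size_t *str_len)
{
    size_t n;

    if (c->pos >= c->len)
        return false;               /* no room for the length byte */
    n = c->data[c->pos];
    if (n > c->len - c->pos - 1)
        return false;               /* string body would overrun the blob */
    *str = c->data + c->pos + 1;
    *str_len = n;
    c->pos += 1 + n;
    return true;
}

/* Read a little-endian 32-bit field that follows the variable-length part. */
bool read_u32_le(struct blob_cursor *c, uint32_t *out)
{
    if (c->len - c->pos < 4)
        return false;
    *out = (uint32_t)c->data[c->pos] |
           ((uint32_t)c->data[c->pos + 1] << 8) |
           ((uint32_t)c->data[c->pos + 2] << 16) |
           ((uint32_t)c->data[c->pos + 3] << 24);
    c->pos += 4;
    return true;
}
```

Because the position of every later field depends on the string lengths read earlier, each read validates against the remaining bytes before the cursor advances.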
In the Linux kernel, the following vulnerability has been resolved:
i40e: Fix XDP program unloading while removing the driver
The commit 6533e558c650 ("i40e: Fix reset path while removing the driver") introduced a new PF state "__I40E_IN_REMOVE" to block modifying the XDP program while the driver is being removed. Unfortunately, such a change is useful only if the ".ndo_bpf()" callback was called outside the rmmod context, because unloading the existing XDP program is also part of the driver removal procedure. In other words, from the rmmod context the driver is expected to unload the XDP program without reporting any errors. Otherwise, a kernel warning with a call stack is printed to dmesg.
Example failing scenario:
1. Load the i40e driver.
2. Load the XDP program.
3. Unload the i40e driver (using "rmmod" command).
The example kernel warning log:
[ +0.004646] WARNING: CPU: 94 PID: 10395 at net/core/dev.c:9290 unregisternetdevicemanynotify+0x7a9/0x870 [...] [ +0.010959] RIP: 0010:unregisternetdevicemanynotify+0x7a9/0x870 [...] [ +0.002726] Call Trace: [ +0.002457] <TASK> [ +0.002119] ? _warn+0x80/0x120 [ +0.003245] ? unregisternetdevicemanynotify+0x7a9/0x870 [ +0.005586] ? reportbug+0x164/0x190 [ +0.003678] ? handlebug+0x3c/0x80 [ +0.003503] ? excinvalidop+0x17/0x70 [ +0.003846] ? asmexcinvalidop+0x1a/0x20 [ +0.004200] ? unregisternetdevicemanynotify+0x7a9/0x870 [ +0.005579] ? unregisternetdevicemanynotify+0x3cc/0x870 [ +0.005586] unregisternetdevicequeue+0xf7/0x140 [ +0.004806] unregisternetdev+0x1c/0x30 [ +0.003933] i40evsirelease+0x87/0x2f0 [i40e] [ +0.004604] i40eremove+0x1a1/0x420 [i40e] [ +0.004220] pcideviceremove+0x3f/0xb0 [ +0.003943] devicereleasedriverinternal+0x19f/0x200 [ +0.005243] driverdetach+0x48/0x90 [ +0.003586] busremovedriver+0x6d/0xf0 [ +0.003939] pciunregisterdriver+0x2e/0xb0 [ +0.004278] i40eexitmodule+0x10/0x5f0 [i40e] [ +0.004570] _dosysdeletemodule.isra.0+0x197/0x310 [ +0.005153] dosyscall64+0x85/0x170 [ +0.003684] ? syscallexittousermode+0x69/0x220 [ +0.004886] ? dosyscall64+0x95/0x170 [ +0.003851] ? 
excpagefault+0x7e/0x180 [ +0.003932] entrySYSCALL64afterhwframe+0x71/0x79 [ +0.005064] RIP: 0033:0x7f59dc9347cb [ +0.003648] Code: 73 01 c3 48 8b 0d 65 16 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 35 16 0c 00 f7 d8 64 89 01 48 [ +0.018753] RSP: 002b:00007ffffac99048 EFLAGS: 00000206 ORIGRAX: 00000000000000b0 [ +0.007577] RAX: ffffffffffffffda RBX: 0000559b9bb2f6e0 RCX: 00007f59dc9347cb [ +0.007140] RDX: 0000000000000000 RSI: 0000000000000800 RDI: 0000559b9bb2f748 [ +0.007146] RBP: 00007ffffac99070 R08: 1999999999999999 R09: 0000000000000000 [ +0.007133] R10: 00007f59dc9a5ac0 R11: 0000000000000206 R12: 0000000000000000 [ +0.007141] R13: 00007ffffac992d8 R14: 0000559b9bb2f6e0 R15: 0000000000000000 [ +0.007151] </TASK> [ +0.002204] ---[ end trace 0000000000000000 ]---
Fix this by checking if the XDP program is being loaded or unloaded. Then, block only loading a new program while "__I40E_IN_REMOVE" is set. Also, move testing "__I40E_IN_REMOVE" flag to the beginning of XDP_SETUP callback to avoid unnecessary operations and checks.(CVE-2024-41047)
In the Linux kernel, the following vulnerability has been resolved:
cachefiles: cyclic allocation of msg_id to avoid reuse
Reusing the msg_id after a maliciously completed reopen request may cause a read request to remain unprocessed and result in a hang, as shown below:
cachefiles_ondemand_select_req
 cachefiles_ondemand_object_is_close(A)
 cachefiles_ondemand_set_object_reopening(A)
 queue_work(fscache_object_wq, &info->work)
                ondemand_object_worker
                 cachefiles_ondemand_init_object(A)
                  cachefiles_ondemand_send_req(OPEN)
                    // get msg_id 6
                    wait_for_completion(&req_A->done)
cachefiles_ondemand_daemon_read
 // read msg_id 6 req_A
 cachefiles_ondemand_get_fd
 copy_to_user
                                // Malicious completion msg_id 6
                                copen 6,-1
                                cachefiles_ondemand_copen
                                 complete(&req_A->done)
                                 // will not set the object to close
                                 // because ondemand_id && fd is valid.
                // ondemand_object_worker() is done
                // but the object is still reopening.
                                // new open req_B
                                cachefiles_ondemand_init_object(B)
                                 cachefiles_ondemand_send_req(OPEN)
                                 // reuse msg_id 6
process_open_req
 copen 6,A.size
 // The expected failed copen was executed successfully
Expect copen to fail, and when it does, it closes fd, which sets the object to close, and then close triggers reopen again. However, due to msg_id reuse resulting in a successful copen, the anonymous fd is not closed until the daemon exits. Therefore read requests waiting for reopen to complete may trigger hung task.
To avoid this issue, allocate the msg_id cyclically so that a msg_id is not reused within a very short period of time.(CVE-2024-41050)
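Cyclic allocation can be sketched in user space as follows. This is an illustration under stated assumptions, not the cachefiles code: struct id_pool, id_alloc_cyclic() and id_free() are hypothetical, and the ID space is deliberately tiny so that wrap-around is easy to follow.

```c
#include <stdbool.h>
#include <stdint.h>

#define ID_SPACE 8u   /* deliberately tiny so wrap-around is easy to see */

struct id_pool {
    bool in_use[ID_SPACE];
    uint32_t next;    /* where the next search starts */
};

/* Returns a free ID, searching from the last allocation point; -1 if none is free. */
int id_alloc_cyclic(struct id_pool *p)
{
    for (uint32_t i = 0; i < ID_SPACE; i++) {
        uint32_t id = (p->next + i) % ID_SPACE;

        if (!p->in_use[id]) {
            p->in_use[id] = true;
            p->next = (id + 1) % ID_SPACE;   /* a freed ID is not handed out again right away */
            return (int)id;
        }
    }
    return -1;
}

/* Mark an ID as free again. */
void id_free(struct id_pool *p, uint32_t id)
{
    if (id < ID_SPACE)
        p->in_use[id] = false;
}
```

With a lowest-free-ID policy, freeing ID 0 and allocating again would return 0 immediately; with the cyclic policy the next allocation returns 1, so a stale completion for the old ID cannot match the new request.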
In the Linux kernel, the following vulnerability has been resolved:
cachefiles: wait for ondemand_object_worker to finish when dropping object
When queuing ondemand_object_worker() to re-open the object, cachefiles_object is not pinned. The cachefiles_object may be freed when the pending read request is completed intentionally and the related erofs is umounted. If ondemand_object_worker() runs after the object is freed, it causes a use-after-free, as shown below.
process A    process B    process C    process D

cachefiles_ondemand_send_req()
// send a read req X
// wait for its completion

             // close ondemand fd
             cachefiles_ondemand_fd_release()
             // set object as CLOSE

                          cachefiles_ondemand_daemon_read()
                          // set object as REOPENING
                          queue_work(fscache_wq, &info->ondemand_work)

                                       // close /dev/cachefiles
                                       cachefiles_daemon_release
                                       cachefiles_flush_reqs
                                       complete(&req->done)

// read req X is completed
// umount the erofs fs
cachefiles_put_object()
// object will be freed
cachefiles_ondemand_deinit_obj_info()
kmem_cache_free(object)
                          // both info and object are freed
                          ondemand_object_worker()
When dropping an object, it is no longer necessary to reopen the object, so use cancel_work_sync() to cancel or wait for ondemand_object_worker() to finish.(CVE-2024-41051)
In the Linux kernel, the following vulnerability has been resolved:
scsi: ufs: core: Fix ufshcd_abort_one racing issue
When ufshcd_abort_one is racing with the completion ISR, the mq_hctx pointer of the completed request will be set to NULL by the ISR. Return success when the request has been completed by the ISR, because ufshcd_abort_one does not need to do anything.
The racing flow is:
Thread A
ufshcd_err_handler                                step 1
    ...
    ufshcd_abort_one
        ufshcd_try_to_abort_task
            ufshcd_cmd_inflight(true)             step 3
        ufshcd_mcq_req_to_hwq
            blk_mq_unique_tag
                rq->mq_hctx->queue_num            step 5

Thread B
ufs_mtk_mcq_intr(cq complete ISR)                 step 2
    scsi_done
        ...
        __blk_mq_free_request
            rq->mq_hctx = NULL;                   step 4
Below is KE back trace. ufshcdtrytoaborttask: cmd at tag 41 not pending in the device. ufshcdtrytoaborttask: cmd at tag=41 is cleared. Aborting tag 41 / CDB 0x28 succeeded Unable to handle kernel NULL pointer dereference at virtual address 0000000000000194 pc : [0xffffffddd7a79bf8] blkmquniquetag+0x8/0x14 lr : [0xffffffddd6155b84] ufshcdmcqreqtohwq+0x1c/0x40 [ufsmediatekmodise] domemabort+0x58/0x118 el1abort+0x3c/0x5c el1h64synchandler+0x54/0x90 el1h64sync+0x68/0x6c blkmquniquetag+0x8/0x14 ufshcderrhandler+0xae4/0xfa8 [ufsmediatekmodise] processonework+0x208/0x4fc workerthread+0x228/0x438 kthread+0x104/0x1d4 retfrom_fork+0x10/0x20(CVE-2024-41053)
In the Linux kernel, the following vulnerability has been resolved:
scsi: ufs: core: Fix ufshcd_clear_cmd racing issue
When ufshcd_clear_cmd is racing with the completion ISR, the mq_hctx pointer of the completed request will be set to NULL by the ISR. ufshcd_clear_cmd's call to ufshcd_mcq_req_to_hwq then hits a NULL pointer kernel exception (KE). Return success when the request has been completed by the ISR, because the sq does not need cleanup.
The racing flow is:
Thread A
ufshcd_err_handler                                step 1
    ufshcd_try_to_abort_task
        ufshcd_cmd_inflight(true)                 step 3
        ufshcd_clear_cmd
            ...
            ufshcd_mcq_req_to_hwq
            blk_mq_unique_tag
                rq->mq_hctx->queue_num            step 5

Thread B
ufs_mtk_mcq_intr(cq complete ISR)                 step 2
    scsi_done
        ...
        __blk_mq_free_request
            rq->mq_hctx = NULL;                   step 4
Below is KE back trace:
ufshcdtrytoaborttask: cmd pending in the device. tag = 6 Unable to handle kernel NULL pointer dereference at virtual address 0000000000000194 pc : [0xffffffd589679bf8] blkmquniquetag+0x8/0x14 lr : [0xffffffd5862f95b4] ufshcdmcqsqcleanup+0x6c/0x1cc [ufsmediatekmodise] Workqueue: ufsehwq0 ufshcderrhandler [ufsmediatekmodise] Call trace: dumpbacktrace+0xf8/0x148 showstack+0x18/0x24 dumpstacklvl+0x60/0x7c dumpstack+0x18/0x3c mrdumpcommondie+0x24c/0x398 [mrdump] ipanicdie+0x20/0x34 [mrdump] notifydie+0x80/0xd8 die+0x94/0x2b8 _dokernelfault+0x264/0x298 dopagefault+0xa4/0x4b8 dotranslationfault+0x38/0x54 domemabort+0x58/0x118 el1abort+0x3c/0x5c el1h64synchandler+0x54/0x90 el1h64sync+0x68/0x6c blkmquniquetag+0x8/0x14 ufshcdclearcmd+0x34/0x118 [ufsmediatekmodise] ufshcdtrytoaborttask+0x2c8/0x5b4 [ufsmediatekmodise] ufshcderrhandler+0xa7c/0xfa8 [ufsmediatekmodise] processonework+0x208/0x4fc workerthread+0x228/0x438 kthread+0x104/0x1d4 retfromfork+0x10/0x20(CVE-2024-41054)
In the Linux kernel, the following vulnerability has been resolved:
cachefiles: fix slab-use-after-free in fscache_withdraw_volume()
We got the following issue in our fault injection stress test:
================================================================== BUG: KASAN: slab-use-after-free in fscachewithdrawvolume+0x2e1/0x370 Read of size 4 at addr ffff88810680be08 by task ondemand-04-dae/5798
CPU: 0 PID: 5798 Comm: ondemand-04-dae Not tainted 6.8.0-dirty #565 Call Trace: kasancheckrange+0xf6/0x1b0 fscachewithdrawvolume+0x2e1/0x370 cachefileswithdrawvolume+0x31/0x50 cachefileswithdrawcache+0x3ad/0x900 cachefilesputunbindpincount+0x1f6/0x250 cachefilesdaemonrelease+0x13b/0x290 _fput+0x204/0xa00 taskworkrun+0x139/0x230
Allocated by task 5820: _kmalloc+0x1df/0x4b0 fscacheallocvolume+0x70/0x600 _fscacheacquirevolume+0x1c/0x610 erofsfscacheregistervolume+0x96/0x1a0 erofsfscacheregisterfs+0x49a/0x690 erofsfcfillsuper+0x6c0/0xcc0 vfsgetsuper+0xa9/0x140 vfsgettree+0x8e/0x300 donew_mount+0x28c/0x580 [...]
Freed by task 5820: kfree+0xf1/0x2c0 fscacheputvolume.part.0+0x5cb/0x9e0 erofsfscacheunregisterfs+0x157/0x1b0 erofskillsb+0xd9/0x1c0 deactivatelockedsuper+0xa3/0x100 vfsgetsuper+0x105/0x140 vfsgettree+0x8e/0x300 donew_mount+0x28c/0x580
Following is the process that triggers the issue:
deactivate_locked_super          cachefiles_daemon_release
 erofs_kill_sb
  erofs_fscache_unregister_fs
   fscache_relinquish_volume
    __fscache_relinquish_volume
     fscache_put_volume(fscache_volume, fscache_volume_put_relinquish)
      zero = __refcount_dec_and_test(&fscache_volume->ref, &ref);
                                 cachefiles_put_unbind_pincount
                                  cachefiles_daemon_unbind
                                   cachefiles_withdraw_cache
                                    cachefiles_withdraw_volumes
                                     list_del_init(&volume->cache_link)
      fscache_free_volume(fscache_volume)
       cache->ops->free_volume
        cachefiles_free_volume
         list_del_init(&cachefiles_volume->cache_link);
       kfree(fscache_volume)
                                     cachefiles_withdraw_volume
                                      fscache_withdraw_volume
                                       fscache_volume->n_accesses
                                       // fscache_volume UAF !!!
The fscache_volume in cache->volumes must not have been freed yet, but its reference count may be 0. So use the new fscache_try_get_volume() helper function to try to get a reference to it.
If the reference count of fscache_volume is 0, fscache_put_volume() is freeing it, so wait for it to be removed from cache->volumes.
If its reference count is not 0, call cachefiles_withdraw_volume() with reference count protection to avoid the above issue.(CVE-2024-41058)
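The "take a reference only if the count is still non-zero" pattern can be sketched with C11 atomics. This is a user-space illustration, not the fscache helper: ref_try_get() is a hypothetical name and an atomic_int stands in for the kernel's refcount type.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Take a reference only if the count is still non-zero. */
bool ref_try_get(atomic_int *ref)
{
    int old = atomic_load(ref);

    while (old != 0) {
        /* On failure, old is updated to the current value and the loop retries. */
        if (atomic_compare_exchange_weak(ref, &old, old + 1))
            return true;
    }
    return false;   /* count already hit zero: the object is being freed */
}
```

A caller that gets false must not touch the object and, as described above, would instead wait for it to be removed from the list.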
In the Linux kernel, the following vulnerability has been resolved:
hfsplus: fix uninit-value in copy_name
[syzbot reported] BUG: KMSAN: uninit-value in sizedstrscpy+0xc4/0x160 sizedstrscpy+0xc4/0x160 copyname+0x2af/0x320 fs/hfsplus/xattr.c:411 hfspluslistxattr+0x11e9/0x1a50 fs/hfsplus/xattr.c:750 vfslistxattr fs/xattr.c:493 [inline] listxattr+0x1f3/0x6b0 fs/xattr.c:840 pathlistxattr fs/xattr.c:864 [inline] _dosyslistxattr fs/xattr.c:876 [inline] _sesyslistxattr fs/xattr.c:873 [inline] _x64syslistxattr+0x16b/0x2f0 fs/xattr.c:873 x64syscall+0x2ba0/0x3b50 arch/x86/include/generated/asm/syscalls64.h:195 dosyscallx64 arch/x86/entry/common.c:52 [inline] dosyscall64+0xcf/0x1e0 arch/x86/entry/common.c:83 entrySYSCALL64afterhwframe+0x77/0x7f
Uninit was created at: slabpostallochook mm/slub.c:3877 [inline] slaballocnode mm/slub.c:3918 [inline] kmalloctrace+0x57b/0xbe0 mm/slub.c:4065 kmalloc include/linux/slab.h:628 [inline] hfspluslistxattr+0x4cc/0x1a50 fs/hfsplus/xattr.c:699 vfslistxattr fs/xattr.c:493 [inline] listxattr+0x1f3/0x6b0 fs/xattr.c:840 pathlistxattr fs/xattr.c:864 [inline] _dosyslistxattr fs/xattr.c:876 [inline] _sesyslistxattr fs/xattr.c:873 [inline] _x64syslistxattr+0x16b/0x2f0 fs/xattr.c:873 x64syscall+0x2ba0/0x3b50 arch/x86/include/generated/asm/syscalls64.h:195 dosyscallx64 arch/x86/entry/common.c:52 [inline] dosyscall64+0xcf/0x1e0 arch/x86/entry/common.c:83 entrySYSCALL64after_hwframe+0x77/0x7f [Fix] When allocating memory to strbuf, initialize memory to 0.(CVE-2024-41059)
In the Linux kernel, the following vulnerability has been resolved:
drm/radeon: check bo_va->bo is non-NULL before using it
The call to radeon_vm_clear_freed might clear bo_va->bo, so we have to check it before dereferencing it.(CVE-2024-41060)
In the Linux kernel, the following vulnerability has been resolved:
ibmvnic: Add tx check to prevent skb leak
Below is a summary of how the driver stores a reference to an skb during transmit:
    tx_buff[free_map[consumer_index]]->skb = new_skb;
    free_map[consumer_index] = IBMVNIC_INVALID_MAP;
    consumer_index ++;
Where variable data looks like this:
    free_map == [4, IBMVNIC_INVALID_MAP, IBMVNIC_INVALID_MAP, 0, 3]
                                                consumer_index^
    tx_buff == [skb=null, skb=<ptr>, skb=<ptr>, skb=null, skb=null]
The driver has checks to ensure that free_map[consumer_index] pointed to a valid index but there was no check to ensure that this index pointed to an unused/null skb address. So, if, by some chance, our free_map and tx_buff lists become out of sync then we were previously risking an skb memory leak. This could then cause tcp congestion control to stop sending packets, eventually leading to ETIMEDOUT.
Therefore, add a conditional to ensure that the skb address is null. If not then warn the user (because this is still a bug that should be patched) and free the old pointer to prevent memleak/tcp problems.(CVE-2024-41066)
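The added conditional can be sketched in user space as follows. This is an illustration only: struct tx_slot and tx_slot_store() are hypothetical, and malloc()/free() stand in for skb allocation and release.

```c
#include <stdio.h>
#include <stdlib.h>

struct tx_slot {
    void *skb;   /* NULL when the slot is unused */
};

/* Store new_skb in the slot; returns 1 if a stale buffer had to be freed first. */
int tx_slot_store(struct tx_slot *slot, void *new_skb)
{
    int stale = 0;

    if (slot->skb != NULL) {
        /* Still a bug worth reporting, but do not leak the old buffer. */
        fprintf(stderr, "tx slot still holds a buffer; freeing it\n");
        free(slot->skb);
        stale = 1;
    }
    slot->skb = new_skb;
    return stale;
}
```

Without the check, the second store would silently overwrite the only pointer to the first buffer.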
In the Linux kernel, the following vulnerability has been resolved:
s390/sclp: Fix sclp_init() cleanup on failure
If sclp_init() fails it only partially cleans up: if there are multiple failing calls to sclp_init(), sclp_state_change_event will be added several times to sclp_reg_list, which results in the following warning:
------------[ cut here ]------------ listadd double add: new=000003ffe1598c10, prev=000003ffe1598bf0, next=000003ffe1598c10. WARNING: CPU: 0 PID: 1 at lib/listdebug.c:35 listaddvalidorreport+0xde/0xf8 CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.10.0-rc3 Krnl PSW : 0404c00180000000 000003ffe0d6076a (listaddvalidorreport+0xe2/0xf8) R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3 ... Call Trace: [<000003ffe0d6076a>] _listaddvalidorreport+0xe2/0xf8 ([<000003ffe0d60766>] _listaddvalidorreport+0xde/0xf8) [<000003ffe0a8d37e>] sclpinit+0x40e/0x450 [<000003ffe00009f2>] dooneinitcall+0x42/0x1e0 [<000003ffe15b77a6>] doinitcalls+0x126/0x150 [<000003ffe15b7a0a>] kernelinitfreeable+0x1ba/0x1f8 [<000003ffe0d6650e>] kernelinit+0x2e/0x180 [<000003ffe000301c>] _retfromfork+0x3c/0x60 [<000003ffe0d759ca>] retfromfork+0xa/0x30
Fix this by removing sclp_state_change_event from sclp_reg_list when sclp_init() fails.(CVE-2024-41068)
In the Linux kernel, the following vulnerability has been resolved:
cxl/region: Avoid null pointer dereference in region lookup
cxl_dpa_to_region() looks up a region based on a memdev and DPA. It wrongly assumes that an endpoint found mapping the DPA also belongs to a fully assembled region. When that is not true, looking up the region name leads to a null pointer dereference.
This appears during testing of region lookup after a failure to assemble a BIOS defined region or if the lookup raced with the assembly of the BIOS defined region.
Failure to clean up BIOS defined regions that fail assembly is an issue in itself and a fix to that problem will alleviate some of the impact. It will not alleviate the race condition so let's harden this path.
The behavior change is that the kernel oops due to a null pointer dereference is replaced with a dev_dbg() message noting that an endpoint was mapped.
Additional comments are added so that future users of this function can more clearly understand what it provides.(CVE-2024-41084)
In the Linux kernel, the following vulnerability has been resolved:
ata: libata-core: Fix double free on error
If e.g. the ata_port_alloc() call in ata_host_alloc() fails, we will jump to the err_out label, which will call devres_release_group(). devres_release_group() will trigger a call to ata_host_release(). ata_host_release() calls kfree(host), so executing the kfree(host) in ata_host_alloc() will lead to a double free:
kernel BUG at mm/slub.c:553! Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI CPU: 11 PID: 599 Comm: (udev-worker) Not tainted 6.10.0-rc5 #47 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-2.fc40 04/01/2014 RIP: 0010:kfree+0x2cf/0x2f0 Code: 5d 41 5e 41 5f 5d e9 80 d6 ff ff 4d 89 f1 41 b8 01 00 00 00 48 89 d9 48 89 da RSP: 0018:ffffc90000f377f0 EFLAGS: 00010246 RAX: ffff888112b1f2c0 RBX: ffff888112b1f2c0 RCX: ffff888112b1f320 RDX: 000000000000400b RSI: ffffffffc02c9de5 RDI: ffff888112b1f2c0 RBP: ffffc90000f37830 R08: 0000000000000000 R09: 0000000000000000 R10: ffffc90000f37610 R11: 617461203a736b6e R12: ffffea00044ac780 R13: ffff888100046400 R14: ffffffffc02c9de5 R15: 0000000000000006 FS: 00007f2f1cabe980(0000) GS:ffff88813b380000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f2f1c3acf75 CR3: 0000000111724000 CR4: 0000000000750ef0 PKRU: 55555554 Call Trace: <TASK> ? _diebody.cold+0x19/0x27 ? die+0x2e/0x50 ? dotrap+0xca/0x110 ? doerrortrap+0x6a/0x90 ? kfree+0x2cf/0x2f0 ? excinvalidop+0x50/0x70 ? kfree+0x2cf/0x2f0 ? asmexcinvalidop+0x1a/0x20 ? atahostalloc+0xf5/0x120 [libata] ? atahostalloc+0xf5/0x120 [libata] ? kfree+0x2cf/0x2f0 atahostalloc+0xf5/0x120 [libata] atahostallocpinfo+0x14/0xa0 [libata] ahciinit_one+0x6c9/0xd20 [ahci]
Ensure that we will not call kfree(host) twice, by performing the kfree() only if the devres_open_group() call failed.(CVE-2024-41087)
In the Linux kernel, the following vulnerability has been resolved:
can: mcp251xfd: fix infinite loop when xmit fails
When the mcp251xfd_start_xmit() function fails, the driver stops processing messages, and the interrupt routine does not return, running indefinitely even after killing the running application.
Error messages:
[ 441.298819] mcp251xfd spi2.0 can0: ERROR in mcp251xfd_start_xmit: -16
[ 441.306498] mcp251xfd spi2.0 can0: Transmit Event FIFO buffer not empty. (seq=0x000017c7, tef_tail=0x000017cf, tef_head=0x000017d0, tx_head=0x000017d3).
... and repeat forever.
The issue can be triggered when multiple devices share the same SPI interface and there is concurrent access to the bus.
The problem occurs because tx_ring->head increments even if mcp251xfd_start_xmit() fails. Consequently, the driver skips one TX package while still expecting a response in mcp251xfd_handle_tefif_one().
Resolve the issue by starting a workqueue to write the tx obj synchronously if err = -EBUSY. In case of another error, decrement tx_ring->head, remove skb from the echo stack, and drop the message.
In the Linux kernel, the following vulnerability has been resolved:
drm/i915/gt: Fix potential UAF by revoke of fence registers
CI has been sporadically reporting the following issue triggered by igt@i915_selftest@live@hangcheck on ADL-P and similar machines:
<6> [414.049203] i915: Running intelhangcheckliveselftests/igtresetevictfence ... <6> [414.068804] i915 0000:00:02.0: [drm] GT0: GUC: submission enabled <6> [414.068812] i915 0000:00:02.0: [drm] GT0: GUC: SLPC enabled <3> [414.070354] Unable to pin Y-tiled fence; err:-4 <3> [414.071282] i915vmarevokefence:301 GEMBUGON(!i915activeisidle(&fence->active)) ... <4>[ 609.603992] ------------[ cut here ]------------ <2>[ 609.603995] kernel BUG at drivers/gpu/drm/i915/gt/intelggttfencing.c:301! <4>[ 609.604003] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI <4>[ 609.604006] CPU: 0 PID: 268 Comm: kworker/u64:3 Tainted: G U W 6.9.0-CIDRM14785-g1ba62f8cea9c+ #1 <4>[ 609.604008] Hardware name: Intel Corporation Alder Lake Client Platform/AlderLake-P DDR4 RVP, BIOS RPLPFWI1.R00.4035.A00.2301200723 01/20/2023 <4>[ 609.604010] Workqueue: i915 _i915gemfreework [i915] <4>[ 609.604149] RIP: 0010:i915vmarevokefence+0x187/0x1f0 [i915] ... <4>[ 609.604271] Call Trace: <4>[ 609.604273] <TASK> ... <4>[ 609.604716] _i915vmaevict+0x2e9/0x550 [i915] <4>[ 609.604852] _i915vmaunbind+0x7c/0x160 [i915] <4>[ 609.604977] forceunbind+0x24/0xa0 [i915] <4>[ 609.605098] i915vmadestroy+0x2f/0xa0 [i915] <4>[ 609.605210] _i915gemobjectpagesfini+0x51/0x2f0 [i915] <4>[ 609.605330] _i915gemfreeobjects.isra.0+0x6a/0xc0 [i915] <4>[ 609.605440] processscheduled_works+0x351/0x690 ...
In the past, there were similar failures reported by CI from other IGT tests, observed on other platforms.
Before commit 63baf4f3d587 ("drm/i915/gt: Only wait for GPU activity before unbinding a GGTT fence"), i915_vma_revoke_fence() was waiting for idleness of vma->active via fence_update(). That commit introduced vma->fence->active in order for the fence_update() to be able to wait selectively on that one instead of vma->active since only idleness of fence registers was needed. But then, another commit 0d86ee35097a ("drm/i915/gt: Make fence revocation unequivocal") replaced the call to fence_update() in i915_vma_revoke_fence() with only fence_write(), and also added that GEM_BUG_ON(!i915_active_is_idle(&fence->active)) in front. No justification was provided on why we might then expect idleness of vma->fence->active without first waiting on it.
The issue can be potentially caused by a race among revocation of fence registers on one side and sequential execution of signal callbacks invoked on completion of a request that was using them on the other, still processed in parallel to revocation of those fence registers. Fix it by waiting for idleness of vma->fence->active in i915_vma_revoke_fence().
(cherry picked from commit 24bb052d3dd499c5956abad5f7d8e4fd07da7fb1)(CVE-2024-41092)
In the Linux kernel, the following vulnerability has been resolved:
drm/amdgpu: avoid using null object of framebuffer
Instead of using state->fb->obj[0] directly, get the object from the framebuffer by calling drm_gem_fb_get_obj() and return an error code when the object is null, to avoid using a null framebuffer object.(CVE-2024-41093)
In the Linux kernel, the following vulnerability has been resolved:
drm/fbdev-dma: Only set smem_start is enable per module option
Only export struct fbinfo.fix.smemstart if that is required by the user and the memory does not come from vmalloc().
Setting struct fbinfo.fix.smemstart breaks systems where DMA memory is backed by vmalloc address space. An example error is shown below.
[ 3.536043] ------------[ cut here ]------------ [ 3.540716] virttophys used for non-linear address: 000000007fc4f540 (0xffff800086001000) [ 3.552628] WARNING: CPU: 4 PID: 61 at arch/arm64/mm/physaddr.c:12 _virttophys+0x68/0x98 [ 3.565455] Modules linked in: [ 3.568525] CPU: 4 PID: 61 Comm: kworker/u12:5 Not tainted 6.6.23-06226-g4986cc3e1b75-dirty #250 [ 3.577310] Hardware name: NXP i.MX95 19X19 board (DT) [ 3.582452] Workqueue: eventsunbound deferredprobeworkfunc [ 3.588291] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 3.595233] pc : _virttophys+0x68/0x98 [ 3.599246] lr : _virttophys+0x68/0x98 [ 3.603276] sp : ffff800083603990 [ 3.677939] Call trace: [ 3.680393] _virttophys+0x68/0x98 [ 3.684067] drmfbdevdmahelperfbprobe+0x138/0x238 [ 3.689214] _drmfbhelperinitialconfigandunlock+0x2b0/0x4c0 [ 3.695385] drmfbhelperinitialconfig+0x4c/0x68 [ 3.700264] drmfbdevdmaclienthotplug+0x8c/0xe0 [ 3.705161] drmclientregister+0x60/0xb0 [ 3.709269] drmfbdevdma_setup+0x94/0x148
Additionally, DMA memory is assumed to be contiguous in physical address space, which is not guaranteed by vmalloc().
Resolve this by checking the module flag drmleakfbdevsmem when DRM allocates the instance of struct fbinfo. Fbdev-dma then sets smemstart only if required (via FBINFOHIDESMEMSTART). Also guarantee that the framebuffer is not located in vmalloc address space.(CVE-2024-41094)
In the Linux kernel, the following vulnerability has been resolved:
bpf: Mark bpf prog stack with kmsanunposionmemory in interpreter mode
syzbot reported uninit memory usages during map{lookup,delete}elem.
========== BUG: KMSAN: uninit-value in devmaplookupelem kernel/bpf/devmap.c:441 [inline] BUG: KMSAN: uninit-value in devmaplookupelem+0xf3/0x170 kernel/bpf/devmap.c:796 _devmaplookupelem kernel/bpf/devmap.c:441 [inline] devmaplookupelem+0xf3/0x170 kernel/bpf/devmap.c:796 _bpfmaplookupelem kernel/bpf/helpers.c:42 [inline] bpfmaplookupelem+0x5c/0x80 kernel/bpf/helpers.c:38 _bpfprogrun+0x13fe/0xe0f0 kernel/bpf/core.c:1997
The reproducer should be in the interpreter mode.
The C reproducer is trying to run the following bpf prog:
0: (18) r0 = 0x0
2: (18) r1 = map[id:49]
4: (b7) r8 = 16777216
5: (7b) *(u64 *)(r10 -8) = r8
6: (bf) r2 = r10
7: (07) r2 += -229
^^^^^^^^^^
8: (b7) r3 = 8
9: (b7) r4 = 0
10: (85) call devmaplookup_elem#1543472
11: (95) exit
It is due to the "void *key" (r2) passed to the helper. bpf allows uninit stack memory access for bpf prog with the right privileges. This patch uses kmsanunpoisonmemory() to mark the stack as initialized.
This should address different syzbot reports on the uninit "void *key" argument during map{lookup,delete}elem.(CVE-2024-42063)
In the Linux kernel, the following vulnerability has been resolved:
net: mana: Fix possible double free in error handling path
When auxiliarydeviceadd() returns error and then calls auxiliarydeviceuninit(), callback function adev_release calls kfree(madev). We shouldn't call kfree(madev) again in the error handling path. Set 'madev' to NULL.(CVE-2024-42069)
In the Linux kernel, the following vulnerability has been resolved:
netfilter: nftables: fully validate NFTDATA_VALUE on store to data registers
register store validation for NFTDATAVALUE is conditional, however, the datatype is always either NFTDATAVALUE or NFTDATAVERDICT. This only requires a new helper function to infer the register type from the set datatype so this conditional check can be removed. Otherwise, pointer to chain object can be leaked through the registers.(CVE-2024-42070)
In the Linux kernel, the following vulnerability has been resolved:
mlxsw: spectrum_buffers: Fix memory corruptions on Spectrum-4 systems
The following two shared buffer operations make use of the Shared Buffer Status Register (SBSR):
# devlink sb occupancy snapshot pci/0000:01:00.0
# devlink sb occupancy clearmax pci/0000:01:00.0
The register has two masks of 256 bits to denote the ingress / egress ports on which the register should operate. Spectrum-4 has more than 256 ports, so the register was extended by the cited commit with a new 'port_page' field.
However, when filling the register's payload, the driver specifies the ports as absolute numbers and not relative to the first port of the port page, resulting in memory corruptions [1].
Fix by specifying the ports relative to the first port of the port page.
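The relative addressing described above can be illustrated with a small user-space sketch. It is not the driver code: the struct layout, the function name req_select_port() and the bit ordering inside the mask are assumptions made for illustration. Only the 256-bit mask width and the existence of a port page come from the text.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PORTS_PER_PAGE 256u            /* mask width stated in the text */

/* Hypothetical payload: one port page plus a 256-bit ingress mask. */
struct sb_status_req {
    uint32_t port_page;
    uint8_t  ingress_mask[PORTS_PER_PAGE / 8];
};

/*
 * Select an absolute port number in the request.
 * The bit index is computed relative to the first port of the page,
 * so it can never run past the end of the 256-bit mask.
 * Returns 0 on success, -1 if the port is not on the selected page.
 */
static int req_select_port(struct sb_status_req *req, uint32_t abs_port)
{
    uint32_t first_port = req->port_page * PORTS_PER_PAGE;
    uint32_t rel;

    if (abs_port < first_port || abs_port - first_port >= PORTS_PER_PAGE)
        return -1;

    rel = abs_port - first_port;
    assert(rel < PORTS_PER_PAGE);      /* invariant after the check above */
    req->ingress_mask[rel / 8] |= (uint8_t)(1u << (rel % 8));
    return 0;
}
```

With port_page set to 1, absolute port 258 maps to relative bit 2. Using 258 directly as the bit index would address byte 32 of a 32-byte mask, which is the kind of out-of-bounds write the fix avoids.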
[1] BUG: KASAN: slab-use-after-free in mlxswspsboccsnapshot+0xb6d/0xbc0 Read of size 1 at addr ffff8881068cb00f by task devlink/1566 [...] Call Trace: <TASK> dumpstacklvl+0xc6/0x120 printreport+0xce/0x670 kasanreport+0xd7/0x110 mlxswspsboccsnapshot+0xb6d/0xbc0 mlxswdevlinksboccsnapshot+0x75/0xb0 devlinknlsboccsnapshotdoit+0x1f9/0x2a0 genlfamilyrcvmsgdoit+0x20c/0x300 genlrcvmsg+0x567/0x800 netlinkrcvskb+0x170/0x450 genlrcv+0x2d/0x40 netlinkunicast+0x547/0x830 netlinksendmsg+0x8d4/0xdb0 _syssendto+0x49b/0x510 _x64syssendto+0xe5/0x1c0 dosyscall64+0xc1/0x1d0 entrySYSCALL64afterhwframe+0x77/0x7f [...] Allocated by task 1: kasansavestack+0x33/0x60 kasansavetrack+0x14/0x30 _kasankmalloc+0x8f/0xa0 copyverifierstate+0xbc2/0xfb0 docheckcommon+0x2c51/0xc7e0 bpfcheck+0x5107/0x9960 bpfprogload+0xf0e/0x2690 _sysbpf+0x1a61/0x49d0 _x64sysbpf+0x7d/0xc0 dosyscall64+0xc1/0x1d0 entrySYSCALL64after_hwframe+0x77/0x7f
Freed by task 1: kasansavestack+0x33/0x60 kasansavetrack+0x14/0x30 kasansavefreeinfo+0x3b/0x60 poisonslabobject+0x109/0x170 _kasanslabfree+0x14/0x30 kfree+0xca/0x2b0 freeverifierstate+0xce/0x270 docheckcommon+0x4828/0xc7e0 bpfcheck+0x5107/0x9960 bpfprogload+0xf0e/0x2690 _sysbpf+0x1a61/0x49d0 _x64sysbpf+0x7d/0xc0 dosyscall64+0xc1/0x1d0 entrySYSCALL64afterhwframe+0x77/0x7f(CVE-2024-42073)
In the Linux kernel, the following vulnerability has been resolved:
ASoC: amd: acp: add a null check for chip_pdev structure
When acp platform device creation is skipped, chip->chippdev value will remain NULL. Add NULL check for chip->chippdev structure in sndacpresume() function to avoid null pointer dereference.(CVE-2024-42074)
In the Linux kernel, the following vulnerability has been resolved:
gfs2: Fix NULL pointer dereference in gfs2logflush
In gfs2jindexfree(), set sdp->sdjdesc to NULL under the log flush lock to provide exclusion against gfs2log_flush().
In gfs2logflush(), check if sdp->sdjdesc is non-NULL before dereferencing it. Otherwise, we could run into a NULL pointer dereference when outstanding glock work races with an unmount (glockworkfunc -> runqueue -> doxmote -> inodegosync -> gfs2log_flush).(CVE-2024-42079)
In the Linux kernel, the following vulnerability has been resolved:
usb: dwc3: core: remove lock of otg mode during gadget suspend/resume to avoid deadlock
When CONFIGUSBDWC3DUALROLE is selected and the system is triggered to enter suspend with the command below:
echo mem > /sys/power/state
a deadlock occurs. The detailed invoking path is as follows:
dwc3suspendcommon()
    spinlockirqsave(&dwc->lock, flags); <-- 1st
    dwc3gadgetsuspend(dwc);
        dwc3gadgetsoftdisconnect(dwc);
            spinlockirqsave(&dwc->lock, flags); <-- 2nd
This issue is exposed by commit c7ebd8149ee5 ("usb: dwc3: gadget: Fix NULL pointer dereference in dwc3gadgetsuspend"), which removed the check of whether dwc->gadgetdriver is NULL. As a result, the code above is executed and a deadlock occurs when trying to take the spinlock a second time. In fact, the root cause is commit 5265397f9442 ("usb: dwc3: Remove DWC3 locking during gadget suspend/resume"), which forgot to remove the lock of otg mode. So, remove the redundant lock of otg mode during gadget suspend/resume.(CVE-2024-42085)
In the Linux kernel, the following vulnerability has been resolved:
clk: sunxi-ng: common: Don't call hwtoccu_common on hw without common
In order to set the rate range of a hw sunxiccuprobe calls hwtoccucommon() assuming all entries in desc->ccuclks are contained in a ccu_common struct. This assumption is incorrect and, in consequence, causes invalid pointer de-references.
Remove the faulty call. Instead, add one more loop that iterates over the ccu_clks and sets the rate range, if required.(CVE-2024-42100)
In the Linux kernel, the following vulnerability has been resolved:
btrfs: fix adding block group to a reclaim list and the unused list during reclaim
There is a potential parallel list adding for retrying in btrfsreclaimbgs_work and adding to the unused list. Since the block group is removed from the reclaim list and it is on a relocation work, it can be added into the unused list in parallel. When that happens, adding it to the reclaim list will corrupt the list head and trigger list corruption like below.
Fix it by taking fsinfo->unusedbgs_lock.
[177.504][T2585409] BTRFS error (device nullb1): error relocating chunk 2415919104
[177.514][T2585409] listdel corruption. next->prev should be ff11000344b119c0, but was ff11000377e87c70. (next=ff110002390cd9c0)
[177.529][T2585409] ------------[ cut here ]------------
[177.537][T2585409] kernel BUG at lib/listdebug.c:65!
[177.545][T2585409] Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN NOPTI
[177.555][T2585409] CPU: 9 PID: 2585409 Comm: kworker/u128:2 Tainted: G W 6.10.0-rc5-kts #1
[177.568][T2585409] Hardware name: Supermicro SYS-520P-WTR/X12SPW-TF, BIOS 1.2 02/14/2022
[177.579][T2585409] Workqueue: eventsunbound btrfsreclaimbgswork[btrfs]
[177.589][T2585409] RIP: 0010:_listdelentryvalidorreport.cold+0x70/0x72
[177.624][T2585409] RSP: 0018:ff11000377e87a70 EFLAGS: 00010286
[177.633][T2585409] RAX: 000000000000006d RBX: ff11000344b119c0 RCX:0000000000000000
[177.644][T2585409] RDX: 000000000000006d RSI: 0000000000000008 RDI:ffe21c006efd0f40
[177.655][T2585409] RBP: ff110002e0509f78 R08: 0000000000000001 R09:ffe21c006efd0f08
[177.665][T2585409] R10: ff11000377e87847 R11: 0000000000000000 R12:ff110002390cd9c0
[177.676][T2585409] R13: ff11000344b119c0 R14: ff110002e0508000 R15:dffffc0000000000
[177.687][T2585409] FS: 0000000000000000(0000) GS:ff11000fec880000(0000) knlGS:0000000000000000
[177.700][T2585409] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[177.709][T2585409] CR2: 00007f06bc7b1978 CR3: 0000001021e86005 CR4:0000000000771ef0
[177.720][T2585409] DR0: 0000000000000000 DR1: 0000000000000000 DR2:0000000000000000
[177.731][T2585409] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:0000000000000400
[177.742][T2585409] PKRU: 55555554
[177.748][T2585409] Call Trace:
[177.753][T2585409] <TASK>
[177.759][T2585409] ? _diebody.cold+0x19/0x27
[177.766][T2585409] ? die+0x2e/0x50
[177.772][T2585409] ? dotrap+0x1ea/0x2d0
[177.779][T2585409] ? _listdelentryvalidorreport.cold+0x70/0x72
[177.788][T2585409] ? doerrortrap+0xa3/0x160
[177.795][T2585409] ? _listdelentryvalidorreport.cold+0x70/0x72
[177.805][T2585409] ? handleinvalidop+0x2c/0x40
[177.812][T2585409] ? _listdelentryvalidorreport.cold+0x70/0x72
[177.820][T2585409] ? excinvalidop+0x2d/0x40
[177.827][T2585409] ? asmexcinvalidop+0x1a/0x20
[177.834][T2585409] ? _listdelentryvalidorreport.cold+0x70/0x72
[177.843][T2585409] btrfsdeleteunused_bgs+0x3d9/0x14c0 [btrfs]
There is a similar retrylist code in btrfsdeleteunusedbgs(), but it is safe, AFAICS. Since the block group was in the unused list, the used bytes should be 0 when it was added to the unused list. Then, it checks blockgroup->{used,reserved,pinned} are still 0 under the blockgroup->lock. So, they should be still eligible for the unused list, not the reclaim list.
The reason it is safe there is that we're holding spaceinfo->groupssem in write mode.
That means no other task can allocate from the block group, so while we are at deletedunusedbgs() it's not possible for other tasks to allocate and deallocate extents from the block group, so it can't be added to the unused list or the reclaim list by anyone else.
The bug can be reproduced by btrfs/166 after a few rounds. In practice this can be hit when relocation cannot find more chunk space and ends with ENOSPC.(CVE-2024-42103)
In the Linux kernel, the following vulnerability has been resolved:
netfilter: nf_tables: unconditionally flush pending work before notifier
syzbot reports:
KASAN: slab-uaf in nftctxupdate include/net/netfilter/nftables.h:1831
KASAN: slab-uaf in nftcommitrelease net/netfilter/nftablesapi.c:9530
KASAN: slab-uaf in nftablestransdestroywork+0x152b/0x1750 net/netfilter/nftablesapi.c:9597
Read of size 2 at addr ffff88802b0051c4 by task kworker/1:1/45
[..]
Workqueue: events nftablestransdestroywork
Call Trace:
nftctxupdate include/net/netfilter/nftables.h:1831 [inline]
nftcommitrelease net/netfilter/nftablesapi.c:9530 [inline]
nftablestransdestroywork+0x152b/0x1750 net/netfilter/nftablesapi.c:9597
The problem is that the notifier does a conditional flush, but it's possible that the table-to-be-removed is still referenced by transactions being processed by the worker, so we need to flush unconditionally.
We could make the flush_work depend on whether we found a table to delete in nf-next to avoid the flush for most cases.
AFAICS this problem is only exposed in nf-next, with commit e169285f8c56 ("netfilter: nftables: do not store nftctx in transaction objects"). With this commit applied there is an unconditional fetch of table->family, which is what's triggering the above splat.(CVE-2024-42109)
In the Linux kernel, the following vulnerability has been resolved:
net: txgbe: initialize numqvectors for MSI/INTx interrupts
When using MSI/INTx interrupts, wx->numqvectors is uninitialized, so allocating queue vectors in wxallocq_vectors() causes a kernel panic.(CVE-2024-42113)
In the Linux kernel, the following vulnerability has been resolved:
drm/amd/display: Check pipe offset before setting vblank
pipectx has a size of MAXPIPES, so check the index before accessing the array.
This fixes an OVERRUN issue reported by Coverity.(CVE-2024-42120)
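As a minimal sketch of this kind of guard (not the amdgpu code itself), the index can be validated against the array size before it is used. The struct names, the set_vblank() helper and the value chosen for MAX_PIPES are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PIPES 6   /* illustrative value only; the driver defines its own */

/* Hypothetical, stripped-down stand-ins for the driver structures. */
struct pipe_ctx { int vblank_enabled; };
struct resource_ctx { struct pipe_ctx pipe_ctx[MAX_PIPES]; };

/*
 * Enable or disable vblank on one pipe.
 * The offset is range-checked first, so pipe_ctx[] is never indexed
 * out of bounds. Returns 0 on success, -1 on an invalid offset.
 */
static int set_vblank(struct resource_ctx *res, int pipe_offset, int enable)
{
    assert(res != NULL);

    if (pipe_offset < 0 || pipe_offset >= MAX_PIPES)
        return -1;

    res->pipe_ctx[pipe_offset].vblank_enabled = enable;
    return 0;
}
```

Rejecting the offset up front keeps the error visible to the caller instead of silently writing past the end of the array.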
In the Linux kernel, the following vulnerability has been resolved:
drm/amd/display: Check index msg_id before read or write
[WHAT] msgid is used as an array index and it cannot be a negative value, and therefore cannot be equal to MODHDCPMESSAGEID_INVALID (-1).
[HOW] Check whether msg_id is valid before reading and setting.
This fixes 4 OVERRUN issues reported by Coverity.(CVE-2024-42121)
In the Linux kernel, the following vulnerability has been resolved:
nfc/nci: Add the inconsistency check between the input data length and count
write$nci(r0, &(0x7f0000000740)=ANY=[@ANYBLOB="610501"], 0xf)
Syzbot constructed a write() call with a data length of 3 bytes but a count value of 15, which passed too little data to meet the basic requirements of the function ncirfintfactivatedntf_packet().
Therefore, add a comparison between the data length and the count value to avoid problems caused by an inconsistent data length and count.(CVE-2024-42130)
In the Linux kernel, the following vulnerability has been resolved:
bluetooth/hci: disallow setting handle bigger than HCICONNHANDLE_MAX
Syzbot hit a warning in hciconndel() caused by freeing a handle that was not allocated using the ida allocator.
This is caused by a handle bigger than HCICONNHANDLEMAX being passed by hcilebigsyncestablishedevt(), which makes the code think the connection is unset.
Add the same check for the handle upper bound as in hciconnset_handle() to prevent the warning.(CVE-2024-42132)
In the Linux kernel, the following vulnerability has been resolved:
Bluetooth: Ignore too large handle values in BIG
hcilebigsyncestablishedevt needs to filter out cases where the handle value falls within the ida id range; otherwise the ida will be erroneously released in hciconn_cleanup.(CVE-2024-42133)
In the Linux kernel, the following vulnerability has been resolved:
vhost_task: Handle SIGKILL by flushing work and exiting
Instead of lingering until the device is closed, this has us handle SIGKILL by:
In the Linux kernel, the following vulnerability has been resolved:
cdrom: rearrange lastmediachange check to avoid unintentional overflow
When running syzkaller with the newly reintroduced signed integer wrap sanitizer we encounter this splat:
[ 366.015950] UBSAN: signed-integer-overflow in ../drivers/cdrom/cdrom.c:2361:33 [ 366.021089] -9223372036854775808 - 346321 cannot be represented in type '_s64' (aka 'long long') [ 366.025894] program syz-executor.4 is using a deprecated SCSI ioctl, please convert it to SGIO [ 366.027502] CPU: 5 PID: 28472 Comm: syz-executor.7 Not tainted 6.8.0-rc2-00035-gb3ef86b5a957 #1 [ 366.027512] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 366.027518] Call Trace: [ 366.027523] <TASK> [ 366.027533] dumpstacklvl+0x93/0xd0 [ 366.027899] handleoverflow+0x171/0x1b0 [ 366.038787] ata1.00: invalid multicount 32 ignored [ 366.043924] cdromioctl+0x2c3f/0x2d10 [ 366.063932] ? _pmruntimeresume+0xe6/0x130 [ 366.071923] srblockioctl+0x15d/0x1d0 [ 366.074624] ? _pfxsrblockioctl+0x10/0x10 [ 366.077642] blkdevioctl+0x419/0x500 [ 366.080231] ? _pfxblkdevioctl+0x10/0x10 ...
Historically, the signed integer overflow sanitizer did not work in the kernel due to its interaction with -fwrapv, but this has since been changed [1] in the newest version of Clang. It was re-enabled in the kernel with commit 557f8c582a9ba8ab ("ubsan: Reintroduce signed overflow sanitizer").
Let's rearrange the check to not perform any arithmetic, thus not tripping the sanitizer.(CVE-2024-42136)
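The rearrangement can be sketched in plain C as follows. This is an illustration of the technique, not the cdrom driver source: the function and parameter names are assumptions, and the overflow-prone form is shown only as a comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Overflow-prone form (not compiled here): the subtraction wraps when
 * reported_ms is INT64_MIN, which is what the sanitizer flags.
 *
 *     changed = (reported_ms - last_change_ms) < 0;
 *
 * Equivalent comparison with no arithmetic, so nothing can overflow:
 */
static bool media_changed_since(int64_t last_change_ms, int64_t reported_ms)
{
    assert(sizeof(int64_t) == 8);   /* both operands are full 64-bit values */
    return last_change_ms > reported_ms;
}
```

For the operands shown in the splat above (-9223372036854775808 and 346321), the comparison form simply evaluates to true, without ever computing a difference.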
In the Linux kernel, the following vulnerability has been resolved:
mlxsw: core_linecards: Fix double memory deallocation in case of invalid INI file
If the INI file is invalid, mlxswlinecardtypesinit() deallocates memory but does not reset the pointer to NULL, and returns 0. If any error occurs after the mlxswlinecardtypesinit() call, mlxswlinecardsinit() calls mlxswlinecardtypes_fini(), which deallocates the memory again.
Add pointer reset to NULL.
Found by Linux Verification Center (linuxtesting.org) with SVACE.(CVE-2024-42138)
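The pointer-reset pattern described above can be sketched in user-space C. The struct and function names below are hypothetical stand-ins, and malloc()/free() replace the kernel allocators; the point is only that a freed pointer set to NULL makes a later cleanup pass harmless.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical container for the parsed INI data. */
struct linecards { void *types_data; };

/*
 * Allocate the types buffer. On an invalid INI file the buffer is
 * released again and, as in the fix, the pointer is reset to NULL.
 * Returns 0 in both cases (as the text describes), -1 on allocation failure.
 */
static int types_init(struct linecards *lc, int ini_valid)
{
    lc->types_data = malloc(64);
    if (!lc->types_data)
        return -1;

    if (!ini_valid) {
        free(lc->types_data);
        lc->types_data = NULL;   /* without this, types_fini() frees twice */
    }
    return 0;
}

/* Cleanup is safe to call after either outcome: free(NULL) is a no-op. */
static void types_fini(struct linecards *lc)
{
    assert(lc != NULL);
    free(lc->types_data);
    lc->types_data = NULL;
}
```

Calling types_fini() after a failed parse then frees nothing, so a later error path that runs the cleanup again cannot trigger a double free.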
In the Linux kernel, the following vulnerability has been resolved:
riscv: kexec: Avoid deadlock in kexec crash path
If the kexec crash code is called in the interrupt context, the machinekexecmaskinterrupts() function will trigger a deadlock while trying to acquire the irqdesc spinlock and then deactivate irqchip in irqsetirqchipstate() function.
Unlike arm64, riscv only requires irqeoi handler to complete EOI and keeping irqsetirqchipstate() will only leave this possible deadlock without any use. So we simply remove it.(CVE-2024-42140)
In the Linux kernel, the following vulnerability has been resolved:
net/mlx5: E-switch, Create ingress ACL when needed
Currently, ingress acl is used for three features. It is created only when vport metadata match and prio tag are enabled. But active-backup lag mode also uses it. It is independent of vport metadata match and prio tag. And vport metadata match can be disabled using the following devlink command:
# devlink dev param set pci/0000:08:00.0 name eswportmetadata \
  value false cmode runtime
If ingress acl is not created, creating the drop rule for active-backup lag mode hits a panic. If it is always created, there is about 5% performance degradation.
Fix it by creating ingress acl when needed. If eswportmetadata is true, ingress acl exists, then create drop rule using existing ingress acl. If eswportmetadata is false, create ingress acl and then create drop rule.(CVE-2024-42142)
In the Linux kernel, the following vulnerability has been resolved:
thermal/drivers/mediatek/lvtsthermal: Check NULL ptr on lvtsdata
Verify that lvts_data is not NULL before using it.(CVE-2024-42144)
In the Linux kernel, the following vulnerability has been resolved:
s390/pkey: Wipe copies of clear-key structures on failure
Wipe all sensitive data from stack for all IOCTLs, which convert a clear-key into a protected- or secure-key.(CVE-2024-42156)
In the Linux kernel, the following vulnerability has been resolved:
f2fs: check validation of fault attrs in f2fsbuildfault_attr()
In the Linux kernel, the following vulnerability has been resolved:
bpf: Avoid uninitialized value in BPFCOREREAD_BITFIELD
[Changes from V1: - Use a default branch in the switch statement to initialize `val'.]
GCC warns that `val' may be used uninitialized in the BPF_CORE_READ_BITFIELD macro, defined in bpf_core_read.h as:
[...]
unsigned long long val; \
[...] \
switch (__CORE_RELO(s, field, BYTE_SIZE)) { \
case 1: val = *(const unsigned char *)p; break; \
case 2: val = *(const unsigned short *)p; break; \
case 4: val = *(const unsigned int *)p; break; \
case 8: val = *(const unsigned long long *)p; break; \
} \
[...]
val; \
} \
This patch adds a default entry in the switch statement that sets `val' to zero, both to avoid the warning and to prevent random values from being used in case __builtin_preserve_field_info returns unexpected values for BPF_FIELD_BYTE_SIZE.
Tested in bpf-next master. No regressions.(CVE-2024-42161)
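The effect of the added default branch can be seen in a small stand-alone function that mirrors the shape of the switch quoted above. This is a sketch, not the macro from bpf_core_read.h: the function name read_sized() is an assumption, and memcpy() is used in place of pointer casts to keep the example free of alignment concerns.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Load a 1, 2, 4 or 8 byte unsigned value from p.
 * The default branch gives val a defined value (zero) for any other
 * size, so the function never returns an uninitialized variable.
 */
static unsigned long long read_sized(const void *p, unsigned int byte_size)
{
    unsigned long long val;
    uint8_t  v8;
    uint16_t v16;
    uint32_t v32;
    uint64_t v64;

    assert(p != NULL);

    switch (byte_size) {
    case 1: memcpy(&v8,  p, sizeof(v8));  val = v8;  break;
    case 2: memcpy(&v16, p, sizeof(v16)); val = v16; break;
    case 4: memcpy(&v32, p, sizeof(v32)); val = v32; break;
    case 8: memcpy(&v64, p, sizeof(v64)); val = v64; break;
    default: val = 0; break;   /* unexpected size: defined result, no warning */
    }
    return val;
}
```

With the default branch removed, a call with byte_size 3 would return whatever happened to be in val, which is the situation the compiler warning points at.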
In the Linux kernel, the following vulnerability has been resolved:
net: dsa: mv88e6xxx: Correct check for empty list
Since commit a3c53be55c95 ("net: dsa: mv88e6xxx: Support multiple MDIO busses") mv88e6xxxdefaultmdiobus() has checked that the return value of list_first_entry() is non-NULL.
This appears to be intended to guard against the list chip->mdios being empty. However, it is not the correct check, as the implementation of list_first_entry() is not designed to return NULL for empty lists.
Instead, use list_first_entry_or_null(), which does return NULL if the list is empty.
Flagged by Smatch. Compile tested only.(CVE-2024-42224)
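Why the NULL check never fires can be shown with a reduced user-space model of an intrusive circular list. The list helpers below are simplified re-implementations written for this sketch, not the kernel headers, and struct mdio_bus with its id field is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Minimal circular doubly linked list, modelled on the kernel idiom. */
struct list_head { struct list_head *next, *prev; };

/* Integer arithmetic is used so the empty-list case stays well defined. */
#define container_of(ptr, type, member) \
    ((type *)((uintptr_t)(ptr) - offsetof(type, member)))

static void list_init(struct list_head *h) { h->next = h; h->prev = h; }

static void list_add_tail(struct list_head *n, struct list_head *h)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

struct mdio_bus { int id; struct list_head list; };   /* hypothetical entry */

/* Unchecked variant: on an empty list this converts the head itself. */
static struct mdio_bus *first_bus_unchecked(struct list_head *head)
{
    return container_of(head->next, struct mdio_bus, list);
}

/* Checked variant: an empty list (head points to itself) yields NULL. */
static struct mdio_bus *first_bus_or_null(struct list_head *head)
{
    assert(head->next != NULL);        /* list must have been initialised */
    if (head->next == head)
        return NULL;
    return container_of(head->next, struct mdio_bus, list);
}
```

On an empty list the unchecked variant returns a non-NULL pointer that does not refer to any real entry, so a caller's `if (!bus)` test passes and the bogus pointer gets used. Only the checked variant reports the empty case.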
In the Linux kernel, the following vulnerability has been resolved:
wifi: mt76: replace skbput with skbput_zero
Avoid potentially reusing uninitialized data(CVE-2024-42225)
In the Linux kernel, the following vulnerability has been resolved:
powerpc/pseries: Fix scv instruction crash with kexec
kexec on pseries disables AIL (reloconexc), required for scv instruction support, before other CPUs have been shut down. This means they can execute scv instructions after AIL is disabled, which causes an interrupt at an unexpected entry location that crashes the kernel.
Change the kexec sequence to disable AIL after other CPUs have been brought down.
As a refresher, the real-mode scv interrupt vector is 0x17000, and the fixed-location head code probably couldn't easily deal with implementing such high addresses so it was just decided not to support that interrupt at all.(CVE-2024-42230)
In the Linux kernel, the following vulnerability has been resolved:
protect the fetch of ->fd[fd] in do_dup2() from mispredictions
both callers have verified that fd is not greater than ->maxfds; however, misprediction might end up with tofree = fdt->fd[fd]; being speculatively executed. That's wrong for the same reasons why it's wrong in closefd()/fileclosefdlocked(); the same solution applies - arrayindexnospec(fd, fdt->maxfds) could differ from fd only in case of speculative execution on mispredicted path.(CVE-2024-42265)
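The index clamping idea can be sketched in portable C. This is a simplified user-space approximation written for illustration: the helper names are assumptions, and the kernel's real helper additionally relies on architecture-specific code and compiler barriers so that the bounds logic is not turned back into a predictable branch.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Branch-free mask: all ones when index < size, zero otherwise.
 * Assumes size <= LONG_MAX. For an in-range index, neither index nor
 * (size - 1 - index) has its top bit set; for an out-of-range index the
 * subtraction wraps and sets the top bit.
 */
static unsigned long index_mask(unsigned long index, unsigned long size)
{
    unsigned long top_bit_clear =
        ~(index | (size - 1UL - index)) >> (sizeof(unsigned long) * CHAR_BIT - 1);

    return 0UL - top_bit_clear;   /* 1 -> all ones, 0 -> zero */
}

/* Clamp index to 0 whenever it is out of bounds, without branching on it. */
static unsigned long index_clamp(unsigned long index, unsigned long size)
{
    assert(size <= (unsigned long)LONG_MAX);
    return index & index_mask(index, size);
}
```

On the architecturally correct path the clamped value equals the original index, so behaviour is unchanged. Only a mispredicted path that skipped the bounds check sees the index forced to zero, which keeps the speculative load inside the array.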
In the Linux kernel, the following vulnerability has been resolved:
riscv/mm: Add handling for VMFAULTSIGSEGV in mmfaulterror()
Handle VMFAULTSIGSEGV in the page fault path so that we correctly kill the process and we don't BUG() the kernel.(CVE-2024-42267)
In the Linux kernel, the following vulnerability has been resolved:
net/mlx5: Fix missing lock on sync reset reload
On sync reset reload work, when remote host updates devlink on reload actions performed on that host, it misses taking devlink lock before calling devlinkremotereloadactionsperformed() which results in triggering lock assert like the following:
WARNING: CPU: 4 PID: 1164 at net/devlink/core.c:261 devlassertlocked+0x3e/0x50 … CPU: 4 PID: 1164 Comm: kworker/u96:6 Tainted: G S W 6.10.0-rc2+ #116 Hardware name: Supermicro SYS-2028TP-DECTR/X10DRT-PT, BIOS 2.0 12/18/2015 Workqueue: mlx5fwresetevents mlx5syncresetreloadwork [mlx5core] RIP: 0010:devlassertlocked+0x3e/0x50 … Call Trace: <TASK> ? _warn+0xa4/0x210 ? devlassertlocked+0x3e/0x50 ? reportbug+0x160/0x280 ? handlebug+0x3f/0x80 ? excinvalidop+0x17/0x40 ? asmexcinvalidop+0x1a/0x20 ? devlassertlocked+0x3e/0x50 devlinknotify+0x88/0x2b0 ? mlx5attachdevice+0x20c/0x230 [mlx5core] ? _pfxdevlinknotify+0x10/0x10 ? processonework+0x4b6/0xbb0 processone_work+0x4b6/0xbb0 …
In the Linux kernel, the following vulnerability has been resolved:
netfilter: iptables: Fix potential null-ptr-deref in ip6tablenattable_init().
ip6tablenattableinit() accesses net->gen->ptr[ip6tablenatnetops.id], but the function is exposed to user space before the entry is allocated via registerpernetsubsys().
Let's call registerpernetsubsys() before xtregistertemplate().(CVE-2024-42269)
In the Linux kernel, the following vulnerability has been resolved:
netfilter: iptables: Fix null-ptr-deref in iptablenattable_init().
We had a report that iptables-restore sometimes triggered null-ptr-deref at boot time. [0]
The problem is that iptablenattable_init() is exposed to user space before the kernel fully initialises netns.
In the small race window, a user could call iptablenattableinit() that accesses netgeneric(net, iptablenatnetid), which is available only after registering iptablenatnetops.
Let's call registerpernetsubsys() before xtregistertemplate().
Started bpfilter BUG: kernel NULL pointer dereference, address: 0000000000000013 PF: supervisor write access in kernel mode PF: errorcode(0x0002) - not-present page PGD 0 P4D 0 PREEMPT SMP NOPTI CPU: 2 PID: 11879 Comm: iptables-restor Not tainted 6.1.92-99.174.amzn2023.x8664 #1 Hardware name: Amazon EC2 c6i.4xlarge/, BIOS 1.0 10/16/2017 RIP: 0010:iptablenattableinit (net/ipv4/netfilter/iptablenat.c:87 net/ipv4/netfilter/iptablenat.c:121) iptablenat Code: 10 4c 89 f6 48 89 ef e8 0b 19 bb ff 41 89 c4 85 c0 75 38 41 83 c7 01 49 83 c6 28 41 83 ff 04 75 dc 48 8b 44 24 08 48 8b 0c 24 <48> 89 08 4c 89 ef e8 a2 3b a2 cf 48 83 c4 10 44 89 e0 5b 5d 41 5c RSP: 0018:ffffbef902843cd0 EFLAGS: 00010246 RAX: 0000000000000013 RBX: ffff9f4b052caa20 RCX: ffff9f4b20988d80 RDX: 0000000000000000 RSI: 0000000000000064 RDI: ffffffffc04201c0 RBP: ffff9f4b29394000 R08: ffff9f4b07f77258 R09: ffff9f4b07f77240 R10: 0000000000000000 R11: ffff9f4b09635388 R12: 0000000000000000 R13: ffff9f4b1a3c6c00 R14: ffff9f4b20988e20 R15: 0000000000000004 FS: 00007f6284340000(0000) GS:ffff9f51fe280000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000013 CR3: 00000001d10a6005 CR4: 00000000007706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <TASK> ? showtraceloglvl (arch/x86/kernel/dumpstack.c:259) ? showtraceloglvl (arch/x86/kernel/dumpstack.c:259) ? xtfindtablelock (net/netfilter/xtables.c:1259) ? _diebody.cold (arch/x86/kernel/dumpstack.c:478 arch/x86/kernel/dumpstack.c:420) ? pagefaultoops (arch/x86/mm/fault.c:727) ? excpagefault (./arch/x86/include/asm/irqflags.h:40 ./arch/x86/include/asm/irqflags.h:75 arch/x86/mm/fault.c:1470 arch/x86/mm/fault.c:1518) ? asmexcpagefault (./arch/x86/include/asm/idtentry.h:570) ? 
iptablenattableinit (net/ipv4/netfilter/iptablenat.c:87 net/ipv4/netfilter/iptablenat.c:121) iptablenat xtfindtablelock (net/netfilter/xtables.c:1259) xtrequestfindtablelock (net/netfilter/xtables.c:1287) getinfo (net/ipv4/netfilter/iptables.c:965) ? securitycapable (security/security.c:809 (discriminator 13)) ? nscapable (kernel/capability.c:376 kernel/capability.c:397) ? doiptgetctl (net/ipv4/netfilter/iptables.c:1656) ? bpfiltersendreq (net/bpfilter/bpfilterkern.c:52) bpfilter nfgetsockopt (net/netfilter/nfsockopt.c:116) ipgetsockopt (net/ipv4/ipsockglue.c:1827) _sysgetsockopt (net/socket.c:2327) _x64sysgetsockopt (net/socket.c:2342 net/socket.c:2339 net/socket.c:2339) dosyscall64 (arch/x86/entry/common.c:51 arch/x86/entry/common.c:81) entrySYSCALL64afterhwframe (arch/x86/entry/entry64.S:121) RIP: 0033:0x7f62844685ee Code: 48 8b 0d 45 28 0f 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 37 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 0a c3 66 0f 1f 84 00 00 00 00 00 48 8b 15 09 RSP: 002b:00007ffd1f83d638 EFLAGS: 00000246 ORIGRAX: 0000000000000037 RAX: ffffffffffffffda RBX: 00007ffd1f83d680 RCX: 00007f62844685ee RDX: 0000000000000040 RSI: 0000000000000000 RDI: 0000000000000004 RBP: 0000000000000004 R08: 00007ffd1f83d670 R09: 0000558798ffa2a0 R10: 00007ffd1f83d680 R11: 0000000000000246 R12: 00007ffd1f83e3b2 R13: 00007f6284 ---truncated---(CVE-2024-42270)
In the Linux kernel, the following vulnerability has been resolved:
f2fs: assign CURSEGALLDATA_ATGC if blkaddr is valid
mkdir /mnt/test/comp
f2fs_io setflags compression /mnt/test/comp
dd if=/dev/zero of=/mnt/test/comp/testfile bs=16k count=1
truncate --size 13 /mnt/test/comp/testfile
In the above scenario, we can get a BUGON. kernel BUG at fs/f2fs/segment.c:3589! Call Trace: dowritepage+0x78/0x390 [f2fs] f2fsoutplacewritedata+0x62/0xb0 [f2fs] f2fsdowritedatapage+0x275/0x740 [f2fs] f2fswritesingledatapage+0x1dc/0x8f0 [f2fs] f2fswritemultipages+0x1e5/0xae0 [f2fs] f2fswritecachepages+0xab1/0xc60 [f2fs] f2fswritedatapages+0x2d8/0x330 [f2fs] dowritepages+0xcf/0x270 _writebacksingleinode+0x44/0x350 writebacksbinodes+0x242/0x530 _writebackinodeswb+0x54/0xf0 wbwriteback+0x192/0x310 wbworkfn+0x30d/0x400
The reason is we gave CURSEGALLDATAATGC to COMPRADDR where the page was set the gcing flag by setclusterdirty().(CVE-2024-42273)
In the Linux kernel, the following vulnerability has been resolved:
Revert "ALSA: firewire-lib: operate for period elapse event in process context"
Commit 7ba5ca32fe6e ("ALSA: firewire-lib: operate for period elapse event in process context") removed the process context workqueue from amdtpdomainstreampcmpointer() and updatepcmpointers() to remove its overhead.
With RME Fireface 800, this led to a regression since kernel 5.14.0, causing an AB/BA deadlock competition for the substream lock with eventual system freeze under ALSA operation:
thread 0:
* (lock A) acquire substream lock by sndpcmstreamlockirq() in sndpcmstatus64()
* (lock B) wait for tasklet to finish by calling taskletunlockspinwait() in taskletdisableinatomic() in ohciflushiso_completions() of ohci.c
thread 1:
* (lock B) enter tasklet
* (lock A) attempt to acquire substream lock, waiting for it to be released: sndpcmstreamlockirqsave() in sndpcmperiodelapsed() in updatepcmpointers() in processctxpayloads() in processrx_packets() of amdtp-stream.c
? taskletunlockspinwait </NMI> <TASK> ohciflushisocompletions firewireohci amdtpdomainstreampcmpointer sndfirewirelib sndpcmupdatehwptr0 sndpcm sndpcmstatus64 snd_pcm
? nativequeuedspinlockslowpath </NMI> <IRQ> rawspinlockirqsave sndpcmperiodelapsed sndpcm processrxpackets sndfirewirelib irqtargetcallback sndfirewirelib handleitpacket firewireohci contexttasklet firewire_ohci
Restore the process context work queue to prevent the AB/BA deadlock competition for the ALSA substream lock of sndpcmstreamlockirq() in sndpcmstatus64() and sndpcmstreamlockirqsave() in sndpcmperiod_elapsed().
Revert commit 7ba5ca32fe6e ("ALSA: firewire-lib: operate for period elapse event in process context").
Replace inline description to prevent future deadlock.(CVE-2024-42274)
In the Linux kernel, the following vulnerability has been resolved:
tipc: Return non-zero value from tipcudpaddr2str() on error
tipcudpaddr2str() should return non-zero value if the UDP media address is invalid. Otherwise, a buffer overflow access can occur in tipcmediaaddr_printf(). Fix this by returning 1 on an invalid UDP media address.(CVE-2024-42284)
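The return-value contract can be illustrated with a small sketch. It does not reproduce the TIPC code: the protocol tags, function names and output format are assumptions. It only shows a formatter that reports failure with a non-zero value and a caller that checks the result before trusting the buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum proto { PROTO_V4 = 4, PROTO_V6 = 6 };   /* hypothetical address tags */

/*
 * Format an address into buf.
 * Returns 0 on success and 1 when the protocol is not recognised,
 * in which case buf is left untouched.
 */
static int addr2str(int proto, const char *addr, char *buf, size_t size)
{
    if (proto == PROTO_V4)
        snprintf(buf, size, "v4:%s", addr);
    else if (proto == PROTO_V6)
        snprintf(buf, size, "v6:%s", addr);
    else
        return 1;   /* invalid media address: tell the caller */
    return 0;
}

/* Caller: falls back to a fixed string when formatting fails. */
static size_t addr_printf(int proto, const char *addr, char *buf, size_t size)
{
    assert(size > 0);
    buf[0] = '\0';
    if (addr2str(proto, addr, buf, size) != 0)
        snprintf(buf, size, "invalid");
    return strlen(buf);
}
```

If the formatter returned 0 for an unknown protocol, the caller would treat an unwritten buffer as valid output, which is the class of problem a non-zero error return prevents.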
In the Linux kernel, the following vulnerability has been resolved:
RDMA/iwcm: Fix a use-after-free related to destroying CM IDs
iwconnreqhandler() associates a new struct rdmaidprivate (connid) with an existing struct iwcmid (cm_id) as follows:
conn_id->cm_id.iw = cm_id;
cm_id->context = conn_id;
cm_id->cm_handler = cma_iw_handler;
rdma_destroy_id() frees both the cm_id and the struct rdma_id_private. Make sure that cm_work_handler() does not trigger a use-after-free by freeing the struct rdma_id_private only after all pending work has finished.(CVE-2024-42285)
In the Linux kernel, the following vulnerability has been resolved:
PCI/DPC: Fix use-after-free on concurrent DPC and hot-removal
Keith reports a use-after-free when a DPC event occurs concurrently to hot-removal of the same portion of the hierarchy:
The dpc_handler() awaits readiness of the secondary bus below the Downstream Port where the DPC event occurred. To do so, it polls the config space of the first child device on the secondary bus. If that child device is concurrently removed, accesses to its struct pci_dev cause the kernel to oops.
That's because pci_bridge_wait_for_secondary_bus() neglects to hold a reference on the child device. Before v6.3, the function was only called on resume from system sleep or on runtime resume. Holding a reference wasn't necessary back then because the pciehp IRQ thread could never run concurrently. (On resume from system sleep, IRQs are not enabled until after the resume_noirq phase. And runtime resume is always awaited before a PCI device is removed.)
However starting with v6.3, pci_bridge_wait_for_secondary_bus() is also called on a DPC event. Commit 53b54ad074de ("PCI/DPC: Await readiness of secondary bus after reset"), which introduced that, failed to appreciate that pci_bridge_wait_for_secondary_bus() now needs to hold a reference on the child device because dpc_handler() and pciehp may indeed run concurrently. The commit was backported to v5.10+ stable kernels, so that's the oldest one affected.
Add the missing reference acquisition.
Abridged stack trace:
BUG: unable to handle page fault for address: 00000000091400c0
CPU: 15 PID: 2464 Comm: irq/53-pcie-dpc 6.9.0
RIP: pci_bus_read_config_dword+0x17/0x50
pci_dev_wait()
pci_bridge_wait_for_secondary_bus()
dpc_reset_link()
pcie_do_recovery()
dpc_handler()(CVE-2024-42302)
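The reference-holding idea behind this fix can be illustrated with a minimal, single-threaded sketch. Everything here is hypothetical (`struct dev`, `dev_get()`, `dev_put()`, `wait_for_child()`, `hot_remove()`); it is not the PCI core's code, the counter is not atomic, and the "free" is modelled by a flag so the outcome can be inspected.

```c
#include <stddef.h>

/* Hypothetical reference-counted device; stands in for a real device object. */
struct dev {
    int refcount;
    int freed;      /* set instead of calling free() so the sketch can be inspected */
};

/* Take an extra reference. */
static struct dev *dev_get(struct dev *d)
{
    if (d)
        d->refcount++;
    return d;
}

/* Drop a reference; the object is released when the last one goes away. */
static void dev_put(struct dev *d)
{
    if (d && --d->refcount == 0)
        d->freed = 1;
}

/* Waiter: pins the child for the duration of the poll, so a removal that
 * happens in between cannot release the object underneath it. */
static int wait_for_child(struct dev *child,
                          void (*concurrent_remove)(struct dev *))
{
    int alive;

    dev_get(child);              /* the acquisition that was missing */
    concurrent_remove(child);    /* removal path drops its own reference */
    alive = !child->freed;       /* still safe to access the child here */
    dev_put(child);              /* last reference: released only now */
    return alive;
}

/* Removal path: gives up the reference it owned. */
static void hot_remove(struct dev *d)
{
    dev_put(d);
}
```

Without the `dev_get()` call, `hot_remove()` would drop the count to zero while the waiter is still using the object.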
In the Linux kernel, the following vulnerability has been resolved:
kvm: s390: Reject memory region operations for ucontrol VMs
This change rejects the KVM_SET_USER_MEMORY_REGION and KVM_SET_USER_MEMORY_REGION2 ioctls when called on a ucontrol VM. This is necessary since ucontrol VMs have kvm->arch.gmap set to 0 and would thus result in a null pointer dereference further in. Memory management needs to be performed in userspace and using the ioctls KVM_S390_UCAS_MAP and KVM_S390_UCAS_UNMAP.
Also improve s390 specific documentation for KVM_SET_USER_MEMORY_REGION and KVM_SET_USER_MEMORY_REGION2.
frankja@linux.ibm.com: commit message spelling fix, subject prefix fix
In the Linux kernel, the following vulnerability has been resolved:
PCI: endpoint: pci-epf-test: Make use of cached 'epc_features' in pci_epf_test_core_init()
Instead of getting the epc_features from the pci_epc_get_features() API, use the cached pci_epf_test::epc_features value to avoid the NULL check. Since the NULL check is already performed in pci_epf_test_bind(), having one more check in pci_epf_test_core_init() is redundant and it is not possible to hit the NULL pointer dereference.
Also with commit a01e7214bef9 ("PCI: endpoint: Remove "core_init_notifier" flag"), 'epc_features' got dereferenced without the NULL check, leading to the following false positive Smatch warning:
drivers/pci/endpoint/functions/pci-epf-test.c:784 pci_epf_test_core_init() error: we previously assumed 'epc_features' could be null (see line 747)
Thus, remove the redundant NULL check and also use the epc_features::{msix_capable/msi_capable} flags directly to avoid local variables.
In the Linux kernel, the following vulnerability has been resolved:
cgroup/cpuset: Prevent UAF in proc_cpuset_show()
A UAF can happen when /proc/cpuset is read as reported in [1].
This can be reproduced by the following methods:
1. Add an mdelay(1000) before acquiring the cgroup_lock in the cgroup_path_ns function.
2. Run $cat /proc/<pid>/cpuset repeatedly.
3. Run $mount -t cgroup -o cpuset cpuset /sys/fs/cgroup/cpuset/ and $umount /sys/fs/cgroup/cpuset/ repeatedly.
The race that causes this bug can be shown as below:
(umount)              | (cat /proc/<pid>/cpuset)
css_release           | proc_cpuset_show
css_release_work_fn   | css = task_get_css(tsk, cpuset_cgrp_id);
css_free_rwork_fn     | cgroup_path_ns(css->cgroup, ...);
cgroup_destroy_root   | mutex_lock(&cgroup_mutex);
rebind_subsystems     |
cgroup_free_root      |
                      | // cgrp was freed, UAF
                      | cgroup_path_ns_locked(cgrp,..);
When the cpuset is initialized, the root node top_cpuset.css.cgrp will point to &cgrp_dfl_root.cgrp. In cgroup v1, the mount operation will allocate cgroup_root, and top_cpuset.css.cgrp will point to the allocated &cgroup_root.cgrp. When the umount operation is executed, top_cpuset.css.cgrp will be rebound to &cgrp_dfl_root.cgrp.
The problem is that when rebinding to cgrp_dfl_root, there are cases where the cgroup_root allocated by setting up the root for cgroup v1 is cached. This could lead to a Use-After-Free (UAF) if it is subsequently freed. The descendant cgroups of cgroup v1 can only be freed after the css is released. However, the css of the root will never be released, yet the cgroup_root should be freed when it is unmounted. This means that obtaining a reference to the css of the root does not guarantee that css.cgrp->root will not be freed.
Fix this problem by using rcu_read_lock in proc_cpuset_show(). As cgroup_root is kfree_rcu after commit d23b5c577715 ("cgroup: Make operations on the cgroup root_list RCU safe"), css->cgroup won't be freed during the critical section. To call cgroup_path_ns_locked, css_set_lock is needed, so it is safe to replace task_get_css with task_css.
[1] https://syzkaller.appspot.com/bug?extid=9b1ff7be974a403aa4cd (CVE-2024-43853)
In the Linux kernel, the following vulnerability has been resolved:
net: usb: qmi_wwan: fix memory leak for not ip packets
Free the unused skb when non-IP packets arrive.(CVE-2024-43861)
In the Linux kernel, the following vulnerability has been resolved:
drm/vmwgfx: Fix a deadlock in dma buf fence polling
Introduce a version of the fence ops that on release doesn't remove the fence from the pending list, and thus doesn't require a lock to fix poll->fence wait->fence unref deadlocks.
vmwgfx overwrites the wait callback to iterate over the list of all fences and update their status; to do that, it holds a lock to prevent list modifications from other threads. The fence destroy callback both deletes the fence and removes it from the list of pending fences, for which it holds a lock.
dma buf polling cb unrefs a fence after it's been signaled: so the poll calls the wait, which signals the fences, which are being destroyed. The destruction tries to acquire the lock on the pending fences list which it can never get because it's held by the wait from which it was called.
Old bug, but not a lot of userspace apps were using dma-buf polling interfaces. Fix those, in particular this fixes KDE stalls/deadlock.(CVE-2024-43863)
In the Linux kernel, the following vulnerability has been resolved:
net/mlx5e: Fix CT entry update leaks of modify header context
The cited commit allocates a new modify header to replace the old one when updating a CT entry. But if allocating the new one fails, e.g. because it exceeds the maximum number the firmware can support, the modify header will be an error pointer that triggers a panic when it is deallocated. In addition, the old modify header pointer is copied to the old attr, so when the old attr is freed, the old modify header is lost.
Fix it by restoring the old attr to attr when allocating a new modify header context fails. So when the CT entry is freed, the right modify header context will be freed. And the panic of accessing an error pointer is also fixed.(CVE-2024-43864)
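The restore-on-failure pattern described above can be sketched as follows. This is a simplified user-space illustration, not the mlx5 driver's code: `struct hdr`, `struct entry`, `hdr_alloc()`, and `entry_update()` are hypothetical names, and allocation failure is modelled with NULL rather than an error pointer.

```c
#include <stdlib.h>

struct hdr { int id; };             /* hypothetical modify-header context */
struct entry { struct hdr *hdr; };  /* hypothetical CT entry */

/* Allocator that can be told to fail, to exercise the error path. */
static struct hdr *hdr_alloc(int id, int fail)
{
    struct hdr *h;

    if (fail)
        return NULL;
    h = malloc(sizeof(*h));
    if (h)
        h->id = id;
    return h;
}

/* Replace e->hdr with a freshly allocated header.
 * On failure the old header is put back, so a later teardown of the entry
 * frees a valid object and nothing is leaked. */
static int entry_update(struct entry *e, int new_id, int fail)
{
    struct hdr *old = e->hdr;

    e->hdr = hdr_alloc(new_id, fail);
    if (!e->hdr) {
        e->hdr = old;   /* restore: the entry keeps owning the old header */
        return -1;
    }
    free(old);          /* success: the old header is no longer needed */
    return 0;
}
```

Leaving `e->hdr` pointing at the failed allocation result would make the eventual free operate on an invalid pointer, and the old header would no longer be reachable.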
In the Linux kernel, the following vulnerability has been resolved:
net/mlx5: Always drain health in shutdown callback
There is no point in recovery during device shutdown. If health work has started, it needs to be waited for to avoid races and NULL pointer access.
Hence, drain health WQ on shutdown callback.(CVE-2024-43866)
In the Linux kernel, the following vulnerability has been resolved:
riscv/purgatory: align riscv_kernel_entry
When alignment handling is delegated to the kernel, everything must be word-aligned in purgatory, since the trap handler is then set to the kexec one. Without the alignment, hitting the exception would ultimately crash. On other occasions, the kernel's handler would take care of exceptions. This has been tested on a JH7110 SoC with oreboot and its SBI delegating unaligned access exceptions and the kernel configured to handle them.(CVE-2024-43868)
In the Linux kernel, the following vulnerability has been resolved:
perf: Fix event leak upon exec and file release
The perf pending task work is never waited upon the matching event release. In the case of a child event, released via free_event() directly, this can potentially result in a leaked event, such as in the following scenario that doesn't even require a weak IRQ work implementation to trigger:
schedule()
   prepare_task_switch()
=======> <NMI>
      perf_event_overflow()
         event->pending_sigtrap = ...
         irq_work_queue(&event->pending_irq)
<======= </NMI>
      perf_event_task_sched_out()
         event_sched_out()
            event->pending_sigtrap = 0;
            atomic_long_inc_not_zero(&event->refcount)
            task_work_add(&event->pending_task)
   finish_lock_switch()
=======> <IRQ>
   perf_pending_irq()
      //do nothing, rely on pending task work
<======= </IRQ>

begin_new_exec()
   perf_event_exit_task()
      perf_event_exit_event()
         // If is child event
         free_event()
            WARN(atomic_long_cmpxchg(&event->refcount, 1, 0) != 1)
            // event is leaked
Similar scenarios can also happen with perf_event_remove_on_exec() or simply against concurrent perf_event_release().
Fix this by synchronizing against the possibly remaining pending task work while freeing the event, just as is done with remaining pending IRQ work. This means that the pending task callback neither needs nor should hold a reference to the event, which would prevent it from ever being freed.(CVE-2024-43869)
In the Linux kernel, the following vulnerability has been resolved:
exec: Fix ToCToU between perm check and set-uid/gid usage
When opening a file for exec via do_filp_open(), permission checking is done against the file's metadata at that moment, and on success, a file pointer is passed back. Much later in the execve() code path, the file metadata (specifically mode, uid, and gid) is used to determine if/how to set the uid and gid. However, those values may have changed since the permissions check, meaning the execution may gain unintended privileges.
For example, if a file could change permissions from executable and not set-id:
---------x 1 root root 16048 Aug 7 13:16 target
to set-id and non-executable:
---S------ 1 root root 16048 Aug 7 13:16 target
it is possible to gain root privileges when execution should have been disallowed.
While this race condition is rare in real-world scenarios, it has been observed (and proven exploitable) when package managers are updating the setuid bits of installed programs. Such files start with being world-executable but then are adjusted to be group-exec with a set-uid bit. For example, "chmod o-x,u+s target" makes "target" executable only by uid "root" and gid "cdrom", while also becoming setuid-root:
-rwxr-xr-x 1 root cdrom 16048 Aug 7 13:16 target
becomes:
-rwsr-xr-- 1 root cdrom 16048 Aug 7 13:16 target
But racing the chmod means users without group "cdrom" membership can get the permission to execute "target" just before the chmod, and when the chmod finishes, the exec reaches bprm_fill_uid(), and performs the setuid to root, violating the expressed authorization of "only cdrom group members can setuid to root".
Re-check that we still have execute permissions in case the metadata has changed. It would be better to keep a copy from the perm-check time, but until we can do that refactoring, the least-bad option is to do a full inode_permission() call (under inode lock). It is understood that this is safe against dead-locks, but hardly optimal.(CVE-2024-43882)
In the Linux kernel, the following vulnerability has been resolved:
jfs: Fix shift-out-of-bounds in dbDiscardAG
When searching for the next smaller log2 block, BLKSTOL2() returned 0, which resulted in a negative shift exponent of -1.
This patch fixes the issue by exiting the loop directly when negative shift is found.(CVE-2024-44938)
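The guard described above can be sketched with a simplified search loop. This is a hedged illustration only, not the JFS code: `find_l2()` and its parameters are hypothetical, and the caller is assumed to pass a starting exponent below 63. The loop steps down through powers of two and leaves as soon as the exponent would become negative, so a shift by -1 is never evaluated.

```c
#include <stdint.h>

/* Hypothetical search: starting at 2^start_l2 and halving each step, find
 * the largest power of two that does not exceed limit.
 * Returns the log2 of the size found, or -1 if even 2^0 does not fit.
 * Assumes start_l2 < 63. */
static int find_l2(int start_l2, int64_t limit)
{
    int l2 = start_l2;

    for (;;) {
        if (l2 < 0)                        /* guard: never compute 1 << -1 */
            return -1;
        if (((int64_t)1 << l2) <= limit)   /* this size fits */
            return l2;
        l2--;                              /* try the next smaller log2 size */
    }
}
```

Shifting by a negative count is undefined behavior in C, which is why the exit check comes before the shift rather than after it.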
In the Linux kernel, the following vulnerability has been resolved:
f2fs: fix to cover read extent cache access with lock
syzbot reports a f2fs bug as below:
BUG: KASAN: slab-use-after-free in sanity_check_extent_cache+0x370/0x410 fs/f2fs/extent_cache.c:46
Read of size 4 at addr ffff8880739ab220 by task syz-executor200/5097
CPU: 0 PID: 5097 Comm: syz-executor200 Not tainted 6.9.0-rc6-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
print_address_description mm/kasan/report.c:377 [inline]
print_report+0x169/0x550 mm/kasan/report.c:488
kasan_report+0x143/0x180 mm/kasan/report.c:601
sanity_check_extent_cache+0x370/0x410 fs/f2fs/extent_cache.c:46
do_read_inode fs/f2fs/inode.c:509 [inline]
f2fs_iget+0x33e1/0x46e0 fs/f2fs/inode.c:560
f2fs_nfs_get_inode+0x74/0x100 fs/f2fs/super.c:3237
generic_fh_to_dentry+0x9f/0xf0 fs/libfs.c:1413
exportfs_decode_fh_raw+0x152/0x5f0 fs/exportfs/expfs.c:444
exportfs_decode_fh+0x3c/0x80 fs/exportfs/expfs.c:584
do_handle_to_path fs/fhandle.c:155 [inline]
handle_to_path fs/fhandle.c:210 [inline]
do_handle_open+0x495/0x650 fs/fhandle.c:226
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
sanity_check_extent_cache() was not covered by the extent cache lock, so a race can occur and result in a use-after-free.
Refactor sanity_check_extent_cache() to avoid extent cache access and call it before f2fs_init_read_extent_tree() to fix this issue.(CVE-2024-44941)
{ "severity": "High" }
{ "x86_64": [ "bpftool-6.6.0-39.0.0.47.oe2403.x86_64.rpm", "bpftool-debuginfo-6.6.0-39.0.0.47.oe2403.x86_64.rpm", "kernel-6.6.0-39.0.0.47.oe2403.x86_64.rpm", "kernel-debuginfo-6.6.0-39.0.0.47.oe2403.x86_64.rpm", "kernel-debugsource-6.6.0-39.0.0.47.oe2403.x86_64.rpm", "kernel-devel-6.6.0-39.0.0.47.oe2403.x86_64.rpm", "kernel-headers-6.6.0-39.0.0.47.oe2403.x86_64.rpm", "kernel-source-6.6.0-39.0.0.47.oe2403.x86_64.rpm", "kernel-tools-6.6.0-39.0.0.47.oe2403.x86_64.rpm", "kernel-tools-debuginfo-6.6.0-39.0.0.47.oe2403.x86_64.rpm", "kernel-tools-devel-6.6.0-39.0.0.47.oe2403.x86_64.rpm", "perf-6.6.0-39.0.0.47.oe2403.x86_64.rpm", "perf-debuginfo-6.6.0-39.0.0.47.oe2403.x86_64.rpm", "python3-perf-6.6.0-39.0.0.47.oe2403.x86_64.rpm", "python3-perf-debuginfo-6.6.0-39.0.0.47.oe2403.x86_64.rpm" ], "src": [ "kernel-6.6.0-39.0.0.47.oe2403.src.rpm" ], "aarch64": [ "bpftool-6.6.0-39.0.0.47.oe2403.aarch64.rpm", "bpftool-debuginfo-6.6.0-39.0.0.47.oe2403.aarch64.rpm", "kernel-6.6.0-39.0.0.47.oe2403.aarch64.rpm", "kernel-debuginfo-6.6.0-39.0.0.47.oe2403.aarch64.rpm", "kernel-debugsource-6.6.0-39.0.0.47.oe2403.aarch64.rpm", "kernel-devel-6.6.0-39.0.0.47.oe2403.aarch64.rpm", "kernel-headers-6.6.0-39.0.0.47.oe2403.aarch64.rpm", "kernel-source-6.6.0-39.0.0.47.oe2403.aarch64.rpm", "kernel-tools-6.6.0-39.0.0.47.oe2403.aarch64.rpm", "kernel-tools-debuginfo-6.6.0-39.0.0.47.oe2403.aarch64.rpm", "kernel-tools-devel-6.6.0-39.0.0.47.oe2403.aarch64.rpm", "perf-6.6.0-39.0.0.47.oe2403.aarch64.rpm", "perf-debuginfo-6.6.0-39.0.0.47.oe2403.aarch64.rpm", "python3-perf-6.6.0-39.0.0.47.oe2403.aarch64.rpm", "python3-perf-debuginfo-6.6.0-39.0.0.47.oe2403.aarch64.rpm" ] }