The Linux Kernel, the operating system core itself.
Security Fix(es):
In the Linux kernel, the following vulnerability has been resolved:
ksmbd: fix potential out-of-bounds when buffer offset is invalid
I found a potential out-of-bounds access when the buffer offset fields of a few requests are invalid. This patch sets the minimum value of the buffer offset field to the ->Buffer offset so that the buffer length can be validated.(CVE-2024-26952)
In the Linux kernel, the following vulnerability has been resolved:
ksmbd: fix slab-out-of-bounds in smb_strndup_from_utf16()
If ->NameOffset of smb2_create_req is smaller than the Buffer offset of smb2_create_req, a slab-out-of-bounds read can happen from smb2_open. This patch sets the minimum value of the name offset to the buffer offset so that the name length of smb2_create_req() can be validated.(CVE-2024-26954)
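The two ksmbd fixes above rest on the same check: a client-supplied offset must not point before the start of the variable-length Buffer area, and offset plus length must stay inside the received message. A minimal user-space sketch of that idea follows. The `struct req` layout and the `offset_is_valid()` helper are hypothetical and are not taken from ksmbd.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical request layout, for illustration only (not a ksmbd struct). */
struct req {
    uint16_t name_offset;   /* offset of the name from the start of the request */
    uint16_t name_length;   /* length of the name in bytes */
    uint8_t  buffer[];      /* variable-length data follows the fixed header */
};

/*
 * Return true only if [name_offset, name_offset + name_length) lies
 * entirely inside the Buffer area of a request that is total_len bytes long.
 */
bool offset_is_valid(const struct req *r, size_t total_len)
{
    size_t min_off = offsetof(struct req, buffer);

    assert(r != NULL);
    if (r->name_offset < min_off)       /* would point into the fixed header */
        return false;
    if (r->name_offset > total_len)     /* would start past the end */
        return false;
    return r->name_length <= total_len - r->name_offset;
}
```

Checking the lower bound first keeps the later subtraction from wrapping around.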
In the Linux kernel, the following vulnerability has been resolved:
fpga: bridge: add owner module and take its refcount
The current implementation of the fpga bridge assumes that the low-level module registers a driver for the parent device and uses its owner pointer to take the module's refcount. This approach is problematic since it can lead to a null pointer dereference while attempting to get the bridge if the parent device does not have a driver.
To address this problem, add a module owner pointer to the fpga_bridge struct and use it to take the module's refcount. Modify the function for registering a bridge to take an additional owner module parameter and rename it to avoid conflicts. Use the old function name for a helper macro that automatically sets the module that registers the bridge as the owner. This ensures compatibility with existing low-level control modules and reduces the chances of registering a bridge without setting the owner.
Also, update the documentation to keep it consistent with the new interface for registering an fpga bridge.
Other changes: opportunistically move put_device() from __fpga_bridge_get() to fpga_bridge_get() and of_fpga_bridge_get() to improve code clarity since the bridge device is taken in these functions.(CVE-2024-36479)
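The "old name becomes a helper macro" pattern described above can be sketched in user space as follows. Every name here (`struct module`, `THIS_MODULE`, `bridge_register_owner()`, `bridge_register()`) is a hypothetical stand-in, not the kernel's FPGA API. Rejecting a NULL owner is a choice made for this sketch only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's struct module and THIS_MODULE. */
struct module { const char *name; };
struct module example_module = { "example_driver" };
#define THIS_MODULE (&example_module)

struct bridge {
    struct module *owner;   /* module whose refcount is taken on "get" */
};

/* Explicit-owner registration: the caller names the owning module. */
int bridge_register_owner(struct bridge *br, struct module *owner)
{
    assert(br != NULL);
    if (!owner)
        return -1;          /* sketch-only policy: refuse a missing owner */
    br->owner = owner;
    return 0;
}

/* The old name survives as a macro that fills in the registering module. */
#define bridge_register(br) bridge_register_owner((br), THIS_MODULE)
```

Existing callers keep calling `bridge_register(br)` unchanged and the owner is set for them, so they no longer depend on the parent device having a driver.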
In the Linux kernel, the following vulnerability has been resolved:
blk-iocost: avoid out of bounds shift
UBSAN catches undefined behavior in blk-iocost, where sometimes iocg->delay is shifted right by a number that is too large, resulting in undefined behavior on some architectures.
[ 186.556576] ------------[ cut here ]------------
UBSAN: shift-out-of-bounds in block/blk-iocost.c:1366:23
shift exponent 64 is too large for 64-bit type 'u64' (aka 'unsigned long long')
CPU: 16 PID: 0 Comm: swapper/16 Tainted: G S E N 6.9.0-0fbk700debugrc2kbuilder0gc85af715cac0 #1
Hardware name: Quanta Twin Lakes MP/Twin Lakes Passive MP, BIOS F093A23 12/08/2020
Call Trace:
 <IRQ>
 dump_stack_lvl+0x8f/0xe0
 __ubsan_handle_shift_out_of_bounds+0x22c/0x280
 iocg_kick_delay+0x30b/0x310
 ioc_timer_fn+0x2fb/0x1f80
 __run_timer_base+0x1b6/0x250
 ...
Avoid that undefined behavior by simply taking the "delay = 0" branch if the shift is too large.
I am not sure what the symptoms of an undefined value delay will be, but I suspect it could be more than a little annoying to debug.(CVE-2024-36916)
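The guard described above can be sketched as a small helper. This is an illustration, not the blk-iocost code, and `decay_delay()` is a hypothetical name. In C, shifting a 64-bit value by 64 or more is undefined behavior, so the out-of-range case is handled explicitly and treated as a delay of zero.

```c
#include <stdint.h>

/* Right-shift that treats an out-of-range shift count as "fully decayed". */
uint64_t decay_delay(uint64_t delay, uint64_t shift)
{
    /* A shift count >= the type width (64) is undefined behavior in C. */
    if (shift >= 64)
        return 0;
    return delay >> shift;
}
```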
In the Linux kernel, the following vulnerability has been resolved:
fpga: manager: add owner module and take its refcount
The current implementation of the fpga manager assumes that the low-level module registers a driver for the parent device and uses its owner pointer to take the module's refcount. This approach is problematic since it can lead to a null pointer dereference while attempting to get the manager if the parent device does not have a driver.
To address this problem, add a module owner pointer to the fpga_manager struct and use it to take the module's refcount. Modify the functions for registering the manager to take an additional owner module parameter and rename them to avoid conflicts. Use the old function names for helper macros that automatically set the module that registers the manager as the owner. This ensures compatibility with existing low-level control modules and reduces the chances of registering a manager without setting the owner.
Also, update the documentation to keep it consistent with the new interface for registering an fpga manager.
Other changes: opportunistically move put_device() from __fpga_mgr_get() to fpga_mgr_get() and of_fpga_mgr_get() to improve code clarity since the manager device is taken in these functions.(CVE-2024-37021)
In the Linux kernel, the following vulnerability has been resolved:
thermal/drivers/tsens: Fix null pointer dereference
compute_intercept_slope() is called from calibrate_8960() (in tsens-8960.c) as compute_intercept_slope(priv, p1, NULL, ONE_PT_CALIB), which leads to a null pointer dereference (if DEBUG or DYNAMIC_DEBUG is set). Fix this bug by adding a null pointer check.
Found by Linux Verification Center (linuxtesting.org) with SVACE.(CVE-2024-38571)
In the Linux kernel, the following vulnerability has been resolved:
wifi: brcmfmac: pcie: handle randbuf allocation failure
The kzalloc() in brcmf_pcie_download_fw_nvram() returns NULL if physical memory has run out. As a result, using get_random_bytes() to generate random bytes in the randbuf triggers a null pointer dereference.
To rule out the allocation failure, this patch adds a separate function that uses a buffer on the kernel stack to generate the random bytes for the randbuf; placing the buffer in its own function helps keep the kernel stack from overflowing.(CVE-2024-38575)
In the Linux kernel, the following vulnerability has been resolved:
tools/nolibc/stdlib: fix memory error in realloc()
Pass user_p_len to memcpy() instead of heap->len to prevent realloc() from copying an extra sizeof(heap) bytes from beyond the allocated region.(CVE-2024-38585)
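To illustrate why the copy length matters, here is a minimal user-space sketch of an allocator whose header records the total length (header plus user area). The `struct heap_hdr` layout and the `my_alloc()`, `my_free()` and `my_realloc()` helpers are hypothetical and only loosely modeled on the description above. They are not the nolibc source.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical allocator header: len counts the header plus the user area. */
struct heap_hdr {
    size_t len;
    char   user_p[];
};

void *my_alloc(size_t size)
{
    struct heap_hdr *h = malloc(sizeof(*h) + size);

    if (!h)
        return NULL;
    h->len = sizeof(*h) + size;
    return h->user_p;
}

void my_free(void *p)
{
    if (p)
        free((char *)p - offsetof(struct heap_hdr, user_p));
}

void *my_realloc(void *old, size_t new_size)
{
    struct heap_hdr *h;
    size_t user_len;
    void *ret;

    if (!old)
        return my_alloc(new_size);

    h = (struct heap_hdr *)((char *)old - offsetof(struct heap_hdr, user_p));
    user_len = h->len - sizeof(*h);   /* bytes that belong to the caller */
    if (user_len >= new_size)
        return old;                   /* existing area is already big enough */

    ret = my_alloc(new_size);
    if (!ret)
        return NULL;
    memcpy(ret, old, user_len);       /* copying h->len here would over-read */
    my_free(old);
    return ret;
}
```

Copying `h->len` bytes instead of `user_len` would read a header's worth of bytes past the end of the old user area.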
In the Linux kernel, the following vulnerability has been resolved:
media: stk1160: fix bounds checking in stk1160_copy_video()
The subtract in this condition is reversed. The ->length is the length of the buffer. The ->bytesused is how many bytes we have copied thus far. When the condition is reversed that means the result of the subtraction is always negative but since it's unsigned then the result is a very high positive value. That means the overflow check is never true.
Additionally, the ->bytesused doesn't actually work for this purpose because we're not writing to "buf->mem + buf->bytesused". Instead, the math to calculate the destination where we are writing is a bit involved. You calculate the number of full lines already written, multiply by two, skip a line if necessary so that we start on an odd numbered line, and add the offset into the line.
To fix this buffer overflow, take the actual destination offset where we are writing. If that offset is already out of bounds, print an error and return. Otherwise, write up to buf->length bytes.(CVE-2024-38621)
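The unsigned-arithmetic pitfall and the fix described above can be sketched with a generic bounded copy. The `bounded_copy()` helper and its parameters are hypothetical, and the driver's line-interleaving math is deliberately left out. The point is the order of the checks: the offset is compared against the buffer length first, so the later subtraction cannot wrap to a huge positive value.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy up to len bytes into dst at byte offset off, never writing past
 * dst_len. Returns the number of bytes copied, or 0 if off is out of bounds.
 */
size_t bounded_copy(unsigned char *dst, size_t dst_len, size_t off,
                    const unsigned char *src, size_t len)
{
    if (off >= dst_len)          /* check first: dst_len - off would wrap */
        return 0;
    if (len > dst_len - off)     /* safe: off < dst_len at this point */
        len = dst_len - off;
    memcpy(dst + off, src, len);
    return len;
}
```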
In the Linux kernel, the following vulnerability has been resolved:
mm/vmalloc: fix vmalloc which may return null if called with __GFP_NOFAIL
commit a421ef303008 ("mm: allow !GFP_KERNEL allocations for kvmalloc") includes support for __GFP_NOFAIL, but it presents a conflict with commit dd544141b9eb ("vmalloc: back off when the current task is OOM-killed"). A possible scenario is as follows:
process-a
__vmalloc_node_range(GFP_KERNEL | __GFP_NOFAIL)
    __vmalloc_area_node()
        vm_area_alloc_pages()
            --> oom-killer send SIGKILL to process-a
            if (fatal_signal_pending(current)) break;
--> return NULL;
To fix this, do not check fatal_signal_pending() in vm_area_alloc_pages() if __GFP_NOFAIL is set.
This issue occurred during OPLUS KASAN TEST. Below is part of the log:
-> oom-killer sends signal to process
[65731.222840] [ T1308] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/apps/uid_10198,task=gs.intelligence,pid=32454,uid=10198
[65731.259685] [T32454] Call trace:
[65731.259698] [T32454] dump_backtrace+0xf4/0x118
[65731.259734] [T32454] show_stack+0x18/0x24
[65731.259756] [T32454] dump_stack_lvl+0x60/0x7c
[65731.259781] [T32454] dump_stack+0x18/0x38
[65731.259800] [T32454] mrdump_common_die+0x250/0x39c [mrdump]
[65731.259936] [T32454] ipanic_die+0x20/0x34 [mrdump]
[65731.260019] [T32454] atomic_notifier_call_chain+0xb4/0xfc
[65731.260047] [T32454] notify_die+0x114/0x198
[65731.260073] [T32454] die+0xf4/0x5b4
[65731.260098] [T32454] die_kernel_fault+0x80/0x98
[65731.260124] [T32454] __do_kernel_fault+0x160/0x2a8
[65731.260146] [T32454] do_bad_area+0x68/0x148
[65731.260174] [T32454] do_mem_abort+0x151c/0x1b34
[65731.260204] [T32454] el1_abort+0x3c/0x5c
[65731.260227] [T32454] el1h_64_sync_handler+0x54/0x90
[65731.260248] [T32454] el1h_64_sync+0x68/0x6c
[65731.260269] [T32454] z_erofs_decompress_queue+0x7f0/0x2258
--> be->decompressed_pages = kvcalloc(be->nr_pages, sizeof(struct page *), GFP_KERNEL | __GFP_NOFAIL);
    kernel panic by NULL pointer dereference.
    erofs assumes kvmalloc with __GFP_NOFAIL never returns NULL.
[65731.260293] [T32454] z_erofs_runqueue+0xf30/0x104c
[65731.260314] [T32454] z_erofs_readahead+0x4f0/0x968
[65731.260339] [T32454] read_pages+0x170/0xadc
[65731.260364] [T32454] page_cache_ra_unbounded+0x874/0xf30
[65731.260388] [T32454] page_cache_ra_order+0x24c/0x714
[65731.260411] [T32454] filemap_fault+0xbf0/0x1a74
[65731.260437] [T32454] __do_fault+0xd0/0x33c
[65731.260462] [T32454] handle_mm_fault+0xf74/0x3fe0
[65731.260486] [T32454] do_mem_abort+0x54c/0x1b34
[65731.260509] [T32454] el0_da+0x44/0x94
[65731.260531] [T32454] el0t_64_sync_handler+0x98/0xb4
[65731.260553] [T32454] el0t_64_sync+0x198/0x19c(CVE-2024-39474)
In the Linux kernel, the following vulnerability has been resolved:
bcache: fix variable length array abuse in btree_iter
btree_iter is used in two ways: either allocated on the stack with a fixed size MAX_BSETS, or from a mempool with a dynamic size based on the specific cache set. Previously, the struct had a fixed-length array of size MAX_BSETS which was indexed out-of-bounds for the dynamically-sized iterators, which causes UBSAN to complain.
This patch uses the same approach as in bcachefs's sort_iter and splits the iterator into a btree_iter with a flexible array member and a btree_iter_stack which embeds a btree_iter as well as a fixed-length data array.(CVE-2024-39482)
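The split described above can be sketched as follows. The names `struct iter`, `struct iter_stack`, `MAX_SETS`, `iter_stack_init()` and `iter_push()` are hypothetical stand-ins, not the bcache definitions. Embedding a struct that ends in a flexible array member inside another struct is a GNU C extension (accepted by GCC and Clang, and used in the kernel), so this sketch assumes one of those compilers.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SETS 4   /* hypothetical stand-in for MAX_BSETS */

struct iter_set { int start; int end; };

/* Header with a flexible array member: capacity is chosen by the allocator. */
struct iter {
    size_t size;                /* number of slots available in data[] */
    size_t used;                /* number of slots currently filled */
    struct iter_set data[];
};

/* Fixed-capacity variant for stack use: header followed by MAX_SETS slots. */
struct iter_stack {
    struct iter     iter;       /* GNU extension: FAM struct not at the end */
    struct iter_set stack_data[MAX_SETS];
};

void iter_stack_init(struct iter_stack *s)
{
    assert(s != NULL);
    s->iter.size = MAX_SETS;
    s->iter.used = 0;
}

/* Append one set; fails instead of indexing past the recorded capacity. */
int iter_push(struct iter *it, int start, int end)
{
    if (it->used >= it->size)
        return -1;
    it->data[it->used].start = start;
    it->data[it->used].end = end;
    it->used++;
    return 0;
}
```

A dynamically sized iterator would allocate `sizeof(struct iter) + n * sizeof(struct iter_set)` and set `size = n`. Code that only sees a `struct iter *` then works for both variants.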
In the Linux kernel, the following vulnerability has been resolved:
greybus: Fix use-after-free bug in gb_interface_release due to race condition.
In gb_interface_create, &intf->mode_switch_completion is bound with gb_interface_mode_switch_work. Then it will be started by gb_interface_request_mode_switch. Here is the relevant code.
if (!queue_work(system_long_wq, &intf->mode_switch_work)) { ... }
If we call gb_interface_release to clean up, there may be unfinished work. This function will call kfree to free the object "intf". However, if gb_interface_mode_switch_work is scheduled to run after kfree, it may cause a use-after-free error, as gb_interface_mode_switch_work will use the object "intf". The possible execution flow that may lead to the issue is as follows:
CPU0                         CPU1
                          |  gb_interface_create
                          |  gb_interface_request_mode_switch
gb_interface_release      |
kfree(intf) (free)        |
                          |  gb_interface_mode_switch_work
                          |  mutex_lock(&intf->mutex) (use)
Fix it by canceling the work before kfree.(CVE-2024-39495)
In the Linux kernel, the following vulnerability has been resolved:
drm/i915/dpt: Make DPT object unshrinkable
In some scenarios, the DPT object gets shrunk but the actual framebuffer does not, and thus it is still there on the DPT's vm->bound_list. Then it tries to rewrite the PTEs via a stale CPU mapping. This causes a panic.
[vsyrjala: Add TODO comment] (cherry picked from commit 51064d471c53dcc8eddd2333c3f1c1d9131ba36c)(CVE-2024-40924)
In the Linux kernel, the following vulnerability has been resolved:
gve: Clear napi->skb before dev_kfree_skb_any()
gve_rx_free_skb incorrectly leaves napi->skb referencing an skb after it is freed with dev_kfree_skb_any(). This can result in a subsequent call to napi_get_frags returning a dangling pointer.
Fix this by clearing napi->skb before the skb is freed.(CVE-2024-40937)
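The ordering described above (clear the shared pointer, then free) can be sketched in user space. `struct napi_ctx` and `ctx_free_skb()` are hypothetical names and `free()` stands in for the kernel's skb release. This is not the gve driver code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical context holding a cached buffer pointer that others may read. */
struct napi_ctx {
    void *skb;
};

void ctx_free_skb(struct napi_ctx *ctx)
{
    void *skb;

    assert(ctx != NULL);
    skb = ctx->skb;
    ctx->skb = NULL;   /* drop the cached reference before releasing memory */
    free(skb);         /* free(NULL) is a no-op, so a second call is harmless */
}
```

After the call, any later reader of `ctx->skb` sees NULL rather than a pointer to freed memory.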
In the Linux kernel, the following vulnerability has been resolved:
mm: shmem: fix getting incorrect lruvec when replacing a shmem folio
When testing shmem swapin, I encountered the warning below on my machine. The reason is that replacing an old shmem folio with a new one causes mem_cgroup_migrate() to clear the old folio's memcg data. As a result, the old folio cannot get the correct memcg's lruvec needed to remove itself from the LRU list when it is being freed. This could lead to serious problems, such as LRU list crashes due to holding the wrong LRU lock, and incorrect LRU statistics.
To fix this issue, we can fall back to using mem_cgroup_replace_folio() to replace the old shmem folio.
[ 5241.100311] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x5d9960 [ 5241.100317] head: order:4 mapcount:0 entiremapcount:0 nrpagesmapped:0 pincount:0 [ 5241.100319] flags: 0x17fffe0000040068(uptodate|lru|head|swapbacked|node=0|zone=2|lastcpupid=0x3ffff) [ 5241.100323] raw: 17fffe0000040068 fffffdffd6687948 fffffdffd69ae008 0000000000000000 [ 5241.100325] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000 [ 5241.100326] head: 17fffe0000040068 fffffdffd6687948 fffffdffd69ae008 0000000000000000 [ 5241.100327] head: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000 [ 5241.100328] head: 17fffe0000000204 fffffdffd6665801 ffffffffffffffff 0000000000000000 [ 5241.100329] head: 0000000a00000010 0000000000000000 00000000ffffffff 0000000000000000 [ 5241.100330] page dumped because: VMWARNONONCEFOLIO(!memcg && !memcgroupdisabled()) [ 5241.100338] ------------[ cut here ]------------ [ 5241.100339] WARNING: CPU: 19 PID: 78402 at include/linux/memcontrol.h:775 foliolruveclockirqsave+0x140/0x150 [...] [ 5241.100374] pc : foliolruveclockirqsave+0x140/0x150 [ 5241.100375] lr : foliolruveclockirqsave+0x138/0x150 [ 5241.100376] sp : ffff80008b38b930 [...] 
[ 5241.100398] Call trace: [ 5241.100399] foliolruveclockirqsave+0x140/0x150 [ 5241.100401] _pagecacherelease+0x90/0x300 [ 5241.100404] _folioput+0x50/0x108 [ 5241.100406] shmemreplacefolio+0x1b4/0x240 [ 5241.100409] shmemswapinfolio+0x314/0x528 [ 5241.100411] shmemgetfoliogfp+0x3b4/0x930 [ 5241.100412] shmemfault+0x74/0x160 [ 5241.100414] _dofault+0x40/0x218 [ 5241.100417] dosharedfault+0x34/0x1b0 [ 5241.100419] dofault+0x40/0x168 [ 5241.100420] handleptefault+0x80/0x228 [ 5241.100422] _handlemmfault+0x1c4/0x440 [ 5241.100424] handlemmfault+0x60/0x1f0 [ 5241.100426] dopagefault+0x120/0x488 [ 5241.100429] dotranslationfault+0x4c/0x68 [ 5241.100431] domemabort+0x48/0xa0 [ 5241.100434] el0da+0x38/0xc0 [ 5241.100436] el0t64synchandler+0x68/0xc0 [ 5241.100437] el0t64sync+0x14c/0x150 [ 5241.100439] ---[ end trace 0000000000000000 ]---
[baolin.wang@linux.alibaba.com: remove less helpful comments, per Matthew] Link: https://lkml.kernel.org/r/ccad3fe1375b468ebca3227b6b729f3eaf9d8046.1718423197.git.baolin.wang@linux.alibaba.com(CVE-2024-40949)
In the Linux kernel, the following vulnerability has been resolved:
ipv6: prevent possible NULL deref in fib6_nh_init()
syzbot reminds us that in6_dev_get() can return NULL.
fib6_nh_init()
    ip6_validate_gw( &idev )
        ip6_route_check_nh( idev )
            *idev = in6_dev_get(dev); // can be NULL
Oops: general protection fault, probably for non-canonical address 0xdffffc00000000bc: 0000 [#1] PREEMPT SMP KASAN PTI KASAN: null-ptr-deref in range [0x00000000000005e0-0x00000000000005e7] CPU: 0 PID: 11237 Comm: syz-executor.3 Not tainted 6.10.0-rc2-syzkaller-00249-gbe27b8965297 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/07/2024 RIP: 0010:fib6nhinit+0x640/0x2160 net/ipv6/route.c:3606 Code: 00 00 fc ff df 4c 8b 64 24 58 48 8b 44 24 28 4c 8b 74 24 30 48 89 c1 48 89 44 24 28 48 8d 98 e0 05 00 00 48 89 d8 48 c1 e8 03 <42> 0f b6 04 38 84 c0 0f 85 b3 17 00 00 8b 1b 31 ff 89 de e8 b8 8b RSP: 0018:ffffc900032775a0 EFLAGS: 00010202 RAX: 00000000000000bc RBX: 00000000000005e0 RCX: 0000000000000000 RDX: 0000000000000010 RSI: ffffc90003277a54 RDI: ffff88802b3a08d8 RBP: ffffc900032778b0 R08: 00000000000002fc R09: 0000000000000000 R10: 00000000000002fc R11: 0000000000000000 R12: ffff88802b3a08b8 R13: 1ffff9200064eec8 R14: ffffc90003277a00 R15: dffffc0000000000 FS: 00007f940feb06c0(0000) GS:ffff8880b9400000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 00000000245e8000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> ip6routeinfocreate+0x99e/0x12b0 net/ipv6/route.c:3809 ip6routeadd+0x28/0x160 net/ipv6/route.c:3853 ipv6routeioctl+0x588/0x870 net/ipv6/route.c:4483 inet6ioctl+0x21a/0x280 net/ipv6/afinet6.c:579 sockdoioctl+0x158/0x460 net/socket.c:1222 sockioctl+0x629/0x8e0 net/socket.c:1341 vfsioctl fs/ioctl.c:51 [inline] _dosysioctl fs/ioctl.c:907 [inline] _sesysioctl+0xfc/0x170 fs/ioctl.c:893 dosyscallx64 arch/x86/entry/common.c:52 [inline] dosyscall64+0xf3/0x230 arch/x86/entry/common.c:83 entrySYSCALL64after_hwframe+0x77/0x7f RIP: 0033:0x7f940f07cea9(CVE-2024-40961)
In the Linux kernel, the following vulnerability has been resolved:
i2c: lpi2c: Avoid calling clk_get_rate during transfer
Instead of repeatedly calling clk_get_rate for each transfer, lock the clock rate and cache the value. A deadlock has been observed while adding the tlv320aic32x4 audio codec to the system. When this clock provider adds its clock, the clk mutex is already locked; it needs to access i2c, which in turn needs the mutex for clk_get_rate as well.(CVE-2024-40965)
In the Linux kernel, the following vulnerability has been resolved:
KVM: arm64: Disassociate vcpus from redistributor region on teardown
When tearing down a redistributor region, make sure we don't have any dangling pointer to that region stored in a vcpu.(CVE-2024-40989)
In the Linux kernel, the following vulnerability has been resolved:
io_uring/sqpoll: work around a potential audit memory leak
kmemleak complains that there's a memory leak related to connect handling:
unreferenced object 0xffff0001093bdf00 (size 128):
comm "iou-sqp-455", pid 457, jiffies 4294894164
hex dump (first 32 bytes):
02 00 fa ea 7f 00 00 01 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace (crc 2e481b1a):
[<00000000c0a26af4>] kmemleak_alloc+0x30/0x38
[<000000009c30bb45>] kmalloc_trace+0x228/0x358
[<000000009da9d39f>] __audit_sockaddr+0xd0/0x138
[<0000000089a93e34>] move_addr_to_kernel+0x1a0/0x1f8
[<000000000b4e80e6>] io_connect_prep+0x1ec/0x2d4
[<00000000abfbcd99>] io_submit_sqes+0x588/0x1e48
[<00000000e7c25e07>] io_sq_thread+0x8a4/0x10e4
[<00000000d999b491>] ret_from_fork+0x10/0x20
which can happen if:
1) The command type does something on the prep side that triggers an audit call.
2) The thread hasn't done any operations before this that triggered an audit call inside ->issue(), where we have audit_uring_entry() and audit_uring_exit().
Work around this by issuing a blanket NOP operation before the SQPOLL does anything.(CVE-2024-41001)
In the Linux kernel, the following vulnerability has been resolved:
mm: vmalloc: check if a hash-index is in cpu_possible_mask
The problem is that there are systems where cpu_possible_mask has gaps between set CPUs, for example SPARC. In this scenario the addr_to_vb_xa() hash function can return an index that accesses a not-possible, not-set-up CPU area through the per_cpu() macro. This results in an oops on SPARC.
A per-cpu vmap_block_queue is also used as a hash table, incorrectly assuming that cpu_possible_mask has no gaps. Fix it by adjusting the index to the next possible CPU.(CVE-2024-41032)
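The idea of mapping a hash index onto a CPU that actually exists can be sketched with a plain bitmask. `NR_CPU_IDS`, `cpu_possible()` and `hash_to_possible_cpu()` here are hypothetical user-space stand-ins, not the kernel's cpumask API. Wrapping around to the lowest set bit is a simplification chosen for this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPU_IDS 8u   /* hypothetical number of CPU ids */

/* Return nonzero if cpu is set in the possible mask (bit i means CPU i). */
int cpu_possible(uint32_t mask, unsigned int cpu)
{
    return (mask >> cpu) & 1u;
}

/*
 * Map a hash value to a CPU index that is guaranteed to be possible.
 * The mask must have at least one bit set below NR_CPU_IDS.
 */
unsigned int hash_to_possible_cpu(uint32_t mask, unsigned long hash)
{
    unsigned int cpu = hash % NR_CPU_IDS;

    assert(mask & ((1u << NR_CPU_IDS) - 1));
    while (!cpu_possible(mask, cpu))
        cpu = (cpu + 1) % NR_CPU_IDS;   /* step to the next id, wrapping */
    return cpu;
}
```

With the mask 0xA5 (CPUs 0, 2, 5 and 7 possible), a hash that lands on CPU 1 is redirected to CPU 2 instead of touching per-CPU data that was never set up.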
In the Linux kernel, the following vulnerability has been resolved:
net: ntb_netdev: Move ntb_netdev_rx_handler() to call netif_rx() from __netif_rx()
The following is emitted when using idxd (DSA) dmaengine as the data mover for the ntb_transport that ntb_netdev uses.
[74412.546922] BUG: using smp_processor_id() in preemptible [00000000] code: irq/52-idxd-por/14526
[74412.556784] caller is netif_rx_internal+0x42/0x130
[74412.562282] CPU: 6 PID: 14526 Comm: irq/52-idxd-por Not tainted 6.9.5 #5
[74412.569870] Hardware name: Intel Corporation ArcherCity/ArcherCity, BIOS EGSDCRB1.E9I.1752.P05.2402080856 02/08/2024
[74412.581699] Call Trace:
[74412.584514] <TASK>
[74412.586933] dump_stack_lvl+0x55/0x70
[74412.591129] check_preemption_disabled+0xc8/0xf0
[74412.596374] netif_rx_internal+0x42/0x130
[74412.600957] __netif_rx+0x20/0xd0
[74412.604743] ntb_netdev_rx_handler+0x66/0x150 [ntb_netdev]
[74412.610985] ntb_complete_rxc+0xed/0x140 [ntb_transport]
[74412.617010] ntb_rx_copy_callback+0x53/0x80 [ntb_transport]
[74412.623332] idxd_dma_complete_txd+0xe3/0x160 [idxd]
[74412.628963] idxd_wq_thread+0x1a6/0x2b0 [idxd]
[74412.634046] irq_thread_fn+0x21/0x60
[74412.638134] ? irq_thread+0xa8/0x290
[74412.642218] irq_thread+0x1a0/0x290
[74412.646212] ? __pfx_irq_thread_fn+0x10/0x10
[74412.651071] ? __pfx_irq_thread_dtor+0x10/0x10
[74412.656117] ? __pfx_irq_thread+0x10/0x10
[74412.660686] kthread+0x100/0x130
[74412.664384] ? __pfx_kthread+0x10/0x10
[74412.668639] ret_from_fork+0x31/0x50
[74412.672716] ? __pfx_kthread+0x10/0x10
[74412.676978] ret_from_fork_asm+0x1a/0x30
[74412.681457] </TASK>
The cause is that the idxd driver's interrupt completion handler uses a threaded interrupt, and the threaded handler does not run in hard or soft interrupt context. However, __netif_rx() can only be called from interrupt context. Change the call to netif_rx() in order to allow completion via normal context for dmaengine drivers that utilize threaded irq handling.
While commit baebdf48c360 ("net: dev: Makes sure netif_rx() can be invoked in any context.") changed the call from netif_rx() to __netif_rx(), that change should have been a no-op instead. However, the code preceding this fix should have been using netif_rx_ni() or netif_rx_any_context().(CVE-2024-42110)
In the Linux kernel, the following vulnerability has been resolved:
mm: page_ref: remove folio_try_get_rcu()
The below bug was reported on a non-SMP kernel:
[ 275.267158][ T4335] ------------[ cut here ]------------ [ 275.267949][ T4335] kernel BUG at include/linux/pageref.h:275! [ 275.268526][ T4335] invalid opcode: 0000 [#1] KASAN PTI [ 275.269001][ T4335] CPU: 0 PID: 4335 Comm: trinity-c3 Not tainted 6.7.0-rc4-00061-gefa7df3e3bb5 #1 [ 275.269787][ T4335] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014 [ 275.270679][ T4335] RIP: 0010:trygetfolio (include/linux/pageref.h:275 (discriminator 3) mm/gup.c:79 (discriminator 3)) [ 275.272813][ T4335] RSP: 0018:ffffc90005dcf650 EFLAGS: 00010202 [ 275.273346][ T4335] RAX: 0000000000000246 RBX: ffffea00066e0000 RCX: 0000000000000000 [ 275.274032][ T4335] RDX: fffff94000cdc007 RSI: 0000000000000004 RDI: ffffea00066e0034 [ 275.274719][ T4335] RBP: ffffea00066e0000 R08: 0000000000000000 R09: fffff94000cdc006 [ 275.275404][ T4335] R10: ffffea00066e0037 R11: 0000000000000000 R12: 0000000000000136 [ 275.276106][ T4335] R13: ffffea00066e0034 R14: dffffc0000000000 R15: ffffea00066e0008 [ 275.276790][ T4335] FS: 00007fa2f9b61740(0000) GS:ffffffff89d0d000(0000) knlGS:0000000000000000 [ 275.277570][ T4335] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 275.278143][ T4335] CR2: 00007fa2f6c00000 CR3: 0000000134b04000 CR4: 00000000000406f0 [ 275.278833][ T4335] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 275.279521][ T4335] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 275.280201][ T4335] Call Trace: [ 275.280499][ T4335] <TASK> [ 275.280751][ T4335] ? die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434 arch/x86/kernel/dumpstack.c:447) [ 275.281087][ T4335] ? dotrap (arch/x86/kernel/traps.c:112 arch/x86/kernel/traps.c:153) [ 275.281463][ T4335] ? trygetfolio (include/linux/pageref.h:275 (discriminator 3) mm/gup.c:79 (discriminator 3)) [ 275.281884][ T4335] ? trygetfolio (include/linux/pageref.h:275 (discriminator 3) mm/gup.c:79 (discriminator 3)) [ 275.282300][ T4335] ? 
doerrortrap (arch/x86/kernel/traps.c:174) [ 275.282711][ T4335] ? trygetfolio (include/linux/pageref.h:275 (discriminator 3) mm/gup.c:79 (discriminator 3)) [ 275.283129][ T4335] ? handleinvalidop (arch/x86/kernel/traps.c:212) [ 275.283561][ T4335] ? trygetfolio (include/linux/pageref.h:275 (discriminator 3) mm/gup.c:79 (discriminator 3)) [ 275.283990][ T4335] ? excinvalidop (arch/x86/kernel/traps.c:264) [ 275.284415][ T4335] ? asmexcinvalidop (arch/x86/include/asm/idtentry.h:568) [ 275.284859][ T4335] ? trygetfolio (include/linux/pageref.h:275 (discriminator 3) mm/gup.c:79 (discriminator 3)) [ 275.285278][ T4335] trygrabfolio (mm/gup.c:148) [ 275.285684][ T4335] getuserpages (mm/gup.c:1297 (discriminator 1)) [ 275.286111][ T4335] ? _pfxgetuserpages (mm/gup.c:1188) [ 275.286579][ T4335] ? pfxvalidatechain (kernel/locking/lockdep.c:3825) [ 275.287034][ T4335] ? marklock (kernel/locking/lockdep.c:4656 (discriminator 1)) [ 275.287416][ T4335] _guplongtermlocked (mm/gup.c:1509 mm/gup.c:2209) [ 275.288192][ T4335] ? _pfxguplongtermlocked (mm/gup.c:2204) [ 275.288697][ T4335] ? pfxlockacquire (kernel/locking/lockdep.c:5722) [ 275.289135][ T4335] ? _pfxmightresched (kernel/sched/core.c:10106) [ 275.289595][ T4335] pinuserpagesremote (mm/gup.c:3350) [ 275.290041][ T4335] ? _pfxpinuserpagesremote (mm/gup.c:3350) [ 275.290545][ T4335] ? findheldlock (kernel/locking/lockdep.c:5244 (discriminator 1)) [ 275.290961][ T4335] ? mmaccess (kernel/fork.c:1573) [ 275.291353][ T4335] processvmrwsinglevec+0x142/0x360 [ 275.291900][ T4335] ? _pfxprocessvmrwsinglevec+0x10/0x10 [ 275.292471][ T4335] ? mmaccess (kernel/fork.c:1573) [ 275.292859][ T4335] processvmrwcore+0x272/0x4e0 [ 275.293384][ T4335] ? hlockclass (a ---truncated---(CVE-2024-42251)
In the Linux kernel, the following vulnerability has been resolved:
f2fs: fix null reference error when checking end of zone
This patch fixes a potential null pointer access in is_end_zone_blkaddr(), which checks the last block of a zone, when f2fs is mounted as a single device.(CVE-2024-43857)
In the Linux kernel, the following vulnerability has been resolved:
perf: Fix event leak upon exit
When a task is scheduled out, pending sigtrap deliveries are deferred to the target task upon resume to userspace via task_work.
However, failures while adding an event's callback to the task_work engine are ignored. Since the last call for events exit happens after task work is eventually closed, there is a small window during which a pending sigtrap can be queued though ignored, leaking the event refcount addition, as in the following scenario:
TASK A
-----
do_exit()
exit_task_work(tsk);
<IRQ>
perf_event_overflow()
event->pending_sigtrap = pending_id;
irq_work_queue(&event->pending_irq);
</IRQ>
=========> PREEMPTION: TASK A -> TASK B
event_sched_out()
event->pending_sigtrap = 0;
atomic_long_inc_not_zero(&event->refcount)
// FAILS: task work has exited
task_work_add(&event->pending_task)
[...]
<IRQ WORK>
perf_pending_irq()
// early return: event->oncpu = -1
</IRQ WORK>
[...]
=========> TASK B -> TASK A
perf_event_exit_task(tsk)
perf_event_exit_event()
free_event()
WARN(atomic_long_cmpxchg(&event->refcount, 1, 0) != 1)
// leak event due to unexpected refcount == 2
As a result the event is never released while the task exits.
Fix this with appropriate error handling for task_work_add().(CVE-2024-43870)
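The leak and its remedy can be modeled with a plain counter: a reference is taken on behalf of the deferred work, so it must be given back when queueing that work fails. This is a simplified user-space sketch. `struct event`, `queue_task_work()` and `defer_sigtrap()` are hypothetical names and the real perf code uses atomic operations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct event {
    long refcount;
    bool pending_work;
};

/* Hypothetical stand-in for task_work_add(): fails once the task is exiting. */
int queue_task_work(bool task_exiting)
{
    return task_exiting ? -1 : 0;
}

void defer_sigtrap(struct event *ev, bool task_exiting)
{
    assert(ev != NULL);
    ev->refcount++;                   /* reference held by the queued work */
    if (queue_task_work(task_exiting) == 0) {
        ev->pending_work = true;      /* the work will drop the reference */
        return;
    }
    ev->refcount--;                   /* queueing failed: return the reference */
}
```

Without the final decrement, a failed queue attempt would leave the count one too high and the event would never be freed.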
In the Linux kernel, the following vulnerability has been resolved:
PCI: endpoint: Clean up error handling in vpci_scan_bus()
Smatch complains about inconsistent NULL checking in vpci_scan_bus():
drivers/pci/endpoint/functions/pci-epf-vntb.c:1024 vpci_scan_bus() error: we previously assumed 'vpci_bus' could be null (see line 1021)
Instead of printing an error message and then crashing we should return an error code and clean up.
Also the NULL check is reversed so it prints an error for success instead of failure.(CVE-2024-43875)
In the Linux kernel, the following vulnerability has been resolved:
PCI: rcar: Demote WARN() to dev_warn_ratelimited() in rcar_pcie_wakeup()
Avoid large backtrace, it is sufficient to warn the user that there has been a link problem. Either the link has failed and the system is in need of maintenance, or the link continues to work and user has been informed. The message from the warning can be looked up in the sources.
This makes an actual link issue less verbose.
First of all, this controller has a limitation in that the controller driver has to assist the hardware with transition to L1 link state by writing L1IATN to PMCTRL register, the L1 and L0 link state switching is not fully automatic on this controller.
In case of an ASMedia ASM1062 PCIe SATA controller which does not support ASPM, on entry to suspend or during platform pm_test, the SATA controller enters D3hot state and the link enters L1 state. If the SATA controller wakes up before rcar_pcie_wakeup() was called and returns to D0, the link returns to L0 before the controller driver even started its transition to L1 link state. At this point, the SATA controller did send a PM_ENTER_L1 DLLP to the PCIe controller and the PCIe controller received it, and the PCIe controller did set the PMSR PMEL1RX bit.
Once rcar_pcie_wakeup() is called, if the link is already back in L0 state and the PMEL1RX bit is set, the controller driver has no way to determine if it should perform the link transition to L1 state, or treat the link as if it is in L0 state. Currently the driver attempts to perform the transition to L1 link state unconditionally, which in this specific case fails with a PMSR L1FAEG poll timeout; however, the link still works as it is already back in L0 state.
Reduce this warning verbosity. In case the link is really broken, rcar_pcie_config_access() would fail; otherwise it will succeed and any system with this controller and ASM1062 can suspend without generating a backtrace.(CVE-2024-43876)
In the Linux kernel, the following vulnerability has been resolved:
media: pci: ivtv: Add check for DMA map result
In case DMA fails, 'dma->SG_length' is 0. This value is later used to access 'dma->SGarray[dma->SG_length - 1]', which will cause an out-of-bounds access.
Add check to return early on invalid value. Adjust warnings accordingly.
Found by Linux Verification Center (linuxtesting.org) with SVACE.(CVE-2024-43877)
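The early-return check described above can be sketched generically. `struct sg_entry` and `last_entry_size()` are hypothetical names rather than the ivtv structures. With a length of 0, the index `sg_len - 1` wraps to the largest `size_t` value, so the length has to be tested before it is used.

```c
#include <stddef.h>

struct sg_entry {
    unsigned int size;
};

/* Return the size of the last entry, or -1 if the mapping produced none. */
long last_entry_size(const struct sg_entry *sg, size_t sg_len)
{
    if (sg_len == 0)        /* sg[sg_len - 1] would index far out of bounds */
        return -1;
    return (long)sg[sg_len - 1].size;
}
```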
In the Linux kernel, the following vulnerability has been resolved:
mlxsw: spectrum_acl_erp: Fix object nesting warning
ACLs in Spectrum-2 and newer ASICs can reside in the algorithmic TCAM (A-TCAM) or in the ordinary circuit TCAM (C-TCAM). The former can contain more ACLs (i.e., tc filters), but the number of masks in each region (i.e., tc chain) is limited.
In order to mitigate the effects of the above limitation, the device allows filters to share a single mask if their masks only differ in up to 8 consecutive bits. For example, dst_ip/25 can be represented using dst_ip/24 with a delta of 1 bit. The C-TCAM does not have a limit on the number of masks being used (and therefore does not support mask aggregation), but can contain a limited number of filters.
The driver uses the "objagg" library to perform the mask aggregation by passing it objects that consist of the filter's mask and whether the filter is to be inserted into the A-TCAM or the C-TCAM since filters in different TCAMs cannot share a mask.
The set of created objects is dependent on the insertion order of the filters and is not necessarily optimal. Therefore, the driver will periodically ask the library to compute a more optimal set ("hints") by looking at all the existing objects.
When the library asks the driver whether two objects can be aggregated the driver only compares the provided masks and ignores the A-TCAM / C-TCAM indication. This is the right thing to do since the goal is to move as many filters as possible to the A-TCAM. The driver also forbids two identical masks from being aggregated since this can only happen if one was intentionally put in the C-TCAM to avoid a conflict in the A-TCAM.
The above can result in the following set of hints:
H1: {mask X, A-TCAM} -> H2: {mask Y, A-TCAM} // X is Y + delta
H3: {mask Y, C-TCAM} -> H4: {mask Z, A-TCAM} // Y is Z + delta
After getting the hints from the library the driver will start migrating filters from one region to another while consulting the computed hints and instructing the device to perform a lookup in both regions during the transition.
Assuming a filter with mask X is being migrated into the A-TCAM in the new region, the hints lookup will return H1. Since H2 is the parent of H1, the library will try to find the object associated with it and create it if necessary in which case another hints lookup (recursive) will be performed. This hints lookup for {mask Y, A-TCAM} will either return H2 or H3 since the driver passes the library an object comparison function that ignores the A-TCAM / C-TCAM indication.
This can eventually lead to nested objects which are not supported by the library [1].
Fix by removing the object comparison function from both the driver and the library as the driver was the only user. That way the lookup will only return exact matches.
I do not have a reliable reproducer that can reproduce the issue in a timely manner, but before the fix the issue would reproduce in several minutes and with the fix it does not reproduce in over an hour.
Note that the current usefulness of the hints is limited because they include the C-TCAM indication and represent aggregation that cannot actually happen. This will be addressed in net-next.
[1]
WARNING: CPU: 0 PID: 153 at lib/objagg.c:170 objagg_obj_parent_assign+0xb5/0xd0
Modules linked in:
CPU: 0 PID: 153 Comm: kworker/0:18 Not tainted 6.9.0-rc6-custom-g70fbc2c1c38b #42
Hardware name: Mellanox Technologies Ltd. MSN3700C/VMOD0008, BIOS 5.11 10/10/2018
Workqueue: mlxsw_core mlxsw_sp_acl_tcam_vregion_rehash_work
RIP: 0010:objagg_obj_parent_assign+0xb5/0xd0
[...]
Call Trace:
 <TASK>
 __objagg_obj_get+0x2bb/0x580
 objagg_obj_get+0xe/0x80
 mlxsw_sp_acl_erp_mask_get+0xb5/0xf0
 mlxsw_sp_acl_atcam_entry_add+0xe8/0x3c0
 mlxsw_sp_acl_tcam_entry_create+0x5e/0xa0
 mlxsw_sp_acl_tcam_vchunk_migrate_one+0x16b/0x270
 mlxsw_sp_acl_tcam_vregion_rehash_work+0xbe/0x510
 process_one_work+0x151/0x370(CVE-2024-43880)
In the Linux kernel, the following vulnerability has been resolved:
wifi: ath12k: change DMA direction while mapping reinjected packets
For fragmented packets, ath12k reassembles each fragment as a normal packet and then reinjects it into HW ring. In this case, the DMA direction should be DMA_TO_DEVICE, not DMA_FROM_DEVICE. Otherwise, an invalid payload may be reinjected into the HW and subsequently delivered to the host.
Given that arbitrary memory can be allocated to the skb buffer, knowledge about the data contained in the reinjected buffer is lacking. Consequently, there’s a risk of private information being leaked.
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.1.1-00209-QCAHKSWPL_SILICONZ-1(CVE-2024-43881)
In the Linux kernel, the following vulnerability has been resolved:
xen: privcmd: Switch from mutex to spinlock for irqfds
irqfd_wakeup() gets EPOLLHUP when it is called by eventfd_release() by way of wake_up_poll(&ctx->wqh, EPOLLHUP), which gets called under spin_lock_irqsave(). We can't use a mutex here as it will lead to a deadlock.
Fix it by switching over to a spin lock.(CVE-2024-44957)
In the Linux kernel, the following vulnerability has been resolved:
tick/broadcast: Move per CPU pointer access into the atomic section
The recent fix for making the take over of the broadcast timer more reliable retrieves a per CPU pointer in preemptible context.
This went unnoticed as compilers hoist the access into the non-preemptible region where the pointer is actually used. But of course it's valid that the compiler keeps it at the place where the code puts it which rightfully triggers:
BUG: using smp_processor_id() in preemptible [00000000] code:
     caller is hotplug_cpu__broadcast_tick_pull+0x1c/0xc0
Move it to the actual usage site which is in a non-preemptible region.(CVE-2024-44968)
In the Linux kernel, the following vulnerability has been resolved:
btrfs: do not clear page dirty inside extent_write_locked_range()
[BUG] For subpage + zoned case, the following workload can lead to rsv data leak at unmount time:
# mkfs.btrfs -f -s 4k $dev
# mount $dev $mnt
# fsstress -w -n 8 -d $mnt -s 1709539240
0/0: fiemap - no filename
0/1: copyrange read - no filename
0/2: write - no filename
0/3: rename - no source filename
0/4: creat f0 x:0 0 0
0/4: creat add id=0,parent=-1
0/5: writev f0[259 1 0 0 0 0] [778052,113,965] 0
0/6: ioctl(FIEMAP) f0[259 1 0 0 224 887097] [1294220,2291618343991484791,0x10000] -1
0/7: dwrite - xfsctl(XFS_IOC_DIOINFO) f0[259 1 0 0 224 887097] return 25, fallback to stat()
0/7: dwrite f0[259 1 0 0 224 887097] [696320,102400] 0
# umount $mnt
The dmesg includes the following rsv leak detection warning (all call trace skipped):
------------[ cut here ]------------ WARNING: CPU: 2 PID: 4528 at fs/btrfs/inode.c:8653 btrfsdestroyinode+0x1e0/0x200 [btrfs] ---[ end trace 0000000000000000 ]--- ------------[ cut here ]------------ WARNING: CPU: 2 PID: 4528 at fs/btrfs/inode.c:8654 btrfsdestroyinode+0x1a8/0x200 [btrfs] ---[ end trace 0000000000000000 ]--- ------------[ cut here ]------------ WARNING: CPU: 2 PID: 4528 at fs/btrfs/inode.c:8660 btrfsdestroyinode+0x1a0/0x200 [btrfs] ---[ end trace 0000000000000000 ]--- BTRFS info (device sda): last unmount of filesystem 1b4abba9-de34-4f07-9e7f-157cf12a18d6 ------------[ cut here ]------------ WARNING: CPU: 3 PID: 4528 at fs/btrfs/block-group.c:4434 btrfsfreeblockgroups+0x338/0x500 [btrfs] ---[ end trace 0000000000000000 ]--- BTRFS info (device sda): spaceinfo DATA has 268218368 free, is not full BTRFS info (device sda): spaceinfo total=268435456, used=204800, pinned=0, reserved=0, mayuse=12288, readonly=0 zoneunusable=0 BTRFS info (device sda): globalblockrsv: size 0 reserved 0 BTRFS info (device sda): transblockrsv: size 0 reserved 0 BTRFS info (device sda): chunkblockrsv: size 0 reserved 0 BTRFS info (device sda): delayedblockrsv: size 0 reserved 0 BTRFS info (device sda): delayedrefsrsv: size 0 reserved 0 ------------[ cut here ]------------ WARNING: CPU: 3 PID: 4528 at fs/btrfs/block-group.c:4434 btrfsfreeblockgroups+0x338/0x500 [btrfs] ---[ end trace 0000000000000000 ]--- BTRFS info (device sda): spaceinfo METADATA has 267796480 free, is not full BTRFS info (device sda): spaceinfo total=268435456, used=131072, pinned=0, reserved=0, mayuse=262144, readonly=0 zoneunusable=245760 BTRFS info (device sda): globalblockrsv: size 0 reserved 0 BTRFS info (device sda): transblockrsv: size 0 reserved 0 BTRFS info (device sda): chunkblockrsv: size 0 reserved 0 BTRFS info (device sda): delayedblockrsv: size 0 reserved 0 BTRFS info (device sda): delayedrefsrsv: size 0 reserved 0
Above $dev is a tcmu-runner emulated zoned HDD, which has a max zone append size of 64K, and the system has 64K page size.
[CAUSE] I have added several trace_printk() to show the events (header skipped):
> btrfs_dirty_pages: r/i=5/259 dirty start=774144 len=114688
> btrfs_dirty_pages: r/i=5/259 dirty part of page=720896 off_in_page=53248 len_in_page=12288
> btrfs_dirty_pages: r/i=5/259 dirty part of page=786432 off_in_page=0 len_in_page=65536
> btrfs_dirty_pages: r/i=5/259 dirty part of page=851968 off_in_page=0 len_in_page=36864
The above lines show our buffered write has dirtied 3 pages of inode 259 of root 5:
704K             768K              832K              896K
I           |////I/////////////////I///////////|     I
            756K                               868K
|///| is the dirtied range using subpage bitmaps, and 'I' is the page boundary.
Meanwhile all three pages (704K, 768K, 832K) have their PageDirty flag set.
> btrfs_direct_write: r/i=5/259 start dio filepos=696320 len=102400
Then direct IO writ ---truncated---(CVE-2024-44972)
In the Linux kernel, the following vulnerability has been resolved:
cgroup/cpuset: fix panic caused by partcmd_update
We find a bug as below: BUG: unable to handle page fault for address: 00000003 PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 3 PID: 358 Comm: bash Tainted: G W I 6.6.0-10893-g60d6 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/4 RIP: 0010:partitionscheddomainslocked+0x483/0x600 Code: 01 48 85 d2 74 0d 48 83 05 29 3f f8 03 01 f3 48 0f bc c2 89 c0 48 9 RSP: 0018:ffffc90000fdbc58 EFLAGS: 00000202 RAX: 0000000100000003 RBX: ffff888100b3dfa0 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000002fe80 RBP: ffff888100b3dfb0 R08: 0000000000000001 R09: 0000000000000000 R10: ffffc90000fdbcb0 R11: 0000000000000004 R12: 0000000000000002 R13: ffff888100a92b48 R14: 0000000000000000 R15: 0000000000000000 FS: 00007f44a5425740(0000) GS:ffff888237d80000(0000) knlGS:0000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000100030973 CR3: 000000010722c000 CR4: 00000000000006e0 Call Trace: <TASK> ? showregs+0x8c/0xa0 ? _diebody+0x23/0xa0 ? _die+0x3a/0x50 ? pagefaultoops+0x1d2/0x5c0 ? partitionscheddomainslocked+0x483/0x600 ? searchmoduleextables+0x2a/0xb0 ? searchexceptiontables+0x67/0x90 ? kernelmodefixuporoops+0x144/0x1b0 ? _badareanosemaphore+0x211/0x360 ? upread+0x3b/0x50 ? badareanosemaphore+0x1a/0x30 ? excpagefault+0x890/0xd90 ? _lockacquire.constprop.0+0x24f/0x8d0 ? _lockacquire.constprop.0+0x24f/0x8d0 ? asmexcpagefault+0x26/0x30 ? partitionscheddomainslocked+0x483/0x600 ? partitionscheddomainslocked+0xf0/0x600 rebuildscheddomainslocked+0x806/0xdc0 updatepartitionsdlb+0x118/0x130 cpusetwriteresmask+0xffc/0x1420 cgroupfilewrite+0xb2/0x290 kernfsfopwriteiter+0x194/0x290 newsyncwrite+0xeb/0x160 vfswrite+0x16f/0x1d0 ksyswrite+0x81/0x180 _x64syswrite+0x21/0x30 x64syscall+0x2f25/0x4630 dosyscall64+0x44/0xb0 entrySYSCALL64afterhwframe+0x78/0xe2 RIP: 0033:0x7f44a553c887
It can be reproduced with the following commands:
cd /sys/fs/cgroup/
mkdir test
cd test/
echo +cpuset > ../cgroup.subtree_control
echo root > cpuset.cpus.partition
cat /sys/fs/cgroup/cpuset.cpus.effective
0-3
echo 0-3 > cpuset.cpus // taking away all cpus from root
This issue is caused by the incorrect rebuilding of scheduling domains. In this scenario, test/cpuset.cpus.partition should be an invalid root and should not trigger the rebuilding of scheduling domains. When calling update_parent_effective_cpumask() with partcmd_update, if newmask is not null, it should recheck whether newmask leaves any CPUs available for the parent/cs that has tasks.(CVE-2024-44975)
In the Linux kernel, the following vulnerability has been resolved:
net: mana: Fix RX buf alloc_size alignment and atomic op panic
The MANA driver's RX buffer alloc_size is passed into napi_build_skb() to create SKB. skb_shinfo(skb) is located at the end of skb, and its alignment is affected by the alloc_size passed into napi_build_skb(). The size needs to be aligned properly for better performance and atomic operations. Otherwise, on ARM64 CPU, for certain MTU settings like 4000, atomic operations may panic on the skb_shinfo(skb)->dataref due to alignment fault.
To fix this bug, add proper alignment to the alloc_size calculation.
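The effect of an unaligned allocation size on a structure placed at the end of the buffer can be sketched as follows. This is only an illustration: the 64-byte alignment, the 320-byte trailer size and the 4334-byte buffer size are hypothetical numbers, not values taken from the MANA driver, and `align_up` and `trailer_offset` are made-up helper names.

```python
def align_up(size, alignment):
    """Round size up to the next multiple of alignment (a power of two)."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (size + alignment - 1) & ~(alignment - 1)


def trailer_offset(alloc_size, trailer_size, alignment=64):
    """Offset of a trailer placed at the end of the buffer.

    The trailer's own size is rounded up first, so the offset is aligned
    only if alloc_size itself is aligned.
    """
    return alloc_size - align_up(trailer_size, alignment)


# Hypothetical numbers: an unaligned size leaves the trailer misaligned,
# while rounding the size up first restores the alignment.
unaligned = trailer_offset(4334, 320)
aligned = trailer_offset(align_up(4334, 64), 320)
```

Here `unaligned % 64` is non-zero while `aligned % 64` is zero, which mirrors why atomic operations on a field inside the trailing structure can fault on architectures that require natural alignment.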
Sample panic info:
[ 253.298819] Unable to handle kernel paging request at virtual address ffff000129ba5cce
[ 253.300900] Mem abort info:
[ 253.301760] ESR = 0x0000000096000021
[ 253.302825] EC = 0x25: DABT (current EL), IL = 32 bits
[ 253.304268] SET = 0, FnV = 0
[ 253.305172] EA = 0, S1PTW = 0
[ 253.306103] FSC = 0x21: alignment fault
Call trace:
 __skb_clone+0xfc/0x198
 skb_clone+0x78/0xe0
 raw6_local_deliver+0xfc/0x228
 ip6_protocol_deliver_rcu+0x80/0x500
 ip6_input_finish+0x48/0x80
 ip6_input+0x48/0xc0
 ip6_sublist_rcv_finish+0x50/0x78
 ip6_sublist_rcv+0x1cc/0x2b8
 ipv6_list_rcv+0x100/0x150
 __netif_receive_skb_list_core+0x180/0x220
 netif_receive_skb_list_internal+0x198/0x2a8
 __napi_poll+0x138/0x250
 net_rx_action+0x148/0x330
 handle_softirqs+0x12c/0x3a0(CVE-2024-45001)
In the Linux kernel, the following vulnerability has been resolved:
KVM: s390: fix validity interception issue when gisa is switched off
We might run into a SIE validity if gisa has been disabled either via using kernel parameter "kvm.use_gisa=0" or by setting the related sysfs attribute to N (echo N >/sys/module/kvm/parameters/use_gisa).
The validity is caused by an invalid value in the SIE control block's gisa designation. That happens because we pass the uninitialized gisa origin to virt_to_phys() before writing it to the gisa designation.
To fix this we return 0 in kvm_s390_get_gisa_desc() if the origin is 0. kvm_s390_get_gisa_desc() is used to determine which gisa designation to set in the SIE control block. A value of 0 in the gisa designation disables gisa usage.
The issue surfaces in the host kernel with the following kernel message as soon as a new KVM guest start is attempted.
kvm: unhandled validity intercept 0x1011 WARNING: CPU: 0 PID: 781237 at arch/s390/kvm/intercept.c:101 kvmhandlesieintercept+0x42e/0x4d0 [kvm] Modules linked in: vhostnet tap tun xtCHECKSUM xtMASQUERADE xtconntrack iptREJECT xttcpudp nftcompat xtables nfnattftp nfconntracktftp vfiopcicore irqbypass vhostvsock vmwvsockvirtiotransportcommon vsock vhost vhostiotlb kvm nftfibinet nftfibipv4 nftfibipv6 nftfib nftrejectinet nfrejectipv4 nfrejectipv6 nftreject nftct nftchainnat nfnat nfconntrack nfdefragipv6 nfdefragipv4 ipset nftables sunrpc mlx5ib ibuverbs ibcore mlx5core uvdevice s390trng eadmsch vfioccw zcryptcex4 mdev vfioiommutype1 vfio schfqcodel drm i2ccore loop drmpanelorientationquirks configfs nfnetlink lcs ctcm fsm dmservicetime ghashs390 prng chachas390 libchacha aess390 dess390 libdes sha3512s390 sha3256s390 sha512s390 sha256s390 sha1s390 shacommon dmmirror dmregionhash dmlog zfcp scsitransportfc scsidhrdac scsidhemc scsidhalua pkey zcrypt dmmultipath rngcore autofs4 [last unloaded: vfiopci] CPU: 0 PID: 781237 Comm: CPU 0/KVM Not tainted 6.10.0-08682-gcad9f11498ea #6 Hardware name: IBM 3931 A01 701 (LPAR) Krnl PSW : 0704c00180000000 000003d93deb0122 (kvmhandlesieintercept+0x432/0x4d0 [kvm]) R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3 Krnl GPRS: 000003d900000027 000003d900000023 0000000000000028 000002cd00000000 000002d063a00900 00000359c6daf708 00000000000bebb5 0000000000001eff 000002cfd82e9000 000002cfd80bc000 0000000000001011 000003d93deda412 000003ff8962df98 000003d93de77ce0 000003d93deb011e 00000359c6daf960 Krnl Code: 000003d93deb0112: c020fffe7259 larl %r2,000003d93de7e5c4 000003d93deb0118: c0e53fa8beac brasl %r14,000003d9bd3c7e70 #000003d93deb011e: af000000 mc 0,0 >000003d93deb0122: a728ffea lhi %r2,-22 000003d93deb0126: a7f4fe24 brc 15,000003d93deafd6e 000003d93deb012a: 9101f0b0 tm 176(%r15),1 000003d93deb012e: a774fe48 brc 7,000003d93deafdbe 000003d93deb0132: 40a0f0ae sth %r10,174(%r15) Call Trace: [<000003d93deb0122>] 
kvmhandlesieintercept+0x432/0x4d0 [kvm] ([<000003d93deb011e>] kvmhandlesieintercept+0x42e/0x4d0 [kvm]) [<000003d93deacc10>] vcpupostrun+0x1d0/0x3b0 [kvm] [<000003d93deaceda>] _vcpurun+0xea/0x2d0 [kvm] [<000003d93dead9da>] kvmarchvcpuioctlrun+0x16a/0x430 [kvm] [<000003d93de93ee0>] kvmvcpuioctl+0x190/0x7c0 [kvm] [<000003d9bd728b4e>] vfsioctl+0x2e/0x70 [<000003d9bd72a092>] _s390xsysioctl+0xc2/0xd0 [<000003d9be0e9222>] _dosyscall+0x1f2/0x2e0 [<000003d9be0f9a90>] systemcall+0x70/0x98 Last Breaking-Event-Address: [<000003d9bd3c7f58>] _warn_printk+0xe8/0xf0(CVE-2024-45005)
In the Linux kernel, the following vulnerability has been resolved:
char: xillybus: Don't destroy workqueue from work item running on it
Triggered by a kref decrement, destroy_workqueue() may be called from within a work item for destroying its own workqueue. This illegal situation is averted by adding a module-global workqueue for exclusive use of the offending work item. Other work items continue to be queued on per-device workqueues to ensure performance.(CVE-2024-45007)
In the Linux kernel, the following vulnerability has been resolved:
nouveau/firmware: use dma non-coherent allocator
Currently, enabling SG_DEBUG in the kernel will cause nouveau to hit a BUG() on startup, when the iommu is enabled:
kernel BUG at include/linux/scatterlist.h:187! invalid opcode: 0000 [#1] PREEMPT SMP NOPTI CPU: 7 PID: 930 Comm: (udev-worker) Not tainted 6.9.0-rc3Lyude-Test+ #30 Hardware name: MSI MS-7A39/A320M GAMING PRO (MS-7A39), BIOS 1.I0 01/22/2019 RIP: 0010:sginitone+0x85/0xa0 Code: 69 88 32 01 83 e1 03 f6 c3 03 75 20 a8 01 75 1e 48 09 cb 41 89 54 24 08 49 89 1c 24 41 89 6c 24 0c 5b 5d 41 5c e9 7b b9 88 00 <0f> 0b 0f 0b 0f 0b 48 8b 05 5e 46 9a 01 eb b2 66 66 2e 0f 1f 84 00 RSP: 0018:ffffa776017bf6a0 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffffa77600d87000 RCX: 000000000000002b RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffffa77680d87000 RBP: 000000000000e000 R08: 0000000000000000 R09: 0000000000000000 R10: ffff98f4c46aa508 R11: 0000000000000000 R12: ffff98f4c46aa508 R13: ffff98f4c46aa008 R14: ffffa77600d4a000 R15: ffffa77600d4a018 FS: 00007feeb5aae980(0000) GS:ffff98f5c4dc0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f22cb9a4520 CR3: 00000001043ba000 CR4: 00000000003506f0 Call Trace: <TASK> ? die+0x36/0x90 ? dotrap+0xdd/0x100 ? sginitone+0x85/0xa0 ? doerrortrap+0x65/0x80 ? sginitone+0x85/0xa0 ? excinvalidop+0x50/0x70 ? sginitone+0x85/0xa0 ? asmexcinvalidop+0x1a/0x20 ? sginitone+0x85/0xa0 nvkmfirmwarector+0x14a/0x250 [nouveau] nvkmfalconfwctor+0x42/0x70 [nouveau] ga102gspbooterctor+0xb4/0x1a0 [nouveau] r535gsponeinit+0xb3/0x15f0 [nouveau] ? srsoreturnthunk+0x5/0x5f ? srsoreturnthunk+0x5/0x5f ? nvkmudevicenew+0x95/0x140 [nouveau] ? srsoreturnthunk+0x5/0x5f ? srsoreturnthunk+0x5/0x5f ? ktime_get+0x47/0xb0
Fix this by using the non-coherent allocator instead. I think there might be a better answer to this, but it would involve ripping up some of the APIs that use sg lists.(CVE-2024-45012)
In the Linux kernel, the following vulnerability has been resolved:
mm/vmalloc: fix page mapping if vm_area_alloc_pages() with high order fallback to order 0
The __vmap_pages_range_noflush() assumes its argument pages** contains pages with the same page shift. However, since commit e9c3cda4d86e ("mm, vmalloc: fix high order __GFP_NOFAIL allocations"), if gfp_flags includes __GFP_NOFAIL with high order in vm_area_alloc_pages() and page allocation failed for high order, the pages** may contain two different page shifts (high order and order-0). This could lead __vmap_pages_range_noflush() to perform incorrect mappings, potentially resulting in memory corruption.
Users might encounter this as follows (vmap_allow_huge = true, 2M is for PMD_SIZE):
kvmalloc(2M, __GFP_NOFAIL|GFP_X)
    __vmalloc_node_range_noprof(vm_flags=VM_ALLOW_HUGE_VMAP)
        vm_area_alloc_pages(order=9) ---> order-9 allocation failed and fallback to order-0
        vmap_pages_range()
            vmap_pages_range_noflush()
                __vmap_pages_range_noflush(page_shift = 21) ----> wrong mapping happens
We can remove the fallback code because if a high-order allocation fails, __vmalloc_node_range_noprof() will retry with order-0, so it is unnecessary to fall back to order-0 here. Fix this by removing the fallback code.(CVE-2024-45022)
In the Linux kernel, the following vulnerability has been resolved:
wifi: brcmfmac: cfg80211: Handle SSID based pmksa deletion
Since commit 1efdba5fdc2c ("Handle PMKSA flush in the driver for SAE/OWE offload cases"), wpa_supplicant 2.11 sends SSID-based PMKSA del commands. brcmfmac is not prepared for this and tries to dereference the NULL bssid and pmkid pointers in cfg80211_pmksa. PMKID_V3 operations support SSID-based updates, so copy the SSID.(CVE-2024-46672)
In the Linux kernel, the following vulnerability has been resolved:
Bluetooth: btnxpuart: Fix random crash seen while removing driver
This fixes the random kernel crash seen while removing the driver, when running the load/unload test over multiple iterations.
1) modprobe btnxpuart
2) hciconfig hci0 reset
3) hciconfig (check hci0 interface up with valid BD address)
4) modprobe -r btnxpuart
Repeat steps 1 to 4
The ps_wakeup() call in btnxpuart_close() schedules the psdata->work(), which gets scheduled after the module is removed, causing a kernel crash.
This hidden issue got highlighted after enabling Power Save by default in 4183a7be7700 (Bluetooth: btnxpuart: Enable Power Save feature on startup).
The new ps_cleanup() deasserts UART break immediately while closing the serdev device, cancels any scheduled ps_work and destroys the ps_lock mutex.
[ 85.884604] Unable to handle kernel paging request at virtual address ffffd4a61638f258 [ 85.884624] Mem abort info: [ 85.884625] ESR = 0x0000000086000007 [ 85.884628] EC = 0x21: IABT (current EL), IL = 32 bits [ 85.884633] SET = 0, FnV = 0 [ 85.884636] EA = 0, S1PTW = 0 [ 85.884638] FSC = 0x07: level 3 translation fault [ 85.884642] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000041dd0000 [ 85.884646] [ffffd4a61638f258] pgd=1000000095fff003, p4d=1000000095fff003, pud=100000004823d003, pmd=100000004823e003, pte=0000000000000000 [ 85.884662] Internal error: Oops: 0000000086000007 [#1] PREEMPT SMP [ 85.890932] Modules linked in: algifhash algifskcipher afalg overlay fsljruio caamjr caamkeyblobdesc caamhashdesc caamalgdesc cryptoengine authenc libdes crct10difce polyvalce polyvalgeneric sndsocimxspdif sndsocimxcard sndsocak5558 sndsocak4458 caam secvio error sndsocfslspdif sndsocfslmicfil sndsocfslsai sndsocfslutils gpioirrecv rccore fuse [last unloaded: btnxpuart(O)] [ 85.927297] CPU: 1 PID: 67 Comm: kworker/1:3 Tainted: G O 6.1.36+g937b1be4345a #1 [ 85.936176] Hardware name: FSL i.MX8MM EVK board (DT) [ 85.936182] Workqueue: events 0xffffd4a61638f380 [ 85.936198] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 85.952817] pc : 0xffffd4a61638f258 [ 85.952823] lr : 0xffffd4a61638f258 [ 85.952827] sp : ffff8000084fbd70 [ 85.952829] x29: ffff8000084fbd70 x28: 0000000000000000 x27: 0000000000000000 [ 85.963112] x26: ffffd4a69133f000 x25: ffff4bf1c8540990 x24: ffff4bf215b87305 [ 85.963119] x23: ffff4bf215b87300 x22: ffff4bf1c85409d0 x21: ffff4bf1c8540970 [ 85.977382] x20: 0000000000000000 x19: ffff4bf1c8540880 x18: 0000000000000000 [ 85.977391] x17: 0000000000000000 x16: 0000000000000133 x15: 0000ffffe2217090 [ 85.977399] x14: 0000000000000001 x13: 0000000000000133 x12: 0000000000000139 [ 85.977407] x11: 0000000000000001 x10: 0000000000000a60 x9 : ffff8000084fbc50 [ 85.977417] x8 : ffff4bf215b7d000 x7 : ffff4bf215b83b40 x6 : 00000000000003e8 [ 
85.977424] x5 : 00000000410fd030 x4 : 0000000000000000 x3 : 0000000000000000 [ 85.977432] x2 : 0000000000000000 x1 : ffff4bf1c4265880 x0 : 0000000000000000 [ 85.977443] Call trace: [ 85.977446] 0xffffd4a61638f258 [ 85.977451] 0xffffd4a61638f3e8 [ 85.977455] processonework+0x1d4/0x330 [ 85.977464] workerthread+0x6c/0x430 [ 85.977471] kthread+0x108/0x10c [ 85.977476] retfrom_fork+0x10/0x20 [ 85.977488] Code: bad PC value [ 85.977491] ---[ end trace 0000000000000000 ]---
Present since v6.9.11(CVE-2024-46680)
In the Linux kernel, the following vulnerability has been resolved:
soc: qcom: pmic_glink: Fix race during initialization
As pointed out by Stephen Boyd, it is possible that during initialization of the pmic_glink child drivers, the protection-domain notifier fires and the associated work is scheduled before the client registration returns, and therefore before the local "client" pointer has been initialized.
The outcome of this is a NULL pointer dereference as the "client" pointer is blindly dereferenced.
Timeline provided by Stephen:
CPU0                               CPU1
----                               ----
ucsi->client = NULL;
devm_pmic_glink_register_client()
  client->pdr_notify(client->priv, pg->client_state)
    pmic_glink_ucsi_pdr_notify()
      schedule_work(&ucsi->register_work)
      <schedule away>
                                   pmic_glink_ucsi_register()
                                     ucsi_register()
                                       pmic_glink_ucsi_read_version()
                                         pmic_glink_ucsi_read()
                                           pmic_glink_ucsi_read()
                                             pmic_glink_send(ucsi->client)
                                             <client is NULL BAD>
ucsi->client = client // Too late!
This code is identical across the altmode, battery manager and ucsi child drivers.
Resolve this by splitting the allocation of the "client" object and the registration thereof into two operations.
This only happens if the protection domain registry is populated at the time of registration, which by the introduction of commit '1ebcde047c54 ("soc: qcom: add pd-mapper implementation")' became much more likely.(CVE-2024-46693)
In the Linux kernel, the following vulnerability has been resolved:
drm/amd/display: avoid using null object of framebuffer
Instead of using state->fb->obj[0] directly, get the object from the framebuffer by calling drm_gem_fb_get_obj() and return an error code when the object is null, to avoid using a null framebuffer object.
(cherry picked from commit 73dd0ad9e5dad53766ea3e631303430116f834b3)(CVE-2024-46694)
In the Linux kernel, the following vulnerability has been resolved:
mptcp: pm: fix ID 0 endp usage after multiple re-creations
'local_addr_used' and 'add_addr_accepted' are decremented for addresses not related to the initial subflow (ID0), because the source and destination addresses of the initial subflows are known from the beginning: they don't count as "additional local address being used" or "ADD_ADDR being accepted".
It is then required not to increment them when the entrypoint used by the initial subflow is removed and re-added during a connection. Without this modification, this entrypoint cannot be removed and re-added more than once.(CVE-2024-46711)
In the Linux kernel, the following vulnerability has been resolved:
misc: fastrpc: Fix double free of 'buf' in error path
smatch warning: drivers/misc/fastrpc.c:1926 fastrpc_req_mmap() error: double free of 'buf'
In the fastrpc_req_mmap() error path, the fastrpc buffer is freed in fastrpc_req_munmap_impl() if unmap is successful.
But in the end, there is an unconditional call to fastrpc_buf_free(). So the above case triggers the double free of the fastrpc buf.(CVE-2024-46741)
In the Linux kernel, the following vulnerability has been resolved:
net: hns3: void array out of bound when loop tnl_num
When querying the register info of the SSU, the driver loops tnl_num times. However, tnl_num comes from hardware and the length of the array is a fixed value. To avoid an out-of-bounds array access, make sure the loop count is not greater than the length of the array.(CVE-2024-46833)
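The idea of bounding a hardware-supplied count by the size of a fixed array can be sketched as below. This is a minimal illustration, not the hns3 code: `read_tunnel_regs` and its parameters are hypothetical names.

```python
def read_tunnel_regs(hw_tnl_num, reg_table):
    """Read at most len(reg_table) entries, whatever the hardware reports.

    hw_tnl_num models a count supplied by hardware and is therefore
    untrusted; reg_table models the fixed-size array.
    """
    count = max(0, min(hw_tnl_num, len(reg_table)))
    return [reg_table[i] for i in range(count)]
```

A reported count larger than the table is simply clamped, so the loop never indexes past the last element.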
In the Linux kernel, the following vulnerability has been resolved:
mm: vmalloc: ensure vmap_block is initialised before adding to queue
Commit 8c61291fd850 ("mm: fix incorrect vbq reference in purge_fragmented_block") extended the 'vmap_block' structure to contain a 'cpu' field which is set at allocation time to the id of the initialising CPU.
When a new 'vmap_block' is being instantiated by new_vmap_block(), the partially initialised structure is added to the local 'vmap_block_queue' xarray before the 'cpu' field has been initialised. If another CPU is concurrently walking the xarray (e.g. via vm_unmap_aliases()), then it may perform an out-of-bounds access to the remote queue thanks to an uninitialised index.
This has been observed as UBSAN errors in Android:
| Internal error: UBSAN: array index out of bounds: 00000000f2005512 [#1] PREEMPT SMP
|
| Call trace:
|  purge_fragmented_block+0x204/0x21c
|  _vm_unmap_aliases+0x170/0x378
|  vm_unmap_aliases+0x1c/0x28
|  change_memory_common+0x1dc/0x26c
|  set_memory_ro+0x18/0x24
|  module_enable_ro+0x98/0x238
|  do_init_module+0x1b0/0x310
Move the initialisation of 'vb->cpu' in new_vmap_block() ahead of the addition to the xarray.(CVE-2024-46847)
In the Linux kernel, the following vulnerability has been resolved:
x86/hyperv: fix kexec crash due to VP assist page corruption
Commit 9636be85cc5b ("x86/hyperv: Fix hyperv_pcpu_input_arg handling when CPUs go online/offline") introduces a new cpuhp state for hyperv initialization.
cpuhp_setup_state() returns the state number if state is CPUHP_AP_ONLINE_DYN or CPUHP_BP_PREPARE_DYN and 0 for all other states. For the hyperv case, since a new cpuhp state was introduced it would return 0. However, in hv_machine_shutdown(), the cpuhp_remove_state() call is conditioned upon "hyperv_init_cpuhp > 0". This will never be true and so hv_cpu_die() won't be called on all CPUs. This means the VP assist page won't be reset. When the kexec kernel tries to setup the VP assist page again, the hypervisor corrupts the memory region of the old VP assist page causing a panic in case the kexec kernel is using that memory elsewhere. This was originally fixed in commit dfe94d4086e4 ("x86/hyperv: Fix kexec panic/hang issues").
Get rid of hyperv_init_cpuhp entirely since we are no longer using a dynamic cpuhp state and use CPUHP_AP_HYPERV_ONLINE directly with cpuhp_remove_state().(CVE-2024-46864)
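The return-value convention described above, and why the "> 0" guard stops working once a fixed state is used, can be modelled in a few lines. This is a toy model of the success path only: the state names are plain strings, the slot number 200 is a hypothetical value, and none of the functions below are kernel APIs.

```python
# Hypothetical stand-ins for the dynamic cpuhp state identifiers.
DYNAMIC_STATES = {"CPUHP_AP_ONLINE_DYN", "CPUHP_BP_PREPARE_DYN"}


def cpuhp_setup_state_model(state, allocated_slot=200):
    """Model of the described convention: a positive state number is
    returned only for dynamic states, 0 for every other state."""
    if state in DYNAMIC_STATES:
        return allocated_slot  # hypothetical dynamically allocated number
    return 0


def old_shutdown_removes_state(saved_ret):
    """Old guard: teardown only runs when the saved return value is > 0."""
    return saved_ret > 0


def fixed_shutdown_removes_state():
    """Fixed behaviour: the fixed state is removed unconditionally."""
    return True
```

With a fixed state such as CPUHP_AP_HYPERV_ONLINE the model returns 0, so the old guard never fires and the per-CPU teardown is skipped; removing the guard avoids depending on the return value at all.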
In the Linux kernel, the following vulnerability has been resolved:
fou: fix initialization of grc
The grc must be initialized first. There can be a condition where, if fou is NULL, goto out will be executed and grc would be used uninitialized.(CVE-2024-46865)
In the Linux kernel, the following vulnerability has been resolved:
crypto: hisilicon/qm - inject error before stopping queue
The master ooo cannot be completely closed when the accelerator core reports memory error. Therefore, the driver needs to inject the qm error to close the master ooo. Currently, the qm error is injected after stopping queue, memory may be released immediately after stopping queue, causing the device to access the released memory. Therefore, error is injected to close master ooo before stopping queue to ensure that the device does not access the released memory.(CVE-2024-47730)
In the Linux kernel, the following vulnerability has been resolved:
RDMA/hns: Fix spin_unlock_irqrestore() called with IRQs enabled
Fix misuse of spin_lock_irq()/spin_unlock_irq() when spin_lock_irqsave()/spin_unlock_irqrestore() was held.
This was discovered through the lock debugging, and the corresponding log is as follows:
rawlocalirqrestore() called with IRQs enabled WARNING: CPU: 96 PID: 2074 at kernel/locking/irqflag-debug.c:10 warnbogusirqrestore+0x30/0x40 ... Call trace: warnbogusirqrestore+0x30/0x40 _rawspinunlockirqrestore+0x84/0xc8 addqptolist+0x11c/0x148 [hnsrocehwv2] hnsrocecreateqpcommon.constprop.0+0x240/0x780 [hnsrocehwv2] hnsrocecreateqp+0x98/0x160 [hnsrocehwv2] createqp+0x138/0x258 ibcreateqpkernel+0x50/0xe8 createmadqp+0xa8/0x128 ibmadportopen+0x218/0x448 ibmadinitdevice+0x70/0x1f8 addclientcontext+0xfc/0x220 enabledeviceandget+0xd0/0x140 ibregisterdevice.part.0+0xf4/0x1c8 ibregisterdevice+0x34/0x50 hnsroceregisterdevice+0x174/0x3d0 [hnsrocehwv2] hnsroceinit+0xfc/0x2c0 [hnsrocehwv2] _hnsrocehwv2initinstance+0x7c/0x1d0 [hnsrocehwv2] hnsrocehwv2initinstance+0x9c/0x180 hnsrocehwv2
In the Linux kernel, the following vulnerability has been resolved:
rxrpc: Fix a race between socket set up and I/O thread creation
In rxrpc_open_socket(), it sets up the socket and then sets up the I/O thread that will handle it. This is a problem, however, as there's a gap between the two phases in which a packet may come into rxrpc_encap_rcv() from the UDP packet but we oops when trying to wake the not-yet created I/O thread.
As a quick fix, just make rxrpc_encap_rcv() discard the packet if there's no I/O thread yet.
A better, but more intrusive fix would perhaps be to rearrange things such that the socket creation is done by the I/O thread.(CVE-2024-49864)
In the Linux kernel, the following vulnerability has been resolved:
bpf: Fix a sdiv overflow issue
Zac Ecob reported a problem where a bpf program may cause kernel crash due to the following error: Oops: divide error: 0000 [#1] PREEMPT SMP KASAN PTI
The failure is due to the signed divide LLONG_MIN/-1, where LLONG_MIN equals -9,223,372,036,854,775,808. LLONG_MIN/-1 is supposed to give the positive number 9,223,372,036,854,775,808, but that is impossible since on a 64-bit system the maximum positive number is 9,223,372,036,854,775,807. On x86_64, LLONG_MIN/-1 will cause a kernel exception. On arm64, the result for LLONG_MIN/-1 is LLONG_MIN.
Further investigation found that all the following sdiv/smod cases may trigger an exception when a bpf program is running on the x86_64 platform:
- LLONG_MIN/-1 for 64bit operation
- INT_MIN/-1 for 32bit operation
- LLONG_MIN%-1 for 64bit operation
- INT_MIN%-1 for 32bit operation
where -1 can be an immediate or in a register.
On arm64, there are no exceptions:
- LLONG_MIN/-1 = LLONG_MIN
- INT_MIN/-1 = INT_MIN
- LLONG_MIN%-1 = 0
- INT_MIN%-1 = 0
where -1 can be an immediate or in a register.
Insn patching is needed to handle the above cases, and the patched code produces results aligned with the arm64 results above. Below is pseudo code to handle sdiv/smod exceptions, covering both divisor -1 and divisor 0, with the divisor stored in a register.
sdiv:
    tmp = rX
    tmp += 1 /* [-1, 0] -> [0, 1] */
    if tmp >(unsigned) 1 goto L2
    if tmp == 0 goto L1
    rY = 0
L1:
    rY = -rY;
    goto L3
L2:
    rY /= rX
L3:
smod:
    tmp = rX
    tmp += 1 /* [-1, 0] -> [0, 1] */
    if tmp >(unsigned) 1 goto L1
    if tmp == 1 (is64 ? goto L2 : goto L3)
    rY = 0;
    goto L2
L1:
    rY %= rX
L2:
    goto L4 // only when !is64
L3:
    wY = wY // only when !is64
L4:
[1] https://lore.kernel.org/bpf/tPJLTEh7SDxFEqAI2Ji5MBSoZVg7G-Py2iaZpAaWtM961fFTWtsnlzwvTbzBzaUzwQAoNATXKUlt0LZOFgnDcIyKCswAnAGdUF3LBrhGQ=@protonmail.com/(CVE-2024-49888)
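The 64-bit results that the patched instructions are meant to produce can be modelled in plain C. This is a sketch of the semantics only, not the verifier's instruction patching; the function names `sdiv64_sketch()` and `smod64_sketch()` are made up for illustration.

```c
#include <stdint.h>

/* Signed divide with the special cases from the pseudo code:
 * divisor 0 yields 0, divisor -1 yields the (wrapping) negation. */
int64_t sdiv64_sketch(int64_t y, int64_t x)
{
    if (x == 0)
        return 0;
    if (x == -1)
        /* Negate in unsigned arithmetic so LLONG_MIN wraps back to
         * LLONG_MIN instead of overflowing (assumes two's complement). */
        return (int64_t)(0 - (uint64_t)y);
    return y / x;
}

/* Signed modulo: divisor 0 leaves the dividend unchanged,
 * divisor -1 yields 0, matching the arm64 results listed above. */
int64_t smod64_sketch(int64_t y, int64_t x)
{
    if (x == 0)
        return y;
    if (x == -1)
        return 0;
    return y % x;
}
```

Neither function ever executes a hardware divide with a divisor of 0 or -1, which is the property the patched code needs on x86_64.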
In the Linux kernel, the following vulnerability has been resolved:
rcu-tasks: Fix access non-existent percpu rtpcp variable in rcu_tasks_need_gpcb()
For kernels built with CONFIG_FORCE_NR_CPUS=y, nr_cpu_ids is defined as NR_CPUS instead of the number of possible CPUs. This causes the following system panic:
smpboot: Allowing 4 CPUs, 0 hotplug CPUs ... setuppercpu: NRCPUS:512 nrcpumaskbits:512 nrcpuids:512 nrnodeids:1 ... BUG: unable to handle page fault for address: ffffffff9911c8c8 Oops: 0000 [#1] PREEMPT SMP PTI CPU: 0 PID: 15 Comm: rcutaskstrace Tainted: G W 6.6.21 #1 5dc7acf91a5e8e9ac9dcfc35bee0245691283ea6 RIP: 0010:rcutasksneedgpcb+0x25d/0x2c0 RSP: 0018:ffffa371c00a3e60 EFLAGS: 00010082 CR2: ffffffff9911c8c8 CR3: 000000040fa20005 CR4: 00000000001706f0 Call Trace: <TASK> ? _die+0x23/0x80 ? pagefaultoops+0xa4/0x180 ? excpagefault+0x152/0x180 ? asmexcpagefault+0x26/0x40 ? rcutasksneedgpcb+0x25d/0x2c0 ? _pfxrcutaskskthread+0x40/0x40 rcutasksonegp+0x69/0x180 rcutaskskthread+0x94/0xc0 kthread+0xe8/0x140 ? _pfxkthread+0x40/0x40 retfromfork+0x34/0x80 ? _pfxkthread+0x40/0x40 retfromforkasm+0x1b/0x80 </TASK>
Considering that there may be holes in the CPU numbers, use the maximum possible CPU number, instead of nr_cpu_ids, for configuring enqueue and dequeue limits.
neeraj.upadhyay: Fix htmldocs build error reported by Stephen Rothwell
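To illustrate why the limit should come from the highest possible CPU number rather than from a count or a build-time ceiling, here is a small sketch. It uses a plain 64-bit mask instead of the kernel's cpumask API, and `max_possible_cpu_sketch()` is a hypothetical helper.

```c
#include <stdint.h>

/* Returns the highest CPU number set in a 64-bit "possible" mask,
 * or -1 if the mask is empty. Holes in the numbering are fine. */
int max_possible_cpu_sketch(uint64_t possible_mask)
{
    int max = -1;

    for (int cpu = 0; cpu < 64; cpu++) {
        if (possible_mask & (UINT64_C(1) << cpu))
            max = cpu;
    }
    return max;
}
```

With CPUs 0, 1, 4 and 5 possible (mask 0x33), the count is 4 but the highest valid index is 5. A loop bounded by a ceiling such as 512 would walk far past any per-CPU data that actually exists.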
In the Linux kernel, the following vulnerability has been resolved:
wifi: rtw89: avoid to add interface to list twice when SER
If SER L2 occurs during the WoWLAN resume flow, the add interface flow is triggered by ieee80211_reconfig(). However, because rtw89_wow_resume() returns failure, the add interface flow is executed again, resulting in a double list add and a kernel panic. Therefore, add a check to prevent adding the interface to the list twice.
listadd double add: new=ffff99d6992e2010, prev=ffff99d6992e2010, next=ffff99d695302628. ------------[ cut here ]------------ kernel BUG at lib/listdebug.c:37! invalid opcode: 0000 [#1] PREEMPT SMP NOPTI CPU: 0 PID: 9 Comm: kworker/0:1 Tainted: G W O 6.6.30-02659-gc18865c4dfbd #1 770df2933251a0e3c888ba69d1053a817a6376a7 Hardware name: HP Grunt/Grunt, BIOS GoogleGrunt.11031.169.0 06/24/2021 Workqueue: eventsfreezable ieee80211restartwork [mac80211] RIP: 0010:_listaddvalidorreport+0x5e/0xb0 Code: c7 74 18 48 39 ce 74 13 b0 01 59 5a 5e 5f 41 58 41 59 41 5a 5d e9 e2 d6 03 00 cc 48 c7 c7 8d 4f 17 83 48 89 c2 e8 02 c0 00 00 <0f> 0b 48 c7 c7 aa 8c 1c 83 e8 f4 bf 00 00 0f 0b 48 c7 c7 c8 bc 12 RSP: 0018:ffffa91b8007bc50 EFLAGS: 00010246 RAX: 0000000000000058 RBX: ffff99d6992e0900 RCX: a014d76c70ef3900 RDX: ffffa91b8007bae8 RSI: 00000000ffffdfff RDI: 0000000000000001 RBP: ffffa91b8007bc88 R08: 0000000000000000 R09: ffffa91b8007bae0 R10: 00000000ffffdfff R11: ffffffff83a79800 R12: ffff99d695302060 R13: ffff99d695300900 R14: ffff99d6992e1be0 R15: ffff99d6992e2010 FS: 0000000000000000(0000) GS:ffff99d6aac00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000078fbdba43480 CR3: 000000010e464000 CR4: 00000000001506f0 Call Trace: <TASK> ? _diebody+0x1f/0x70 ? die+0x3d/0x60 ? dotrap+0xa4/0x110 ? _listaddvalidorreport+0x5e/0xb0 ? doerrortrap+0x6d/0x90 ? _listaddvalidorreport+0x5e/0xb0 ? handleinvalidop+0x30/0x40 ? _listaddvalidorreport+0x5e/0xb0 ? excinvalidop+0x3c/0x50 ? asmexcinvalidop+0x16/0x20 ? _listaddvalidorreport+0x5e/0xb0 rtw89opsaddinterface+0x309/0x310 [rtw89core 7c32b1ee6854761c0321027c8a58c5160e41f48f] drvaddinterface+0x5c/0x130 [mac80211 83e989e6e616bd5b4b8a2b0a9f9352a2c385a3bc] ieee80211reconfig+0x241/0x13d0 [mac80211 83e989e6e616bd5b4b8a2b0a9f9352a2c385a3bc] ? finishwait+0x3e/0x90 ? synchronizercuexpedited+0x174/0x260 ? syncrcuexpdoneunlocked+0x50/0x50 ? 
wakebitfunction+0x40/0x40 ieee80211restartwork+0xf0/0x140 [mac80211 83e989e6e616bd5b4b8a2b0a9f9352a2c385a3bc] processscheduledworks+0x1e5/0x480 workerthread+0xea/0x1e0 kthread+0xdb/0x110 ? movelinkedworks+0x90/0x90 ? kthreadassociateblkcg+0xa0/0xa0 retfromfork+0x3b/0x50 ? kthreadassociateblkcg+0xa0/0xa0 retfromforkasm+0x11/0x20 </TASK> Modules linked in: dmintegrity asyncxor xor asynctx lz4 lz4compress zstd zstdcompress zram zsmalloc rfcomm cmac uinput algifhash algifskcipher afalg btusb btrtl iiotrighrtimer industrialioswtrigger btmtk industrialioconfigfs btbcm btintel uvcvideo videobuf2vmalloc iiotrigsysfs videobuf2memops videobuf2v4l2 videobuf2common uvc sndhdacodechdmi veth sndhdaintel sndinteldspcfg acpials sndhdacodec industrialiotriggeredbuffer kfifobuf sndhwdep industrialio i2cpiix4 sndhdacore designwarei2s ip6tablenat sndsocmax98357a xtMASQUERADE xtcgroup sndsocacprt5682mach fuse rtw898922ae(O) rtw898922a(O) rtw89pci(O) rtw89core(O) 8021q mac80211(O) bluetooth ecdhgeneric ecc cfg80211 r8152 mii joydev gsmi: Log Shutdown Reason 0x03 ---[ end trace 0000000000000000 ]---(CVE-2024-49939)
In the Linux kernel, the following vulnerability has been resolved:
ppp: do not assume bh is held in ppp_channel_bridge_input()
The networking receive path is usually handled from a BH handler. However, some protocols need to acquire the socket lock, and packets might be stored in the socket backlog if the socket was owned by a user process.
In this case, release_sock(), __release_sock(), and sk_backlog_rcv() might call the sk->sk_backlog_rcv() handler in process context.
syzbot caught that ppp was not considering this case in ppp_channel_bridge_input():
WARNING: inconsistent lock state
inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. ksoftirqd/1/24 [HC0[0]:SC1[1]:HE1:SE0] takes: ffff0000db7f11e0 (&pch->downl){+.?.}-{2:2}, at: spinlock include/linux/spinlock.h:351 [inline] ffff0000db7f11e0 (&pch->downl){+.?.}-{2:2}, at: pppchannelbridgeinput drivers/net/ppp/pppgeneric.c:2272 [inline] ffff0000db7f11e0 (&pch->downl){+.?.}-{2:2}, at: pppinput+0x16c/0x854 drivers/net/ppp/pppgeneric.c:2304 {SOFTIRQ-ON-W} state was registered at: lockacquire+0x240/0x728 kernel/locking/lockdep.c:5759 _rawspinlock include/linux/spinlockapismp.h:133 [inline] _rawspinlock+0x48/0x60 kernel/locking/spinlock.c:154 spinlock include/linux/spinlock.h:351 [inline] pppchannelbridgeinput drivers/net/ppp/pppgeneric.c:2272 [inline] pppinput+0x16c/0x854 drivers/net/ppp/pppgeneric.c:2304 pppoercvcore+0xfc/0x314 drivers/net/ppp/pppoe.c:379 skbacklogrcv include/net/sock.h:1111 [inline] _releasesock+0x1a8/0x3d8 net/core/sock.c:3004 releasesock+0x68/0x1b8 net/core/sock.c:3558 pppoesendmsg+0xc8/0x5d8 drivers/net/ppp/pppoe.c:903 socksendmsgnosec net/socket.c:730 [inline] _socksendmsg net/socket.c:745 [inline] _syssendto+0x374/0x4f4 net/socket.c:2204 _dosyssendto net/socket.c:2216 [inline] _sesyssendto net/socket.c:2212 [inline] _arm64syssendto+0xd8/0xf8 net/socket.c:2212 _invokesyscall arch/arm64/kernel/syscall.c:35 [inline] invokesyscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49 el0svccommon+0x130/0x23c arch/arm64/kernel/syscall.c:132 doel0svc+0x48/0x58 arch/arm64/kernel/syscall.c:151 el0svc+0x54/0x168 arch/arm64/kernel/entry-common.c:712 el0t64synchandler+0x84/0xfc arch/arm64/kernel/entry-common.c:730 el0t64sync+0x190/0x194 arch/arm64/kernel/entry.S:598 irq event stamp: 282914 hardirqs last enabled at (282914): [<ffff80008b42e30c>] _rawspinunlockirqrestore include/linux/spinlockapismp.h:151 [inline] hardirqs last enabled at (282914): [<ffff80008b42e30c>] rawspinunlockirqrestore+0x38/0x98 kernel/locking/spinlock.c:194 hardirqs last disabled at (282913): [<ffff80008b42e13c>] 
_rawspinlockirqsave include/linux/spinlockapismp.h:108 [inline] hardirqs last disabled at (282913): [<ffff80008b42e13c>] rawspinlockirqsave+0x2c/0x7c kernel/locking/spinlock.c:162 softirqs last enabled at (282904): [<ffff8000801f8e88>] softirqhandleend kernel/softirq.c:400 [inline] softirqs last enabled at (282904): [<ffff8000801f8e88>] handlesoftirqs+0xa3c/0xbfc kernel/softirq.c:582 softirqs last disabled at (282909): [<ffff8000801fbdf8>] runksoftirqd+0x70/0x158 kernel/softirq.c:928
other info that might help us debug this: Possible unsafe locking scenario:
CPU0
----
lock(&pch->downl); <Interrupt> lock(&pch->downl);
* DEADLOCK *
1 lock held by ksoftirqd/1/24: #0: ffff80008f74dfa0 (rcureadlock){....}-{1:2}, at: rculockacquire+0x10/0x4c include/linux/rcupdate.h:325
stack backtrace: CPU: 1 UID: 0 PID: 24 Comm: ksoftirqd/1 Not tainted 6.11.0-rc7-syzkaller-g5f5673607153 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024 Call trace: dumpbacktrace+0x1b8/0x1e4 arch/arm64/kernel/stacktrace.c:319 showstack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:326 _dumpsta ---truncated---(CVE-2024-49946)
In the Linux kernel, the following vulnerability has been resolved:
Bluetooth: MGMT: Fix possible crash on mgmt_index_removed
If mgmt_index_removed is called while there are commands queued on cmd_sync, it could lead to crashes like the trace below:
0x0000053D: __list_del_entry_valid_or_report+0x98/0xdc
0x0000053D: mgmt_pending_remove+0x18/0x58 [bluetooth]
0x0000053E: mgmt_remove_adv_monitor_complete+0x80/0x108 [bluetooth]
0x0000053E: hci_cmd_sync_work+0xbc/0x164 [bluetooth]
So while handling mgmt_index_removed this attempts to dequeue commands passed as user_data to cmd_sync.(CVE-2024-49951)
In the Linux kernel, the following vulnerability has been resolved:
net/mlx5e: Fix crash caused by calling __xfrm_state_delete() twice
The km.state is not checked in the driver's delayed work. When xfrm_state_check_expire() is called, the state can be reset to XFRM_STATE_EXPIRED even if it is already XFRM_STATE_DEAD. This happens when the xfrm state has been deleted but not yet freed. As __xfrm_state_delete() is then called again in the xfrm timer, the following crash occurs.
To fix this issue, skip xfrm_state_check_expire() if km.state is not XFRM_STATE_VALID.
Oops: general protection fault, probably for non-canonical address 0xdead000000000108: 0000 [#1] SMP CPU: 5 UID: 0 PID: 7448 Comm: kworker/u102:2 Not tainted 6.11.0-rc2+ #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Workqueue: mlx5eipsec: eth%d mlx5eipsechandleswlimits [mlx5core] RIP: 0010:_xfrmstatedelete+0x3d/0x1b0 Code: 0f 84 8b 01 00 00 48 89 fd c6 87 c8 00 00 00 05 48 8d bb 40 10 00 00 e8 11 04 1a 00 48 8b 95 b8 00 00 00 48 8b 85 c0 00 00 00 <48> 89 42 08 48 89 10 48 8b 55 10 48 b8 00 01 00 00 00 00 ad de 48 RSP: 0018:ffff88885f945ec8 EFLAGS: 00010246 RAX: dead000000000122 RBX: ffffffff82afa940 RCX: 0000000000000036 RDX: dead000000000100 RSI: 0000000000000000 RDI: ffffffff82afb980 RBP: ffff888109a20340 R08: ffff88885f945ea0 R09: 0000000000000000 R10: 0000000000000000 R11: ffff88885f945ff8 R12: 0000000000000246 R13: ffff888109a20340 R14: ffff88885f95f420 R15: ffff88885f95f400 FS: 0000000000000000(0000) GS:ffff88885f940000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f2163102430 CR3: 00000001128d6001 CR4: 0000000000370eb0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <IRQ> ? dieaddr+0x33/0x90 ? excgeneralprotection+0x1a2/0x390 ? asmexcgeneralprotection+0x22/0x30 ? _xfrmstatedelete+0x3d/0x1b0 ? _xfrmstatedelete+0x2f/0x1b0 xfrmtimerhandler+0x174/0x350 ? _xfrmstatedelete+0x1b0/0x1b0 _hrtimerrunqueues+0x121/0x270 hrtimerrunsoftirq+0x88/0xd0 handlesoftirqs+0xcc/0x270 dosoftirq+0x3c/0x50 </IRQ> <TASK> _localbhenableip+0x47/0x50 mlx5eipsechandleswlimits+0x7d/0x90 [mlx5core] processonework+0x137/0x2d0 workerthread+0x28d/0x3a0 ? rescuerthread+0x480/0x480 kthread+0xb8/0xe0 ? kthreadpark+0x80/0x80 retfromfork+0x2d/0x50 ? kthreadpark+0x80/0x80 retfromfork_asm+0x11/0x20 </TASK>(CVE-2024-49953)
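The state guard described in this fix can be sketched as follows. The enum, struct and function names are hypothetical simplifications and not the mlx5e or xfrm definitions.

```c
/* Hypothetical, simplified model of an xfrm state's lifecycle. */
enum sa_state_sketch { SA_VALID, SA_EXPIRED, SA_DEAD };

struct sa_sketch {
    enum sa_state_sketch state;
    int hard_limit_hit;   /* non-zero once the packet limit is reached */
};

/* Models the delayed work: returns 1 if the expiry check ran,
 * 0 if it was skipped because the state is no longer valid. */
int handle_sw_limits_sketch(struct sa_sketch *x)
{
    if (x->state != SA_VALID)
        return 0;         /* never resurrect a DEAD state as EXPIRED */

    if (x->hard_limit_hit)
        x->state = SA_EXPIRED;
    return 1;
}
```

Without the guard, a state that is already dead could be flipped back to expired, and the timer path would then run the delete logic a second time.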
In the Linux kernel, the following vulnerability has been resolved:
bpftool: Fix undefined behavior in qsort(NULL, 0, ...)
When netfilter has no entry to display, qsort is called with qsort(NULL, 0, ...). This results in undefined behavior, as UBSan reports:
net.c:827:2: runtime error: null pointer passed as argument 1, which is declared to never be null
Although the C standard does not explicitly state whether calling qsort with a NULL pointer when the size is 0 constitutes undefined behavior, Section 7.1.4 of the C standard (Use of library functions) mentions:
"Each of the following statements applies unless explicitly stated otherwise in the detailed descriptions that follow: If an argument to a function has an invalid value (such as a value outside the domain of the function, or a pointer outside the address space of the program, or a null pointer, or a pointer to non-modifiable storage when the corresponding parameter is not const-qualified) or a type (after promotion) not expected by a function with variable number of arguments, the behavior is undefined."
To avoid this, add an early return when nf_link_info is NULL to prevent calling qsort with a NULL pointer.(CVE-2024-49987)
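A minimal sketch of the early-return pattern follows. It sorts plain integers rather than bpftool's netfilter link entries, and `sort_ints_sketch()` and `cmp_int()` are illustrative names. The extra check for a zero count is an assumption of this sketch; the description above only mentions the NULL check.

```c
#include <stdlib.h>

/* Three-way comparison that avoids overflow from subtraction. */
int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;

    return (x > y) - (x < y);
}

/* Sorts 'count' integers in place; a NULL or empty array is a no-op,
 * so qsort() is never handed a null pointer. */
void sort_ints_sketch(int *items, size_t count)
{
    if (items == NULL || count == 0)
        return;

    qsort(items, count, sizeof(*items), cmp_int);
}
```

Calling `sort_ints_sketch(NULL, 0)` returns immediately instead of reaching `qsort()`, which is what keeps UBSan quiet.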
In the Linux kernel, the following vulnerability has been resolved:
ksmbd: add refcnt to ksmbd_conn struct
When sending an oplock break request, opinfo->conn is used, but an already freed ->conn can be used on multichannel. This patch adds a reference count to the ksmbd_conn struct so that it is freed only when it is no longer used.(CVE-2024-49988)
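The idea of keeping an object alive with a reference count can be sketched in user-space C with C11 atomics. This is not the ksmbd implementation; `struct conn_sketch` and the `conn_*` helpers are hypothetical.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical connection object guarded by a reference count. */
struct conn_sketch {
    atomic_int refcnt;
};

/* Allocates a connection holding one reference for the creator. */
struct conn_sketch *conn_alloc(void)
{
    struct conn_sketch *c = malloc(sizeof(*c));

    if (c != NULL)
        atomic_init(&c->refcnt, 1);
    return c;
}

/* Takes an extra reference, e.g. before queuing an oplock break. */
void conn_get(struct conn_sketch *c)
{
    atomic_fetch_add(&c->refcnt, 1);
}

/* Drops a reference; frees the object and returns 1 on the last put. */
int conn_put(struct conn_sketch *c)
{
    if (atomic_fetch_sub(&c->refcnt, 1) == 1) {
        free(c);
        return 1;
    }
    return 0;
}
```

Any path that stores a pointer to the connection takes a reference first, so a teardown on one channel cannot free the object while another channel still uses it.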
In the Linux kernel, the following vulnerability has been resolved:
net: dsa: improve shutdown sequence
Alexander Sverdlin presents 2 problems during shutdown with the lan9303 driver. One is specific to lan9303 and the other just happens to reproduce there.
The first problem is that lan9303 is unique among DSA drivers in that it calls dev_get_drvdata() at "arbitrary runtime" (not probe, not shutdown, not remove):
phy_state_machine() -> ... -> dsa_user_phy_read() -> ds->ops->phy_read() -> lan9303_phy_read() -> chip->ops->phy_read() -> lan9303_mdio_phy_read() -> dev_get_drvdata()
But we never stop the phy_state_machine(), so it may continue to run after dsa_switch_shutdown(). Our common pattern in all DSA drivers is to set drvdata to NULL to suppress the remove() method that may come afterwards. But in this case it will result in a NULL pointer dereference (NPD).
The second problem is that the way in which we set dp->conduit->dsa_ptr = NULL; is concurrent with receive packet processing. dsa_switch_rcv() checks once whether dev->dsa_ptr is NULL, but afterwards, rather than continuing to use that non-NULL value, dev->dsa_ptr is dereferenced again and again without NULL checks: dsa_conduit_find_user() and many other places. In between dereferences, there is no locking to ensure that what was valid once continues to be valid.
Both problems have the common aspect that closing the conduit interface solves them.
In the first case, dev_close(conduit) triggers the NETDEV_GOING_DOWN event in dsa_user_netdevice_event() which closes user ports as well. dsa_port_disable_rt() calls phylink_stop(), which synchronously stops the phylink state machine, and ds->ops->phy_read() will thus no longer call into the driver after this point.
In the second case, dev_close(conduit) should do this, as per Documentation/networking/driver.rst:
| Quiescence
| ----------
|
| After the ndo_stop routine has been called, the hardware must
| not receive or transmit any data. All in flight packets must
| be aborted. If necessary, poll or wait for completion of
| any reset commands.
So it should be sufficient to ensure that later, when we zeroize conduit->dsa_ptr, there will be no concurrent dsa_switch_rcv() call on this conduit.
The addition of the netif_device_detach() function is to ensure that ioctls, rtnetlinks and ethtool requests on the user ports no longer propagate down to the driver - we're no longer prepared to handle them.
The race condition actually did not exist when commit 0650bf52b31f ("net: dsa: be compatible with masters which unregister on shutdown") first introduced dsa_switch_shutdown(). It was created later, when we stopped unregistering the user interfaces from a bad spot, and we just replaced that sequence with a racy zeroization of conduit->dsa_ptr (one which doesn't ensure that the interfaces aren't up).(CVE-2024-49998)
In the Linux kernel, the following vulnerability has been resolved:
ppp: fix ppp_async_encode() illegal access
syzbot reported an issue in ppp_async_encode() [1]
In this case, pppoe_sendmsg() is called with a zero size. Then ppp_async_encode() is called with an empty skb.
BUG: KMSAN: uninit-value in pppasyncencode drivers/net/ppp/pppasync.c:545 [inline] BUG: KMSAN: uninit-value in pppasyncpush+0xb4f/0x2660 drivers/net/ppp/pppasync.c:675 pppasyncencode drivers/net/ppp/pppasync.c:545 [inline] pppasyncpush+0xb4f/0x2660 drivers/net/ppp/pppasync.c:675 pppasyncsend+0x130/0x1b0 drivers/net/ppp/pppasync.c:634 pppchannelbridgeinput drivers/net/ppp/pppgeneric.c:2280 [inline] pppinput+0x1f1/0xe60 drivers/net/ppp/pppgeneric.c:2304 pppoercvcore+0x1d3/0x720 drivers/net/ppp/pppoe.c:379 skbacklogrcv+0x13b/0x420 include/net/sock.h:1113 releasesock+0x1da/0x330 net/core/sock.c:3072 releasesock+0x6b/0x250 net/core/sock.c:3626 pppoesendmsg+0x2b8/0xb90 drivers/net/ppp/pppoe.c:903 socksendmsgnosec net/socket.c:729 [inline] _socksendmsg+0x30f/0x380 net/socket.c:744 syssendmsg+0x903/0xb60 net/socket.c:2602 _syssendmsg+0x28d/0x3c0 net/socket.c:2656 _syssendmmsg+0x3c1/0x960 net/socket.c:2742 _dosyssendmmsg net/socket.c:2771 [inline] _sesyssendmmsg net/socket.c:2768 [inline] _x64syssendmmsg+0xbc/0x120 net/socket.c:2768 x64syscall+0xb6e/0x3ba0 arch/x86/include/generated/asm/syscalls64.h:308 dosyscallx64 arch/x86/entry/common.c:52 [inline] dosyscall64+0xcd/0x1e0 arch/x86/entry/common.c:83 entrySYSCALL64afterhwframe+0x77/0x7f
Uninit was created at: slabpostallochook mm/slub.c:4092 [inline] slaballocnode mm/slub.c:4135 [inline] kmemcacheallocnodenoprof+0x6bf/0xb80 mm/slub.c:4187 kmallocreserve+0x13d/0x4a0 net/core/skbuff.c:587 allocskb+0x363/0x7b0 net/core/skbuff.c:678 allocskb include/linux/skbuff.h:1322 [inline] sockwmalloc+0xfe/0x1a0 net/core/sock.c:2732 pppoesendmsg+0x3a7/0xb90 drivers/net/ppp/pppoe.c:867 socksendmsgnosec net/socket.c:729 [inline] _socksendmsg+0x30f/0x380 net/socket.c:744 syssendmsg+0x903/0xb60 net/socket.c:2602 _syssendmsg+0x28d/0x3c0 net/socket.c:2656 _syssendmmsg+0x3c1/0x960 net/socket.c:2742 _dosyssendmmsg net/socket.c:2771 [inline] _sesyssendmmsg net/socket.c:2768 [inline] _x64syssendmmsg+0xbc/0x120 net/socket.c:2768 x64syscall+0xb6e/0x3ba0 arch/x86/include/generated/asm/syscalls64.h:308 dosyscallx64 arch/x86/entry/common.c:52 [inline] dosyscall64+0xcd/0x1e0 arch/x86/entry/common.c:83 entrySYSCALL64afterhwframe+0x77/0x7f
CPU: 1 UID: 0 PID: 5411 Comm: syz.1.14 Not tainted 6.12.0-rc1-syzkaller-00165-g360c1f1f24c6 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024(CVE-2024-50035)
In the Linux kernel, the following vulnerability has been resolved:
Bluetooth: ISO: Fix multiple init when debugfs is disabled
If bt_debugfs is not created successfully, which happens if either CONFIG_DEBUG_FS or CONFIG_DEBUG_FS_ALLOW_ALL is unset, then iso_init() returns early and does not set iso_inited to true. This means that a subsequent call to iso_init() will result in duplicate calls to proto_register(), bt_sock_register(), etc.
With CONFIG_LIST_HARDENED and CONFIG_BUG_ON_DATA_CORRUPTION enabled, the duplicate call to proto_register() triggers this BUG():
listadd double add: new=ffffffffc0b280d0, prev=ffffffffbab56250, next=ffffffffc0b280d0. ------------[ cut here ]------------ kernel BUG at lib/listdebug.c:35! Oops: invalid opcode: 0000 [#1] PREEMPT SMP PTI CPU: 2 PID: 887 Comm: bluetoothd Not tainted 6.10.11-1-ao-desktop #1 RIP: 0010:_listaddvalidorreport+0x9a/0xa0 ... _listaddvalidorreport+0x9a/0xa0 protoregister+0x2b5/0x340 isoinit+0x23/0x150 [bluetooth] setisosocketfunc+0x68/0x1b0 [bluetooth] kmemcachefree+0x308/0x330 hcisocksendmsg+0x990/0x9e0 [bluetooth] _socksendmsg+0x7b/0x80 sockwriteiter+0x9a/0x110 doiterreadvwritev+0x11d/0x220 vfswritev+0x180/0x3e0 dowritev+0xca/0x100 ...
This change removes the early return. The check for iso_debugfs being NULL was unnecessary; it is always NULL when iso_inited is false.(CVE-2024-50077)
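The corrected control flow can be sketched as an init function that records completion whether or not the optional debugfs step succeeds. All names here are hypothetical, and the `-EALREADY` return value for a repeated call is an assumption of this sketch.

```c
#include <errno.h>
#include <stdbool.h>

bool inited_sketch;       /* set once the mandatory setup has run */
int registrations_sketch; /* counts calls to the "register" step */

/* Returns 0 on first use and -EALREADY on any repeated call. */
int iso_init_sketch(bool debugfs_available)
{
    if (inited_sketch)
        return -EALREADY;

    registrations_sketch++;   /* stands in for protocol registration */

    if (debugfs_available) {
        /* optional: create a debugfs entry; failure is not fatal */
    }

    /* Mark as initialised on every successful path, not only when
     * the debugfs entry was created. */
    inited_sketch = true;
    return 0;
}
```

Because the flag is set unconditionally, a second call cannot reach the registration step again, even on kernels built without debugfs.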
In the Linux kernel, the following vulnerability has been resolved:
nouveau/dmem: Fix vulnerability in migrate_to_ram upon copy error
The nouveau_dmem_copy_one function ensures that the copy push command is sent to the device firmware but does not track whether it was executed successfully.
In the case of a copy error (e.g., firmware or hardware failure), the copy push command will be sent via the firmware channel, and nouveau_dmem_copy_one will likely report success, leading to the migrate_to_ram function returning a dirty HIGH_USER page to the user.
This can result in a security vulnerability, as a HIGH_USER page that may contain sensitive or corrupted data could be returned to the user.
To prevent this vulnerability, we allocate a zero page. Thus, in case of an error, a non-dirty (zero) page will be returned to the user.(CVE-2024-50096)
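The effect of allocating a zeroed destination can be sketched in user-space C. This is an analogy using `calloc()`, not the nouveau code; `migrate_page_sketch()` and the `copy_fails` flag, which simulates a copy that silently does nothing, are hypothetical.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SIZE_SKETCH 4096

/* Returns a newly allocated "page". The destination is zero-filled
 * before the copy, so a copy that silently fails leaves zeros rather
 * than stale data. The caller frees the result. */
unsigned char *migrate_page_sketch(const unsigned char *src, bool copy_fails)
{
    unsigned char *dst = calloc(1, PAGE_SIZE_SKETCH);

    if (dst == NULL)
        return NULL;

    if (!copy_fails)
        memcpy(dst, src, PAGE_SIZE_SKETCH);

    return dst;
}
```

The caller still cannot tell that the copy failed, but what reaches the user is a page of zeros rather than whatever the allocator left behind.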
In the Linux kernel, the following vulnerability has been resolved:
xfrm: fix one more kernel-infoleak in algo dumping
During fuzz testing, the following issue was discovered:
BUG: KMSAN: kernel-infoleak in copytoiter+0x598/0x2a30 _copytoiter+0x598/0x2a30 _skbdatagramiter+0x168/0x1060 skbcopydatagramiter+0x5b/0x220 netlinkrecvmsg+0x362/0x1700 sockrecvmsg+0x2dc/0x390 _sysrecvfrom+0x381/0x6d0 _x64sysrecvfrom+0x130/0x200 x64syscall+0x32c8/0x3cc0 dosyscall64+0xd8/0x1c0 entrySYSCALL64afterhwframe+0x79/0x81
Uninit was stored to memory at: copytouserstateextra+0xcc1/0x1e00 dumponestate+0x28c/0x5f0 xfrmstatewalk+0x548/0x11e0 xfrmdumpsa+0x1e0/0x840 netlinkdump+0x943/0x1c40 netlinkdumpstart+0x746/0xdb0 xfrmuserrcvmsg+0x429/0xc00 netlinkrcvskb+0x613/0x780 xfrmnetlinkrcv+0x77/0xc0 netlinkunicast+0xe90/0x1280 netlinksendmsg+0x126d/0x1490 _socksendmsg+0x332/0x3d0 syssendmsg+0x863/0xc30 _syssendmsg+0x285/0x3e0 _x64syssendmsg+0x2d6/0x560 x64syscall+0x1316/0x3cc0 dosyscall64+0xd8/0x1c0 entrySYSCALL64after_hwframe+0x79/0x81
Uninit was created at: kmalloc+0x571/0xd30 attachauth+0x106/0x3e0 xfrmaddsa+0x2aa0/0x4230 xfrmuserrcvmsg+0x832/0xc00 netlinkrcvskb+0x613/0x780 xfrmnetlinkrcv+0x77/0xc0 netlinkunicast+0xe90/0x1280 netlinksendmsg+0x126d/0x1490 _socksendmsg+0x332/0x3d0 syssendmsg+0x863/0xc30 _syssendmsg+0x285/0x3e0 _x64syssendmsg+0x2d6/0x560 x64syscall+0x1316/0x3cc0 dosyscall64+0xd8/0x1c0 entrySYSCALL64after_hwframe+0x79/0x81
Bytes 328-379 of 732 are uninitialized Memory access of size 732 starts at ffff88800e18e000 Data copied to user address 00007ff30f48aff0
CPU: 2 PID: 18167 Comm: syz-executor.0 Not tainted 6.8.11 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
Fixes copying of xfrm algorithms where some random data of the structure fields can end up in userspace. Padding in structures may be filled with random (possibly sensitive) data and should never be given directly to user-space.
A similar issue was resolved in the commit 8222d5910dae ("xfrm: Zero padding when dumping algos and encap")
Found by Linux Verification Center (linuxtesting.org) with Syzkaller.(CVE-2024-50110)
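The general remedy, clearing a structure before filling the fields that will be copied out, can be sketched like this. `struct algo_sketch` and `fill_algo_sketch()` are hypothetical and much smaller than the real xfrm algorithm structures.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record that would be copied to user space. */
struct algo_sketch {
    char name[8];         /* NUL-terminated algorithm name */
    unsigned int key_len;
};

/* Fills 'out' so that every byte, including unused name bytes and any
 * compiler-inserted padding, has a defined value of zero. */
void fill_algo_sketch(struct algo_sketch *out, const char *name,
                      unsigned int key_len)
{
    memset(out, 0, sizeof(*out));
    strncpy(out->name, name, sizeof(out->name) - 1);
    out->key_len = key_len;
}
```

Copying only the bytes up to the terminating NUL would leave the rest of the name buffer holding whatever the allocator returned, which is the kind of leak KMSAN reported above.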
In the Linux kernel, the following vulnerability has been resolved:
LoongArch: Enable IRQ if do_ale() triggered in irq-enabled context
An unaligned access exception can be triggered in an irq-enabled context such as user mode. In this case do_ale() may call get_user(), which may sleep. Then we will get:
BUG: sleeping function called from invalid context at arch/loongarch/kernel/access-helper.h:7 inatomic(): 0, irqsdisabled(): 1, nonblock: 0, pid: 129, name: modprobe preemptcount: 0, expected: 0 RCU nest depth: 0, expected: 0 CPU: 0 UID: 0 PID: 129 Comm: modprobe Tainted: G W 6.12.0-rc1+ #1723 Tainted: [W]=WARN Stack : 9000000105e0bd48 0000000000000000 9000000003803944 9000000105e08000 9000000105e0bc70 9000000105e0bc78 0000000000000000 0000000000000000 9000000105e0bc78 0000000000000001 9000000185e0ba07 9000000105e0b890 ffffffffffffffff 9000000105e0bc78 73924b81763be05b 9000000100194500 000000000000020c 000000000000000a 0000000000000000 0000000000000003 00000000000023f0 00000000000e1401 00000000072f8000 0000007ffbb0e260 0000000000000000 0000000000000000 9000000005437650 90000000055d5000 0000000000000000 0000000000000003 0000007ffbb0e1f0 0000000000000000 0000005567b00490 0000000000000000 9000000003803964 0000007ffbb0dfec 00000000000000b0 0000000000000007 0000000000000003 0000000000071c1d ... Call Trace: [<9000000003803964>] showstack+0x64/0x1a0 [<9000000004c57464>] dumpstacklvl+0x74/0xb0 [<9000000003861ab4>] _mightresched+0x154/0x1a0 [<900000000380c96c>] emulateloadstoreinsn+0x6c/0xf60 [<9000000004c58118>] doale+0x78/0x180 [<9000000003801bc8>] handleale+0x128/0x1e0
So enable IRQ if unaligned access exception is triggered in irq-enabled context to fix it.(CVE-2024-50111)
In the Linux kernel, the following vulnerability has been resolved:
net/mlx5: Unregister notifier on eswitch init failure
It otherwise remains registered and a subsequent attempt at eswitch enabling might trigger warnings of the sort:
[ 682.589148] ------------[ cut here ]------------ [ 682.590204] notifier callback eswitchvportevent [mlx5core] already registered [ 682.590256] WARNING: CPU: 13 PID: 2660 at kernel/notifier.c:31 notifierchainregister+0x3e/0x90 [...snipped] [ 682.610052] Call Trace: [ 682.610369] <TASK> [ 682.610663] ? _warn+0x7c/0x110 [ 682.611050] ? notifierchainregister+0x3e/0x90 [ 682.611556] ? reportbug+0x148/0x170 [ 682.611977] ? handlebug+0x36/0x70 [ 682.612384] ? excinvalidop+0x13/0x60 [ 682.612817] ? asmexcinvalidop+0x16/0x20 [ 682.613284] ? notifierchainregister+0x3e/0x90 [ 682.613789] atomicnotifierchainregister+0x25/0x40 [ 682.614322] mlx5eswitchenablelocked+0x1d4/0x3b0 [mlx5core] [ 682.614965] mlx5eswitchenable+0xc9/0x100 [mlx5core] [ 682.615551] mlx5deviceenablesriov+0x25/0x340 [mlx5core] [ 682.616170] mlx5coresriovconfigure+0x50/0x170 [mlx5core] [ 682.616789] sriovnumvfsstore+0xb0/0x1b0 [ 682.617248] kernfsfopwriteiter+0x117/0x1a0 [ 682.617734] vfswrite+0x231/0x3f0 [ 682.618138] ksyswrite+0x63/0xe0 [ 682.618536] dosyscall64+0x4c/0x100 [ 682.618958] entrySYSCALL64afterhwframe+0x4b/0x53(CVE-2024-50136)
In the Linux kernel, the following vulnerability has been resolved:
net/mlx5: Fix command bitmask initialization
The command bitmask has a dedicated bit for the MANAGE_PAGES command. This bit isn't initialized during command bitmask initialization, only during MANAGE_PAGES.
In addition, mlx5_cmd_trigger_completions() tries to trigger completion for the MANAGE_PAGES command as well.
Hence, if a health error occurs before any MANAGE_PAGES command has been invoked (for example, during mlx5_enable_hca()), mlx5_cmd_trigger_completions() will try to trigger completion for the MANAGE_PAGES command, which results in a null-ptr-deref error.[1]
Fix it by initializing the command bitmask correctly.
While at it, re-write the code for better understanding.
[1] BUG: KASAN: null-ptr-deref in mlx5cmdtriggercompletions+0x1db/0x600 [mlx5core] Write of size 4 at addr 0000000000000214 by task kworker/u96:2/12078 CPU: 10 PID: 12078 Comm: kworker/u96:2 Not tainted 6.9.0-rc2forupstreamdebug202404071901 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Workqueue: mlx5health0000:08:00.0 mlx5fwfatalreportererrwork [mlx5core] Call Trace: <TASK> dumpstacklvl+0x7e/0xc0 kasanreport+0xb9/0xf0 kasancheckrange+0xec/0x190 mlx5cmdtriggercompletions+0x1db/0x600 [mlx5core] mlx5cmdflush+0x94/0x240 [mlx5core] entererrorstate+0x6c/0xd0 [mlx5core] mlx5fwfatalreportererrwork+0xf3/0x480 [mlx5core] processonework+0x787/0x1490 ? lockdephardirqsonprepare+0x400/0x400 ? pwqdecnrinflight+0xda0/0xda0 ? assignwork+0x168/0x240 workerthread+0x586/0xd30 ? rescuerthread+0xae0/0xae0 kthread+0x2df/0x3b0 ? kthreadcompleteandexit+0x20/0x20 retfromfork+0x2d/0x70 ? kthreadcompleteandexit+0x20/0x20 retfromfork_asm+0x11/0x20 </TASK>(CVE-2024-50147)
In the Linux kernel, the following vulnerability has been resolved:
ALSA: hda/cs8409: Fix possible NULL dereference
If snd_hda_gen_add_kctl fails to allocate memory and returns NULL, a NULL pointer dereference will occur in the next line.
Since the dolphin_fixups function is a hda_fixup function, which is not supposed to return any errors, add a simple check before the dereference and ignore the failure.
Found by Linux Verification Center (linuxtesting.org) with SVACE.(CVE-2024-50160)
In the Linux kernel, the following vulnerability has been resolved:
media: qcom: camss: Remove use_count guard in stop_streaming
The use_count check was introduced so that multiple concurrent Raw Data Interfaces (RDIs) could be driven by different virtual channels (VCs) on the CSIPHY input driving the video pipeline.
This is an invalid use of use_count though, as use_count pertains to the number of times a video entity has been opened by user-space, not the number of active streams.
If use_count and stream-on count don't agree then stop_streaming() will break, as is currently the case and has become apparent when using CAMSS with libcamera's released softisp 0.3.
The use of use_count like this is a bit hacky and right now breaks regular usage of CAMSS for a single stream case. Stopping qcam results in the splat below, and then it cannot be started again and any attempts to do so fail with -EBUSY.
[ 1265.509831] WARNING: CPU: 5 PID: 919 at drivers/media/common/videobuf2/videobuf2-core.c:2183 _vb2queuecancel+0x230/0x2c8 [videobuf2common] ... [ 1265.510630] Call trace: [ 1265.510636] _vb2queuecancel+0x230/0x2c8 [videobuf2common] [ 1265.510648] vb2corestreamoff+0x24/0xcc [videobuf2common] [ 1265.510660] vb2ioctlstreamoff+0x5c/0xa8 [videobuf2v4l2] [ 1265.510673] v4lstreamoff+0x24/0x30 [videodev] [ 1265.510707] _videodoioctl+0x190/0x3f4 [videodev] [ 1265.510732] videousercopy+0x304/0x8c4 [videodev] [ 1265.510757] videoioctl2+0x18/0x34 [videodev] [ 1265.510782] v4l2ioctl+0x40/0x60 [videodev] ... [ 1265.510944] videobuf2common: driver bug: stopstreaming operation is leaving buffer 0 in active state [ 1265.511175] videobuf2common: driver bug: stopstreaming operation is leaving buffer 1 in active state [ 1265.511398] videobuf2common: driver bug: stop_streaming operation is leaving buffer 2 in active st
One CAMSS specific way to handle multiple VCs on the same RDI might be:
Either way, refusing to release video buffers based on use_count is erroneous and should be reverted. The silicon enabling code for selecting VCs is perfectly fine. It's a "known missing feature" that concurrent VCs won't work with CAMSS right now.
Initial testing with this code didn't show an error, but SoftISP and "real" usage with Google Hangouts break the upstream code pretty quickly. We need to do a partial revert and take another pass at VCs.
This commit partially reverts commit 89013969e232 ("media: camss: sm8250: Pipeline starting and stopping for multiple virtual channels")(CVE-2024-50175)
In the Linux kernel, the following vulnerability has been resolved:
remoteproc: k3-r5: Fix error handling when power-up failed
By simply bailing out, the driver was violating its rule and internal assumptions that either both or no rproc should be initialized. E.g., this could cause the first core to be available but not the second one, leading to crashes on its shutdown later on while trying to dereference that second instance.(CVE-2024-50176)
In the Linux kernel, the following vulnerability has been resolved:
clk: imx: Remove CLK_SET_PARENT_GATE for DRAM mux for i.MX7D
For the i.MX7D DRAM-related mux clock, the clock source change should ONLY be done in low-level asm code without accessing DRAM, followed by a call to the clk API to sync the HW clock status with the clk tree. The real clock source switch should never be touched via the clk API, so the CLK_SET_PARENT_GATE flag should NOT be added; otherwise, DRAM's clock parent will be disabled while DRAM is active, and the system will hang.(CVE-2024-50181)
In the Linux kernel, the following vulnerability has been resolved:
scsi: lpfc: Ensure DA_ID handling completion before deleting an NPIV instance
Deleting an NPIV instance requires all fabric ndlps to be released before an NPIV's resources can be torn down. Failure to release fabric ndlps beforehand opens kref imbalance race conditions. Fix by forcing the DA_ID to complete synchronously with usage of wait_queue.(CVE-2024-50183)
In the Linux kernel, the following vulnerability has been resolved:
HID: amd_sfh: Switch to device-managed dmam_alloc_coherent()
Using the device-managed version simplifies clean-up in the probe() error path.
Additionally, the device-managed version ensures proper cleanup, which helps to resolve memory errors, page faults, btrfs going read-only, and btrfs disk corruption.(CVE-2024-50189)
In the Linux kernel, the following vulnerability has been resolved:
fork: do not invoke uffd on fork if error occurs
Patch series "fork: do not expose incomplete mm on fork".
During fork we may place the virtual memory address space into an inconsistent state before the fork operation is complete.
In addition, we may encounter an error during the fork operation that indicates that the virtual memory address space is invalidated.
As a result, we should not be exposing it in any way to external machinery that might interact with the mm or VMAs, machinery that is not designed to deal with incomplete state.
We specifically update the fork logic to defer khugepaged and ksm to the end of the operation and only to be invoked if no error arose, and disallow uffd from observing fork events should an error have occurred.
This patch (of 2):
Currently on fork we expose the virtual address space of a process to userland unconditionally if uffd is registered in VMAs, regardless of whether an error arose in the fork.
This is performed in dup_userfaultfd_complete(), which is invoked unconditionally and performs two duties: invoking registered handlers for the UFFD_EVENT_FORK event via dup_fctx(), and clearing down userfaultfd_fork_ctx objects established in dup_userfaultfd().
This is problematic, because the virtual address space may not yet be correctly initialised if an error arose.
The change in commit d24062914837 ("fork: use __mt_dup() to duplicate maple tree in dup_mmap()") makes this more pertinent as we may be in a state where entries in the maple tree are not yet consistent.
We address this by, on fork error, ensuring that we roll back state that we would otherwise expect to clean up through the event being handled by userland and perform the memory freeing duty otherwise performed by dup_userfaultfd_complete().
We do this by implementing a new function, dup_userfaultfd_fail(), which performs the same loop, only decrementing reference counts.
Note that we perform mmgrab() on the parent and child mm's, however userfaultfd_ctx_put() will mmdrop() this once the reference count drops to zero, so we will avoid memory leaks correctly here.(CVE-2024-50220)
In the Linux kernel, the following vulnerability has been resolved:
drm/amd/pm: Vangogh: Fix kernel memory out of bounds write
KASAN reports that the GPU metrics table allocated in vangogh_tables_init() is not large enough for the memset done in smu_cmn_init_soft_gpu_metrics(). Condensed report follows:
[ 33.861314] BUG: KASAN: slab-out-of-bounds in smucmninitsoftgpumetrics+0x73/0x200 [amdgpu] [ 33.861799] Write of size 168 at addr ffff888129f59500 by task mangoapp/1067 ... [ 33.861808] CPU: 6 UID: 1000 PID: 1067 Comm: mangoapp Tainted: G W 6.12.0-rc4 #356 1a56f59a8b5182eeaf67eb7cb8b13594dd23b544 [ 33.861816] Tainted: [W]=WARN [ 33.861818] Hardware name: Valve Galileo/Galileo, BIOS F7G0107 12/01/2023 [ 33.861822] Call Trace: [ 33.861826] <TASK> [ 33.861829] dumpstacklvl+0x66/0x90 [ 33.861838] printreport+0xce/0x620 [ 33.861853] kasanreport+0xda/0x110 [ 33.862794] kasancheckrange+0xfd/0x1a0 [ 33.862799] _asanmemset+0x23/0x40 [ 33.862803] smucmninitsoftgpumetrics+0x73/0x200 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779] [ 33.863306] vangoghgetgpumetricsv24+0x123/0xad0 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779] [ 33.864257] vangoghcommongetgpumetrics+0xb0c/0xbc0 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779] [ 33.865682] amdgpudpmgetgpumetrics+0xcc/0x110 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779] [ 33.866160] amdgpugetgpumetrics+0x154/0x2d0 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779] [ 33.867135] devattrshow+0x43/0xc0 [ 33.867147] sysfskfseqshow+0x1f1/0x3b0 [ 33.867155] seqreaditer+0x3f8/0x1140 [ 33.867173] vfsread+0x76c/0xc50 [ 33.867198] ksysread+0xfb/0x1d0 [ 33.867214] dosyscall64+0x90/0x160 ... [ 33.867353] Allocated by task 378 on cpu 7 at 22.794876s: [ 33.867358] kasansavestack+0x33/0x50 [ 33.867364] kasansavetrack+0x17/0x60 [ 33.867367] _kasankmalloc+0x87/0x90 [ 33.867371] vangoghinitsmctables+0x3f9/0x840 [amdgpu] [ 33.867835] smuswinit+0xa32/0x1850 [amdgpu] [ 33.868299] amdgpudeviceinit+0x467b/0x8d90 [amdgpu] [ 33.868733] amdgpudriverloadkms+0x19/0xf0 [amdgpu] [ 33.869167] amdgpupciprobe+0x2d6/0xcd0 [amdgpu] [ 33.869608] localpciprobe+0xda/0x180 [ 33.869614] pcidevice_probe+0x43f/0x6b0
Empirically we can confirm that the former allocates 152 bytes for the table, while the latter memsets a 168-byte block.
The root cause appears to be that when GPU metrics tables for v2_4 parts were added, the table was not enlarged to fit.
The fix in this patch is rather "brute force" and perhaps later should be done in a smarter way, by extracting and consolidating the part version to size logic to a common helper, instead of brute forcing the largest possible allocation. Nevertheless, for now this works and fixes the out of bounds write.
v2: * Drop impossible v3_0 case. (Mario)
(cherry picked from commit 0880f58f9609f0200483a49429af0f050d281703)(CVE-2024-50221)
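The sizing idea described above can be sketched as a small user-space Python model. This is not kernel code: the 152- and 168-byte figures come from the report, while the layout labels and both helper names are hypothetical and exist only for illustration.

```python
# Hypothetical model: size in bytes of each metrics layout the code may write.
METRICS_SIZES = {
    "previously_allocated_layout": 152,  # size seen in the KASAN allocation record
    "v2_4": 168,                         # size of the memset that overflowed
}


def alloc_metrics_table(sizes):
    """Allocate one buffer large enough for every supported layout."""
    return bytearray(max(sizes.values()))


def init_soft_gpu_metrics(table, size):
    """Zero the first `size` bytes, refusing to write past the allocation."""
    if size > len(table):
        raise ValueError(f"write of {size} bytes exceeds {len(table)}-byte table")
    table[:size] = bytes(size)
    return table
```

Allocating for the largest layout is the "brute force" approach the commit message mentions; a shared version-to-size helper would be the tidier follow-up it suggests.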
In the Linux kernel, the following vulnerability has been resolved:
iio: gts-helper: Fix memory leaks in iio_gts_build_avail_scale_table()
modprobe iio-test-gts and rmmod it, then the following memory leak occurs:
unreferenced object 0xffffff80c810be00 (size 64):
comm "kunit_try_catch", pid 1654, jiffies 4294913981
hex dump (first 32 bytes):
02 00 00 00 08 00 00 00 20 00 00 00 40 00 00 00 ........ ...@...
80 00 00 00 00 02 00 00 00 04 00 00 00 08 00 00 ................
backtrace (crc a63d875e):
[<0000000028c1b3c2>] kmemleak_alloc+0x34/0x40
[<000000001d6ecc87>] __kmalloc_noprof+0x2bc/0x3c0
[<00000000393795c1>] devm_iio_init_iio_gts+0x4b4/0x16f4
[<0000000071bb4b09>] 0xffffffdf052a62e0
[<000000000315bc18>] 0xffffffdf052a6488
[<00000000f9dc55b5>] kunit_try_run_case+0x13c/0x3ac
[<00000000175a3fd4>] kunit_generic_run_threadfn_adapter+0x80/0xec
[<00000000f505065d>] kthread+0x2e8/0x374
[<00000000bbfb0e5d>] ret_from_fork+0x10/0x20
unreferenced object 0xffffff80cbfe9e70 (size 16):
comm "kunit_try_catch", pid 1658, jiffies 4294914015
hex dump (first 16 bytes):
10 00 00 00 40 00 00 00 80 00 00 00 00 00 00 00 ....@...........
backtrace (crc 857f0cb4):
[<0000000028c1b3c2>] kmemleak_alloc+0x34/0x40
[<000000001d6ecc87>] __kmalloc_noprof+0x2bc/0x3c0
[<00000000393795c1>] devm_iio_init_iio_gts+0x4b4/0x16f4
[<0000000071bb4b09>] 0xffffffdf052a62e0
[<000000007d089d45>] 0xffffffdf052a6864
[<00000000f9dc55b5>] kunit_try_run_case+0x13c/0x3ac
[<00000000175a3fd4>] kunit_generic_run_threadfn_adapter+0x80/0xec
[<00000000f505065d>] kthread+0x2e8/0x374
[<00000000bbfb0e5d>] ret_from_fork+0x10/0x20
......
It includes 5*5 times "size 64" memory leaks, which correspond to 5 times __test_init_iio_gain_scale() calls with gts_test_gains size 10 (10*size(int)) and gts_test_itimes size 5. It also includes 5*1 times "size 16" memory leak, which correspond to one time __test_init_iio_gain_scale() call with gts_test_gains_gain_low size 3 (3*size(int)) and gts_test_itimes size 5.
The reason is that per_time_gains[i], which is allocated in the "gts->num_itime" for loop in iio_gts_build_avail_scale_table(), is never freed.(CVE-2024-50231)
In the Linux kernel, the following vulnerability has been resolved:
iio: adc: ad7124: fix division by zero in ad7124_set_channel_odr()
In the ad7124_write_raw() function, parameter val can potentially be zero. This may lead to a division by zero when DIV_ROUND_CLOSEST() is called within ad7124_set_channel_odr(). The ad7124_write_raw() function is invoked through the sequence: iio_write_channel_raw() -> iio_write_channel_attribute() -> iio_channel_write(), with no checks in place to ensure val is non-zero.(CVE-2024-50232)
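The guard described above can be illustrated with a minimal Python model. The rounding helper mirrors the usual round-to-nearest integer division for non-negative operands; `set_channel_divider` and its parameters are hypothetical stand-ins, not the driver's actual function or formula.

```python
def div_round_closest(x: int, divisor: int) -> int:
    """Integer division rounded to nearest, for x >= 0 and divisor > 0."""
    return (x + divisor // 2) // divisor


def set_channel_divider(clock_hz: int, odr: int) -> int:
    """Reject a non-positive output data rate before it reaches the division."""
    if odr <= 0:
        raise ValueError("output data rate must be positive")
    return div_round_closest(clock_hz, odr)
```

Validating the value at the point where it enters from the caller keeps every later division safe without touching the arithmetic itself.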
In the Linux kernel, the following vulnerability has been resolved:
phy: qcom: qmp-usb: fix NULL-deref on runtime suspend
Commit 413db06c05e7 ("phy: qcom-qmp-usb: clean up probe initialisation") removed most users of the platform device driver data, but mistakenly also removed the initialisation despite the data still being used in the runtime PM callbacks.
Restore the driver data initialisation at probe to avoid a NULL-pointer dereference on runtime suspend.
Apparently no one uses runtime PM, which currently needs to be enabled manually through sysfs, with this driver.(CVE-2024-50240)
In the Linux kernel, the following vulnerability has been resolved:
mlxsw: spectrum_ipip: Fix memory leak when changing remote IPv6 address
The device stores IPv6 addresses that are used for encapsulation in linear memory that is managed by the driver.
Changing the remote address of an ip6gre net device never worked properly, but since cited commit the following reproducer [1] would result in a warning [2] and a memory leak [3]. The problem is that the new remote address is never added by the driver to its hash table (and therefore the device) and the old address is never removed from it.
Fix by programming the new address when the configuration of the ip6gre net device changes and removing the old one. If the address did not change, then the above would result in increasing the reference count of the address and then decreasing it.
[1]
# ip link add name bla up type ip6gre local 2001:db8:1::1 remote 2001:db8:2::1 tos inherit ttl inherit
# ip link set dev bla type ip6gre remote 2001:db8:3::1
# ip link del dev bla
# devlink dev reload pci/0000:01:00.0
[2] WARNING: CPU: 0 PID: 1682 at drivers/net/ethernet/mellanox/mlxsw/spectrum.c:3002 mlxswspipv6addrput+0x140/0x1d0 Modules linked in: CPU: 0 UID: 0 PID: 1682 Comm: ip Not tainted 6.12.0-rc3-custom-g86b5b55bc835 #151 Hardware name: Nvidia SN5600/VMOD0013, BIOS 5.13 05/31/2023 RIP: 0010:mlxswspipv6addrput+0x140/0x1d0 [...] Call Trace: <TASK> mlxswsprouternetdeviceevent+0x55f/0x1240 notifiercallchain+0x5a/0xd0 callnetdevicenotifiersinfo+0x39/0x90 unregisternetdevicemanynotify+0x63e/0x9d0 rtnldellink+0x16b/0x3a0 rtnetlinkrcvmsg+0x142/0x3f0 netlinkrcvskb+0x50/0x100 netlinkunicast+0x242/0x390 netlinksendmsg+0x1de/0x420 syssendmsg+0x2bd/0x320 syssendmsg+0x9a/0xe0 _syssendmsg+0x7a/0xd0 dosyscall64+0x9e/0x1a0 entrySYSCALL64afterhwframe+0x77/0x7f
[3] unreferenced object 0xffff898081f597a0 (size 32): comm "ip", pid 1626, jiffies 4294719324 hex dump (first 32 bytes): 20 01 0d b8 00 02 00 00 00 00 00 00 00 00 00 01 ............... 21 49 61 83 80 89 ff ff 00 00 00 00 01 00 00 00 !Ia............. backtrace (crc fd9be911): [<00000000df89c55d>] kmalloccachenoprof+0x1da/0x260 [<00000000ff2a1ddb>] mlxswspipv6addrkvdlindexget+0x281/0x340 [<000000009ddd445d>] mlxswsprouternetdeviceevent+0x47b/0x1240 [<00000000743e7757>] notifiercallchain+0x5a/0xd0 [<000000007c7b9e13>] callnetdevicenotifiersinfo+0x39/0x90 [<000000002509645d>] registernetdevice+0x5f7/0x7a0 [<00000000c2e7d2a9>] ip6grenewlinkcommon.isra.0+0x65/0x130 [<0000000087cd6d8d>] ip6grenewlink+0x72/0x120 [<000000004df7c7cc>] rtnlnewlink+0x471/0xa20 [<0000000057ed632a>] rtnetlinkrcvmsg+0x142/0x3f0 [<0000000032e0d5b5>] netlinkrcvskb+0x50/0x100 [<00000000908bca63>] netlinkunicast+0x242/0x390 [<00000000cdbe1c87>] netlinksendmsg+0x1de/0x420 [<0000000011db153e>] syssendmsg+0x2bd/0x320 [<000000003b6d53eb>] _syssendmsg+0x9a/0xe0 [<00000000cae27c62>] _syssendmsg+0x7a/0xd0(CVE-2024-50252)
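The reference-counting behaviour described in the fix (program the new address, then drop the old one, so an unchanged address is incremented and then decremented) can be modelled in a few lines of Python. The `AddrTable` class and `change_remote` helper are hypothetical illustrations, not the driver's data structures.

```python
class AddrTable:
    """Toy reference-counted table keyed by IPv6 address string."""

    def __init__(self):
        self._refs = {}

    def get(self, addr):
        """Take a reference, inserting the address on first use."""
        self._refs[addr] = self._refs.get(addr, 0) + 1

    def put(self, addr):
        """Drop a reference, removing the address when the count reaches zero."""
        count = self._refs[addr] - 1
        if count:
            self._refs[addr] = count
        else:
            del self._refs[addr]

    def refcount(self, addr):
        return self._refs.get(addr, 0)


def change_remote(table, old, new):
    """Take the new address before releasing the old one."""
    table.get(new)
    table.put(old)
```

Taking the new reference first matters when `old == new`: the count goes 1, 2, 1 and the entry is never freed in between.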
In the Linux kernel, the following vulnerability has been resolved:
netfilter: nf_reject_ipv6: fix potential crash in nf_send_reset6()
I got a syzbot report without a repro [1] crashing in nf_send_reset6()
I think the issue is that dev->hard_header_len is zero, and we attempt later to push an Ethernet header.
Use LL_MAX_HEADER, as other functions in net/ipv6/netfilter/nf_reject_ipv6.c.
[1]
skbuff: skbunderpanic: text:ffffffff89b1d008 len:74 put:14 head:ffff88803123aa00 data:ffff88803123a9f2 tail:0x3c end:0x140 dev:syztun kernel BUG at net/core/skbuff.c:206 ! Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI CPU: 0 UID: 0 PID: 7373 Comm: syz.1.568 Not tainted 6.12.0-rc2-syzkaller-00631-g6d858708d465 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 RIP: 0010:skbpanic net/core/skbuff.c:206 [inline] RIP: 0010:skbunderpanic+0x14b/0x150 net/core/skbuff.c:216 Code: 0d 8d 48 c7 c6 60 a6 29 8e 48 8b 54 24 08 8b 0c 24 44 8b 44 24 04 4d 89 e9 50 41 54 41 57 41 56 e8 ba 30 38 02 48 83 c4 20 90 <0f> 0b 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 RSP: 0018:ffffc900045269b0 EFLAGS: 00010282 RAX: 0000000000000088 RBX: dffffc0000000000 RCX: cd66dacdc5d8e800 RDX: 0000000000000000 RSI: 0000000000000200 RDI: 0000000000000000 RBP: ffff88802d39a3d0 R08: ffffffff8174afec R09: 1ffff920008a4ccc R10: dffffc0000000000 R11: fffff520008a4ccd R12: 0000000000000140 R13: ffff88803123aa00 R14: ffff88803123a9f2 R15: 000000000000003c FS: 00007fdbee5ff6c0(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000005d322000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> skbpush+0xe5/0x100 net/core/skbuff.c:2636 ethheader+0x38/0x1f0 net/ethernet/eth.c:83 devhardheader include/linux/netdevice.h:3208 [inline] nfsendreset6+0xce6/0x1270 net/ipv6/netfilter/nfrejectipv6.c:358 nftrejectineteval+0x3b9/0x690 net/netfilter/nftrejectinet.c:48 exprcallopseval net/netfilter/nftablescore.c:240 [inline] nftdochain+0x4ad/0x1da0 net/netfilter/nftablescore.c:288 nftdochaininet+0x418/0x6b0 net/netfilter/nftchainfilter.c:161 nfhookentryhookfn include/linux/netfilter.h:154 [inline] nfhookslow+0xc3/0x220 net/netfilter/core.c:626 nfhook 
include/linux/netfilter.h:269 [inline] NFHOOK include/linux/netfilter.h:312 [inline] brnfpreroutingipv6+0x63e/0x770 net/bridge/brnetfilteripv6.c:184 nfhookentryhookfn include/linux/netfilter.h:154 [inline] nfhookbridgepre net/bridge/brinput.c:277 [inline] brhandleframe+0x9fd/0x1530 net/bridge/brinput.c:424 _netifreceiveskbcore+0x13e8/0x4570 net/core/dev.c:5562 _netifreceiveskbonecore net/core/dev.c:5666 [inline] _netifreceiveskb+0x12f/0x650 net/core/dev.c:5781 netifreceiveskbinternal net/core/dev.c:5867 [inline] netifreceiveskb+0x1e8/0x890 net/core/dev.c:5926 tunrxbatched+0x1b7/0x8f0 drivers/net/tun.c:1550 tungetuser+0x3056/0x47e0 drivers/net/tun.c:2007 tunchrwriteiter+0x10d/0x1f0 drivers/net/tun.c:2053 newsyncwrite fs/readwrite.c:590 [inline] vfswrite+0xa6d/0xc90 fs/readwrite.c:683 ksyswrite+0x183/0x2b0 fs/readwrite.c:736 dosyscallx64 arch/x86/entry/common.c:52 [inline] dosyscall64+0xf3/0x230 arch/x86/entry/common.c:83 entrySYSCALL64afterhwframe+0x77/0x7f RIP: 0033:0x7fdbeeb7d1ff Code: 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 c9 8d 02 00 48 8b 54 24 18 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 31 44 89 c7 48 89 44 24 08 e8 1c 8e 02 00 48 RSP: 002b:00007fdbee5ff000 EFLAGS: 00000293 ORIGRAX: 0000000000000001 RAX: ffffffffffffffda RBX: 00007fdbeed36058 RCX: 00007fdbeeb7d1ff RDX: 000000000000008e RSI: 0000000020000040 RDI: 00000000000000c8 RBP: 00007fdbeebf12be R08: 0000000 ---truncated---(CVE-2024-50256)
In the Linux kernel, the following vulnerability has been resolved:
net: arc: fix the device for dma_map_single/dma_unmap_single
ndev->dev and pdev->dev are not the same device. Use ndev->dev.parent, which has dma_mask set and is simply pdev->dev. Otherwise, the following issue occurs:
[ 39.933526] ------------[ cut here ]------------ [ 39.938414] WARNING: CPU: 1 PID: 501 at kernel/dma/mapping.c:149 dmamappage_attrs+0x90/0x1f8(CVE-2024-50295)
In the Linux kernel, the following vulnerability has been resolved:
net: hns3: fix kernel crash when uninstalling driver
When the driver is uninstalled and the VF is disabled concurrently, a kernel crash occurs. The reason is that both actions call pci_disable_sriov(), which checks num_VFs to determine whether to release the corresponding resources. During the second call, num_VFs is not yet 0, so the resource release function is called again even though the resources were already released by the first call. Therefore, the problem occurs:
[15277.839633][T50670] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020 ... [15278.131557][T50670] Call trace: [15278.134686][T50670] klistput+0x28/0x12c [15278.138682][T50670] klistdel+0x14/0x20 [15278.142592][T50670] devicedel+0xbc/0x3c0 [15278.146676][T50670] pciremovebusdevice+0x84/0x120 [15278.151714][T50670] pcistopandremovebusdevice+0x6c/0x80 [15278.157447][T50670] pciiovremovevirtfn+0xb4/0x12c [15278.162485][T50670] sriovdisable+0x50/0x11c [15278.166829][T50670] pcidisablesriov+0x24/0x30 [15278.171433][T50670] hnae3unregisteraealgoprepare+0x60/0x90 [hnae3] [15278.178039][T50670] hclgeexit+0x28/0xd0 [hclge] [15278.182730][T50670] _sesysdeletemodule.isra.0+0x164/0x230 [15278.188550][T50670] _arm64sysdeletemodule+0x1c/0x30 [15278.193848][T50670] invokesyscall+0x50/0x11c [15278.198278][T50670] el0svccommon.constprop.0+0x158/0x164 [15278.203837][T50670] doel0svc+0x34/0xcc [15278.207834][T50670] el0svc+0x20/0x30
For details, see the following figure.
hclge_exit()            sriov_numvfs_store()
  ...                     device_lock()
  pci_disable_sriov()     hns3_pci_sriov_configure()
                            pci_disable_sriov()
    sriov_disable()           sriov_disable()
      if !num_VFs :             if !num_VFs :
        return;                   return;
      sriov_del_vfs()           sriov_del_vfs()
        ...                       ...
        klist_put()               klist_put()
        ...                       ...
      num_VFs = 0;              num_VFs = 0;
                          device_unlock();
In this patch, when the driver is being removed, we take device_lock() to protect num_VFs, just like sriov_numvfs_store().(CVE-2024-50296)
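The effect of serialising both paths on one lock can be sketched with a user-space Python model. `PciDev` and its fields are hypothetical names; the point is only that the check of the VF count and its reset to zero happen under the same lock, so the release step runs once.

```python
import threading


class PciDev:
    """Toy device whose VF teardown must run at most once."""

    def __init__(self, num_vfs):
        self.lock = threading.Lock()   # stands in for the device lock
        self.num_vfs = num_vfs
        self.release_calls = 0

    def disable_sriov(self):
        """Check-and-clear under the lock so concurrent callers cannot both release."""
        with self.lock:
            if not self.num_vfs:
                return
            self.release_calls += 1    # stands in for releasing VF resources
            self.num_vfs = 0


dev = PciDev(num_vfs=4)
workers = [threading.Thread(target=dev.disable_sriov) for _ in range(2)]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()
```

Without the lock, both callers could pass the `num_vfs` check before either clears it, which is the double release shown in the figure.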
In the Linux kernel, the following vulnerability has been resolved:
ipv4: ip_tunnel: Fix suspicious RCU usage warning in ip_tunnel_find()
The per-netns IP tunnel hash table is protected by the RTNL mutex and ip_tunnel_find() is only called from the control path where the mutex is taken.
Add a lockdep expression to hlist_for_each_entry_rcu() in ip_tunnel_find() in order to validate that the mutex is held and to silence the suspicious RCU usage warning [1].
[1] WARNING: suspicious RCU usage
net/ipv4/ip_tunnel.c:221 RCU-list traversed in non-reader section!!
other info that might help us debug this:
rcuscheduleractive = 2, debuglocks = 1 1 lock held by ip/362: #0: ffffffff86fc7cb0 (rtnlmutex){+.+.}-{3:3}, at: rtnetlinkrcvmsg+0x377/0xf60
stack backtrace: CPU: 12 UID: 0 PID: 362 Comm: ip Not tainted 6.12.0-rc3-custom-gd95d9a31aceb #139 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Call Trace: <TASK> dumpstacklvl+0xba/0x110 lockdeprcususpicious.cold+0x4f/0xd6 iptunnelfind+0x435/0x4d0 iptunnelnewlink+0x517/0x7a0 ipgrenewlink+0x14c/0x170 rtnlnewlink+0x1173/0x19c0 rtnlnewlink+0x6c/0xa0 rtnetlinkrcvmsg+0x3cc/0xf60 netlinkrcvskb+0x171/0x450 netlinkunicast+0x539/0x7f0 netlinksendmsg+0x8c1/0xd80 _syssendmsg+0x8f9/0xc20 _syssendmsg+0x197/0x1e0 _syssendmsg+0x122/0x1f0 dosyscall64+0xbb/0x1d0 entrySYSCALL64afterhwframe+0x77/0x7f(CVE-2024-50304)
In the Linux kernel, the following vulnerability has been resolved:
drm/i915/hdcp: Add encoder check in intel_hdcp_get_capability
During hotplug or suspend/resume scenarios, the encoder is not always initialized when intel_hdcp_get_capability is called. Add a check to avoid a kernel NULL pointer dereference.(CVE-2024-53051)
In the Linux kernel, the following vulnerability has been resolved:
net: stmmac: TSO: Fix unbalanced DMA map/unmap for non-paged SKB data
If the non-paged data of an SKB carries both protocol header and protocol payload to be transmitted on a platform whose DMA AXI address width is configured to 40-bit/48-bit, or if the size of the non-paged data is bigger than TSO_MAX_BUFF_SIZE on a platform whose DMA AXI address width is configured to 32-bit, then this SKB requires at least two DMA transmit descriptors to serve it.
For example, three descriptors are allocated to split one DMA buffer mapped from one piece of non-paged data: dma_desc[N + 0], dma_desc[N + 1], dma_desc[N + 2]. Then three elements of tx_q->tx_skbuff_dma[] will be allocated to hold extra information to be reused in stmmac_tx_clean(): tx_q->tx_skbuff_dma[N + 0], tx_q->tx_skbuff_dma[N + 1], tx_q->tx_skbuff_dma[N + 2]. Now we focus on tx_q->tx_skbuff_dma[entry].buf, which is the DMA buffer address returned by the DMA mapping call. stmmac_tx_clean() will try to unmap the DMA buffer only if tx_q->tx_skbuff_dma[entry].buf is a valid buffer address.
The expected behavior that saves the DMA buffer address of this non-paged data to tx_q->tx_skbuff_dma[entry].buf is:
  tx_q->tx_skbuff_dma[N + 0].buf = NULL;
  tx_q->tx_skbuff_dma[N + 1].buf = NULL;
  tx_q->tx_skbuff_dma[N + 2].buf = dma_map_single();
Unfortunately, the current code misbehaves like this:
  tx_q->tx_skbuff_dma[N + 0].buf = dma_map_single();
  tx_q->tx_skbuff_dma[N + 1].buf = NULL;
  tx_q->tx_skbuff_dma[N + 2].buf = NULL;
On the stmmac_tx_clean() side, when dma_desc[N + 0] is closed by the DMA engine, tx_q->tx_skbuff_dma[N + 0].buf is obviously a valid buffer address, so the DMA buffer will be unmapped immediately. In rare cases the DMA engine has not yet finished the pending dma_desc[N + 1] and dma_desc[N + 2]. Things then go horribly wrong: DMA is going to access an unmapped/unreferenced memory region, and corrupted data will be transmitted or an iommu fault will be triggered :(
In contrast, the for-loop that maps SKB fragments behaves perfectly as expected, and that is how the driver should behave for both non-paged data and paged frags.
This patch corrects DMA map/unmap sequences by fixing the array index for tx_q->tx_skbuff_dma[entry].buf when assigning the DMA buffer address.
Tested and verified on DWXGMAC CORE 3.20a(CVE-2024-53058)
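The indexing problem can be made concrete with a small Python model, assuming descriptors complete in order and the cleaner unmaps as soon as it meets a slot holding an address. Both function names and the `record_on_last` switch are hypothetical; this is a sketch of the bookkeeping, not the driver code.

```python
def map_nonpaged(num_desc: int, dma_addr: int, record_on_last: bool) -> list:
    """Return per-descriptor bookkeeping; exactly one slot holds the DMA address."""
    slots = [None] * num_desc
    slots[-1 if record_on_last else 0] = dma_addr
    return slots


def descriptors_pending_at_unmap(slots) -> int:
    """Walk slots in completion order and report how many descriptors
    are still outstanding at the moment the buffer gets unmapped."""
    for index, buf in enumerate(slots):
        if buf is not None:
            return len(slots) - index - 1
    raise ValueError("no DMA address recorded")


buggy = map_nonpaged(3, 0x1000, record_on_last=False)
fixed = map_nonpaged(3, 0x1000, record_on_last=True)
```

With the address stored in slot N + 0 the unmap happens while two descriptors may still be in flight; storing it in the last slot defers the unmap until none remain.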
In the Linux kernel, the following vulnerability has been resolved:
bpf: Add sk_is_inet and IS_ICSK check in tls_sw_has_ctx_tx/rx
Since support for vsock and unix sockets was introduced in sockmap, tls_sw_has_ctx_tx/rx can no longer presume that the socket passed in is IS_ICSK. vsock and af_unix sockets have vsock_sock and unix_sock instead of inet_connection_sock. For these sockets, tls_get_ctx may return an invalid pointer and cause a page fault in function tls_sw_ctx_rx.
BUG: unable to handle page fault for address: 0000000000040030 Workqueue: vsock-loopback vsockloopbackwork RIP: 0010:skpsockstrpdataready+0x23/0x60 Call Trace: ? _die+0x81/0xc3 ? nocontext+0x194/0x350 ? dopagefault+0x30/0x110 ? asyncpagefault+0x3e/0x50 ? skpsockstrpdataready+0x23/0x60 virtiotransportrecvpkt+0x750/0x800 ? updateloadavg+0x7e/0x620 vsockloopbackwork+0xd0/0x100 processonework+0x1a7/0x360 workerthread+0x30/0x390 ? createworker+0x1a0/0x1a0 kthread+0x112/0x130 ? _kthreadcancelwork+0x40/0x40 retfromfork+0x1f/0x40
v2: - Add IS_ICSK check v3: - Update the commits in Fixes(CVE-2024-53091)
In the Linux kernel, the following vulnerability has been resolved:
nvme-multipath: defer partition scanning
We need to suppress the partition scan from occurring within the controller's scan_work context. If a path error occurs here, the IO will wait until a path becomes available or all paths are torn down, but that action also occurs within scan_work, so it would deadlock. Defer the partition scan to a different context that does not block scan_work.(CVE-2024-53093)
In the Linux kernel, the following vulnerability has been resolved:
RDMA/siw: Add sendpage_ok() check to disable MSG_SPLICE_PAGES
While running ISER over SIW, the initiator machine encounters a warning from skb_splice_from_iter() indicating that a slab page is being used in send_page. To address this, it is better to add a sendpage_ok() check within the driver itself, and if it returns 0, then the MSG_SPLICE_PAGES flag should be disabled before entering the network stack.
A similar issue has been discussed for NVMe in this thread: https://lore.kernel.org/all/20240530142417.146696-1-ofir.gal@volumez.com/
WARNING: CPU: 0 PID: 5342 at net/core/skbuff.c:7140 skbsplicefromiter+0x173/0x320 Call Trace: tcpsendmsglocked+0x368/0xe40 siwtxhdt+0x695/0xa40 [siw] siwqpsqprocess+0x102/0xb00 [siw] siwsqresume+0x39/0x110 [siw] siwrunsq+0x74/0x160 [siw] kthread+0xd2/0x100 retfromfork+0x34/0x40 retfromfork_asm+0x1a/0x30(CVE-2024-53094)
In the Linux kernel, the following vulnerability has been resolved:
mm: krealloc: Fix MTE false alarm in __do_krealloc
This patch addresses an issue introduced by commit 1a83a716ec233 ("mm: krealloc: consider spare memory for __GFP_ZERO") which causes MTE (Memory Tagging Extension) to falsely report a slab-out-of-bounds error.
The problem occurs when zeroing out spare memory in __do_krealloc. The original code only considered software-based KASAN and did not account for MTE. It does not reset the KASAN tag before calling memset, leading to a mismatch between the pointer tag and the memory tag, resulting in a false positive.
swapper/0: BUG: KASAN: slab-out-of-bounds in _memset+0x84/0x188 swapper/0: Write at addr f4ffff8005f0fdf0 by task swapper/0/1 swapper/0: Pointer tag: [f4], memory tag: [fe] swapper/0: swapper/0: CPU: 4 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.12. swapper/0: Hardware name: MT6991(ENG) (DT) swapper/0: Call trace: swapper/0: dumpbacktrace+0xfc/0x17c swapper/0: showstack+0x18/0x28 swapper/0: dumpstacklvl+0x40/0xa0 swapper/0: printreport+0x1b8/0x71c swapper/0: kasanreport+0xec/0x14c swapper/0: _dokernelfault+0x60/0x29c swapper/0: dobadarea+0x30/0xdc swapper/0: dotagcheckfault+0x20/0x34 swapper/0: domemabort+0x58/0x104 swapper/0: el1abort+0x3c/0x5c swapper/0: el1h64synchandler+0x80/0xcc swapper/0: el1h64sync+0x68/0x6c swapper/0: _memset+0x84/0x188 swapper/0: btfpopulatekfuncset+0x280/0x3d8 swapper/0: _registerbtfkfuncidset+0x43c/0x468 swapper/0: registerbtfkfuncidset+0x48/0x60 swapper/0: registernfnatbpf+0x1c/0x40 swapper/0: nfnatinit+0xc0/0x128 swapper/0: dooneinitcall+0x184/0x464 swapper/0: doinitcalllevel+0xdc/0x1b0 swapper/0: doinitcalls+0x70/0xc0 swapper/0: dobasicsetup+0x1c/0x28 swapper/0: kernelinitfreeable+0x144/0x1b8 swapper/0: kernelinit+0x20/0x1a8 swapper/0: retfrom_fork+0x10/0x20 ==================================================================(CVE-2024-53097)
In the Linux kernel, the following vulnerability has been resolved:
nvme: tcp: avoid race between queue_lock lock and destroy
Commit 76d54bf20cdc ("nvme-tcp: don't access released socket during error recovery") added a mutex_lock() call for the queue->queue_lock in nvme_tcp_get_address(). However, the mutex_lock() races with mutex_destroy() in nvme_tcp_free_queue(), and causes the WARN below.
DEBUGLOCKSWARNON(lock->magic != lock) WARNING: CPU: 3 PID: 34077 at kernel/locking/mutex.c:587 mutexlock+0xcf0/0x1220 Modules linked in: nvmettcp nvmet nvmetcp nvmefabrics iwcm ibcm ibcore pktcdvd nftfibinet nftfibipv4 nftfibipv6 nftfib nftrejectinet nfrejectipv4 nfrejectipv6 nftreject nftct nftchainnat nfnat nfconntrack nfdefragipv6 nfdefragipv4 ipset nftables qrtr sunrpc ppdev 9pnetvirtio 9pnet pcspkr netfs parportpc parport e1000 i2cpiix4 i2csmbus loop fuse nfnetlink zram bochs drmvramhelper drmttmhelper ttm drmkmshelper xfs drm sym53c8xx floppy nvme scsitransportspi nvmecore nvmeauth serioraw atageneric pataacpi dmmultipath qemufwcfg [last unloaded: ibuverbs] CPU: 3 UID: 0 PID: 34077 Comm: udisksd Not tainted 6.11.0-rc7 #319 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-2.fc40 04/01/2014 RIP: 0010:mutexlock+0xcf0/0x1220 Code: 08 84 d2 0f 85 c8 04 00 00 8b 15 ef b6 c8 01 85 d2 0f 85 78 f4 ff ff 48 c7 c6 20 93 ee af 48 c7 c7 60 91 ee af e8 f0 a7 6d fd <0f> 0b e9 5e f4 ff ff 48 b8 00 00 00 00 00 fc ff df 4c 89 f2 48 c1 RSP: 0018:ffff88811305f760 EFLAGS: 00010286 RAX: 0000000000000000 RBX: ffff88812c652058 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000001 RBP: ffff88811305f8b0 R08: 0000000000000001 R09: ffffed1075c36341 R10: ffff8883ae1b1a0b R11: 0000000000010498 R12: 0000000000000000 R13: 0000000000000000 R14: dffffc0000000000 R15: ffff88812c652058 FS: 00007f9713ae4980(0000) GS:ffff8883ae180000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fcd78483c7c CR3: 0000000122c38000 CR4: 00000000000006f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> ? warn.cold+0x5b/0x1af ? _mutexlock+0xcf0/0x1220 ? reportbug+0x1ec/0x390 ? handlebug+0x3c/0x80 ? excinvalidop+0x13/0x40 ? asmexcinvalidop+0x16/0x20 ? _mutexlock+0xcf0/0x1220 ? nvmetcpgetaddress+0xc2/0x1e0 [nvmetcp] ? 
_pfxmutexlock+0x10/0x10 ? _lockacquire+0xd6a/0x59e0 ? nvmetcpgetaddress+0xc2/0x1e0 [nvmetcp] nvmetcpgetaddress+0xc2/0x1e0 [nvmetcp] ? _pfxnvmetcpgetaddress+0x10/0x10 [nvmetcp] nvmesysfsshowaddress+0x81/0xc0 [nvmecore] devattrshow+0x42/0x80 ? _asanmemset+0x1f/0x40 sysfskfseqshow+0x1f0/0x370 seqreaditer+0x2cb/0x1130 ? rwverifyarea+0x3b1/0x590 ? _mutexlock+0x433/0x1220 vfsread+0x6a6/0xa20 ? lockdephardirqson+0x78/0x100 ? _pfxvfsread+0x10/0x10 ksysread+0xf7/0x1d0 ? _pfxksysread+0x10/0x10 ? _x64sysopenat+0x105/0x1d0 dosyscall64+0x93/0x180 ? lockdephardirqsonprepare+0x16d/0x400 ? dosyscall64+0x9f/0x180 ? lockdephardirqson+0x78/0x100 ? dosyscall64+0x9f/0x180 ? _pfxksysread+0x10/0x10 ? lockdephardirqsonprepare+0x16d/0x400 ? dosyscall64+0x9f/0x180 ? lockdephardirqson+0x78/0x100 ? dosyscall64+0x9f/0x180 ? lockdephardirqsonprepare+0x16d/0x400 ? dosyscall64+0x9f/0x180 ? lockdephardirqson+0x78/0x100 ? dosyscall64+0x9f/0x180 ? lockdephardirqsonprepare+0x16d/0x400 ? dosyscall64+0x9f/0x180 ? lockdephardirqson+0x78/0x100 ? dosyscall64+0x9f/0x180 ? lockdephardirqsonprepare+0x16d/0x400 ? dosyscall64+0x9f/0x180 ? lockdephardirqson+0x78/0x100 ? dosyscall64+0x9f/0x180 ? dosyscall64+0x9f/0x180 entrySYSCALL64after_hwframe+0x76/0x7e RIP: 0033:0x7f9713f55cfa Code: 55 48 89 e5 48 83 ec 20 48 89 55 e8 48 89 75 f0 89 7d f8 e8 e8 74 f8 ff 48 8b 55 e8 48 8b 75 f0 4 ---truncated---(CVE-2024-53100)
In the Linux kernel, the following vulnerability has been resolved:
ima: fix buffer overrun in ima_eventdigest_init_common
Function ima_eventdigest_init() calls ima_eventdigest_init_common() with HASH_ALGO__LAST which is then used to access the array hash_digest_size[] leading to buffer overrun. Have a conditional statement to handle this.(CVE-2024-53106)
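The conditional can be sketched as a bounds check on a sentinel index. The three-entry table below (MD5, SHA-1, SHA-256 digest sizes) is an illustrative subset, and `digest_size` with its `fallback` parameter is a hypothetical helper; how the real fix chooses its fallback is not stated in the text above.

```python
# Illustrative subset of digest sizes in bytes: MD5, SHA-1, SHA-256.
HASH_DIGEST_SIZE = [16, 20, 32]

# Sentinel one past the last valid algorithm index; never a valid index itself.
HASH_ALGO_LAST = len(HASH_DIGEST_SIZE)


def digest_size(algo: int, fallback: int) -> int:
    """Look up a digest size, returning `fallback` for the sentinel or any
    other out-of-range value instead of indexing past the table."""
    if algo < 0 or algo >= HASH_ALGO_LAST:
        return fallback
    return HASH_DIGEST_SIZE[algo]
```

Python would raise IndexError on the bad index, whereas C silently reads past the array, which is why the explicit check is needed there.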
In the Linux kernel, the following vulnerability has been resolved:
nommu: pass NULL argument to vma_iter_prealloc()
When deleting a vma entry from a maple tree, it has to pass NULL to vma_iter_prealloc() in order to calculate internal state of the tree, but it passed a wrong argument. As a result, nommu kernels crashed upon accessing a vma iterator, such as acct_collect() reading the size of vma entries after do_munmap().
This commit fixes this issue by passing the correct argument to the preallocation call.(CVE-2024-53109)
In the Linux kernel, the following vulnerability has been resolved:
mm: fix NULL pointer dereference in alloc_pages_bulk_noprof
We triggered a NULL pointer dereference for ac.preferred_zoneref->zone in alloc_pages_bulk_noprof() when the task is migrated between cpusets.
When cpuset is enabled, in prepare_alloc_pages(), ac->nodemask may be &current->mems_allowed. When first_zones_zonelist() is called to find preferred_zoneref, the ac->nodemask may be modified concurrently if the task is migrated between different cpusets. Assuming we have 2 NUMA nodes, when traversing Node1 in ac->zonelist, the nodemask is 2, and when traversing Node2 in ac->zonelist, the nodemask is 1. As a result, ac->preferred_zoneref points to a NULL zone.
In alloc_pages_bulk_noprof(), for_each_zone_zonelist_nodemask() finds an allowable zone and calls zonelist_node_idx(ac.preferred_zoneref), leading to a NULL pointer dereference.
__alloc_pages_noprof() fixes this issue by checking the NULL pointer in commit ea57485af8f4 ("mm, page_alloc: fix check for NULL preferred_zone") and commit df76cee6bbeb ("mm, page_alloc: remove redundant checks from alloc fastpath").
To fix it, check NULL pointer for preferred_zoneref->zone.(CVE-2024-53113)
In the Linux kernel, the following vulnerability has been resolved:
virtio/vsock: Fix accept_queue memory leak
As the final stages of socket destruction may be delayed, it is possible that virtiotransportrecvlisten() will be called after the acceptqueue has been flushed, but before the SOCK_DONE flag has been set. As a result, sockets enqueued after the flush would remain unremoved, leading to a memory leak.
vsockrelease
  _vsockrelease
    lock
    virtiotransportrelease
      virtiotransportclose
        scheduledelayedwork(closework)
    skshutdown = SHUTDOWNMASK
(!) flush acceptqueue
    release
                                        virtiotransportrecvpkt
                                          vsockfindboundsocket
                                          lock
                                          if flag(SOCKDONE) return
                                          virtiotransportrecvlisten
                                            child = vsockcreateconnected
                                      (!)   vsockenqueueaccept(child)
                                          release
closework
  lock
  virtiotransportdoclose
    setflag(SOCKDONE)
    virtiotransportremovesock
      vsockremovesock
        vsockremovebound
  release
Introduce a skshutdown check to disallow vsockenqueue_accept() during socket destruction.
unreferenced object 0xffff888109e3f800 (size 2040): comm "kworker/5:2", pid 371, jiffies 4294940105 hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 28 00 0b 40 00 00 00 00 00 00 00 00 00 00 00 00 (..@............ backtrace (crc 9e5f4e84): [<ffffffff81418ff1>] kmemcacheallocnoprof+0x2c1/0x360 [<ffffffff81d27aa0>] skprotalloc+0x30/0x120 [<ffffffff81d2b54c>] skalloc+0x2c/0x4b0 [<ffffffff81fe049a>] _vsockcreate.constprop.0+0x2a/0x310 [<ffffffff81fe6d6c>] virtiotransportrecvpkt+0x4dc/0x9a0 [<ffffffff81fe745d>] vsockloopbackwork+0xfd/0x140 [<ffffffff810fc6ac>] processonework+0x20c/0x570 [<ffffffff810fce3f>] workerthread+0x1bf/0x3a0 [<ffffffff811070dd>] kthread+0xdd/0x110 [<ffffffff81044fdd>] retfromfork+0x2d/0x50 [<ffffffff8100785a>] retfromfork_asm+0x1a/0x30(CVE-2024-53119)
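The window described in the vsock entry above can be modelled in user space. The sketch below is a toy, not the kernel code: the Listener class, its method names, and the numeric value of SHUTDOWN_MASK are assumptions for illustration. It shows a child socket being enqueued after the accept queue was flushed but before the done flag is set, and how a shutdown check closes that window.

```python
from collections import deque

SHUTDOWN_MASK = 3  # illustrative stand-in for the kernel constant


class Listener:
    """Toy model of a listening socket with an accept queue."""

    def __init__(self):
        self.shutdown = 0
        self.done = False            # models the SOCK_DONE flag
        self.accept_queue = deque()

    def release(self):
        # Release path: mark the socket as shut down and flush the queue.
        # The done flag is only set later, by the delayed close work.
        self.shutdown = SHUTDOWN_MASK
        self.accept_queue.clear()

    def close_work(self):
        self.done = True

    def recv_listen(self, child, check_shutdown):
        if self.done:
            return False
        if check_shutdown and self.shutdown == SHUTDOWN_MASK:
            return False             # refuse to enqueue during destruction
        self.accept_queue.append(child)
        return True


def leaked_children(check_shutdown):
    """Replay the race and return how many children are left in the queue."""
    sk = Listener()
    sk.release()                             # queue flushed here
    sk.recv_listen("child", check_shutdown)  # packet arrives inside the window
    sk.close_work()                          # done flag set too late
    return len(sk.accept_queue)
```

Calling leaked_children(False) replays the leak; leaked_children(True) shows the same interleaving with the shutdown check in place.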
In the Linux kernel, the following vulnerability has been resolved:
net/mlx5e: CT: Fix null-ptr-deref in add rule err flow
In the error flow of mlx5tcctentryaddrule(), if the ctruleadd() callback returns an error, zonerule->attr is used uninitialized. Fix it to use attr, which has the needed pointer value.
Kernel log: BUG: kernel NULL pointer dereference, address: 0000000000000110 RIP: 0010:mlx5tcctentryaddrule+0x2b1/0x2f0 [mlx5core] … Call Trace: <TASK> ? _die+0x20/0x70 ? pagefaultoops+0x150/0x3e0 ? excpagefault+0x74/0x140 ? asmexcpagefault+0x22/0x30 ? mlx5tcctentryaddrule+0x2b1/0x2f0 [mlx5core] ? mlx5tcctentryaddrule+0x1d5/0x2f0 [mlx5core] mlx5tcctblockflowoffload+0xc6a/0xf90 [mlx5core] ? nfflowoffloadtuple+0xd8/0x190 [nfflowtable] nfflowoffloadtuple+0xd8/0x190 [nfflowtable] flowoffloadworkhandler+0x142/0x320 [nfflowtable] ? finishtaskswitch.isra.0+0x15b/0x2b0 processonework+0x16c/0x320 workerthread+0x28c/0x3a0 ? _pfxworkerthread+0x10/0x10 kthread+0xb8/0xf0 ? _pfxkthread+0x10/0x10 retfromfork+0x2d/0x50 ? _pfxkthread+0x10/0x10 retfromforkasm+0x1a/0x30 </TASK>(CVE-2024-53120)
In the Linux kernel, the following vulnerability has been resolved:
net/mlx5: fs, lock FTE when checking if active
The referenced commits introduced a two-step process for deleting FTEs:
However, this approach encounters a race condition if a rule with the same match value is added simultaneously. In this scenario, fs_core may set the hardware deletion function to NULL prematurely, causing a panic during subsequent rule deletions.
To prevent this, ensure the active flag of the FTE is checked under a lock, which will prevent the fs_core layer from attaching a new steering rule to an FTE that is in the process of deletion.
[ 438.967589] MOSHE: 2496 mlx5delflowrules delhwfunc [ 438.968205] ------------[ cut here ]------------ [ 438.968654] refcountt: decrement hit 0; leaking memory. [ 438.969249] WARNING: CPU: 0 PID: 8957 at lib/refcount.c:31 refcountwarnsaturate+0xfb/0x110 [ 438.970054] Modules linked in: actmirred clsflower actgact schingress openvswitch nsh mlx5vdpa vringh vhostiotlb vdpa mlx5ib mlx5core xtconntrack xtMASQUERADE nfconntracknetlink nfnetlink xtaddrtype iptablenat nfnat brnetfilter rpcsecgsskrb5 authrpcgss oidregistry overlay rpcrdma rdmaucm ibiser libiscsi scsitransportiscsi ibumad rdmacm ibipoib iwcm ibcm ibuverbs ibcore zram zsmalloc fuse [last unloaded: clsflower] [ 438.973288] CPU: 0 UID: 0 PID: 8957 Comm: tc Not tainted 6.12.0-rc1+ #8 [ 438.973888] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 [ 438.974874] RIP: 0010:refcountwarnsaturate+0xfb/0x110 [ 438.975363] Code: 40 66 3b 82 c6 05 16 e9 4d 01 01 e8 1f 7c a0 ff 0f 0b c3 cc cc cc cc 48 c7 c7 10 66 3b 82 c6 05 fd e8 4d 01 01 e8 05 7c a0 ff <0f> 0b c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 90 [ 438.976947] RSP: 0018:ffff888124a53610 EFLAGS: 00010286 [ 438.977446] RAX: 0000000000000000 RBX: ffff888119d56de0 RCX: 0000000000000000 [ 438.978090] RDX: ffff88852c828700 RSI: ffff88852c81b3c0 RDI: ffff88852c81b3c0 [ 438.978721] RBP: ffff888120fa0e88 R08: 0000000000000000 R09: ffff888124a534b0 [ 438.979353] R10: 0000000000000001 R11: 0000000000000001 R12: ffff888119d56de0 [ 438.979979] R13: ffff888120fa0ec0 R14: ffff888120fa0ee8 R15: ffff888119d56de0 [ 438.980607] FS: 00007fe6dcc0f800(0000) GS:ffff88852c800000(0000) knlGS:0000000000000000 [ 438.983984] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 438.984544] CR2: 00000000004275e0 CR3: 0000000186982001 CR4: 0000000000372eb0 [ 438.985205] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 438.985842] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 
438.986507] Call Trace: [ 438.986799] <TASK> [ 438.987070] ? _warn+0x7d/0x110 [ 438.987426] ? refcountwarnsaturate+0xfb/0x110 [ 438.987877] ? reportbug+0x17d/0x190 [ 438.988261] ? prbreadvalid+0x17/0x20 [ 438.988659] ? handlebug+0x53/0x90 [ 438.989054] ? excinvalidop+0x14/0x70 [ 438.989458] ? asmexcinvalidop+0x16/0x20 [ 438.989883] ? refcountwarnsaturate+0xfb/0x110 [ 438.990348] mlx5delflowrules+0x2f7/0x340 [mlx5core] [ 438.990932] _mlx5eswitchdelrule+0x49/0x170 [mlx5core] [ 438.991519] ? mlx5lagissriov+0x3c/0x50 [mlx5core] [ 438.992054] ? xasload+0x9/0xb0 [ 438.992407] mlx5etcruleunoffload+0x45/0xe0 [mlx5core] [ 438.993037] mlx5etcdelfdbflow+0x2a6/0x2e0 [mlx5core] [ 438.993623] mlx5eflowput+0x29/0x60 [mlx5core] [ 438.994161] mlx5edeleteflower+0x261/0x390 [mlx5core] [ 438.994728] tcsetupcbdestroy+0xb9/0x190 [ 438.995150] flhwdestroyfilter+0x94/0xc0 [clsflower] [ 438.995650] flchange+0x11a4/0x13c0 [clsflower] [ 438.996105] tcnewtfilter+0x347/0xbc0 [ 438.996503] ? __ ---truncated---(CVE-2024-53121)
In the Linux kernel, the following vulnerability has been resolved:
mptcp: cope racing subflow creation in mptcprcvspace_adjust
Additional active subflows - i.e. those created by the in-kernel path manager - are added to the subflow list before starting the 3whs.
A racing recvmsg() spooling data received on an already established subflow would unconditionally call tcpcleanuprbuf() on all the current subflows, potentially hitting a divide by zero error on the newly created ones.
Explicitly check that the subflow is in a suitable state before invoking tcpcleanuprbuf().(CVE-2024-53122)
In the Linux kernel, the following vulnerability has been resolved:
mptcp: error out earlier on disconnect
Eric reported a division by zero splat in the MPTCP protocol:
Oops: divide error: 0000 [#1] PREEMPT SMP KASAN PTI CPU: 1 UID: 0 PID: 6094 Comm: syz-executor317 Not tainted 6.12.0-rc5-syzkaller-00291-g05b92660cdfe #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 RIP: 0010:_tcpselectwindow+0x5b4/0x1310 net/ipv4/tcpoutput.c:3163 Code: f6 44 01 e3 89 df e8 9b 75 09 f8 44 39 f3 0f 8d 11 ff ff ff e8 0d 74 09 f8 45 89 f4 e9 04 ff ff ff e8 00 74 09 f8 44 89 f0 99 <f7> 7c 24 14 41 29 d6 45 89 f4 e9 ec fe ff ff e8 e8 73 09 f8 48 89 RSP: 0018:ffffc900041f7930 EFLAGS: 00010293 RAX: 0000000000017e67 RBX: 0000000000017e67 RCX: ffffffff8983314b RDX: 0000000000000000 RSI: ffffffff898331b0 RDI: 0000000000000004 RBP: 00000000005d6000 R08: 0000000000000004 R09: 0000000000017e67 R10: 0000000000003e80 R11: 0000000000000000 R12: 0000000000003e80 R13: ffff888031d9b440 R14: 0000000000017e67 R15: 00000000002eb000 FS: 00007feb5d7f16c0(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007feb5d8adbb8 CR3: 0000000074e4c000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> _tcpcleanuprbuf+0x3e7/0x4b0 net/ipv4/tcp.c:1493 mptcprcvspaceadjust net/mptcp/protocol.c:2085 [inline] mptcprecvmsg+0x2156/0x2600 net/mptcp/protocol.c:2289 inetrecvmsg+0x469/0x6a0 net/ipv4/afinet.c:885 sockrecvmsgnosec net/socket.c:1051 [inline] sockrecvmsg+0x1b2/0x250 net/socket.c:1073 _sysrecvfrom+0x1a5/0x2e0 net/socket.c:2265 _dosysrecvfrom net/socket.c:2283 [inline] _sesysrecvfrom net/socket.c:2279 [inline] _x64sysrecvfrom+0xe0/0x1c0 net/socket.c:2279 dosyscallx64 arch/x86/entry/common.c:52 [inline] dosyscall64+0xcd/0x250 arch/x86/entry/common.c:83 entrySYSCALL64afterhwframe+0x77/0x7f RIP: 0033:0x7feb5d857559 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 51 18 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 
73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007feb5d7f1208 EFLAGS: 00000246 ORIGRAX: 000000000000002d RAX: ffffffffffffffda RBX: 00007feb5d8e1318 RCX: 00007feb5d857559 RDX: 000000800000000e RSI: 0000000000000000 RDI: 0000000000000003 RBP: 00007feb5d8e1310 R08: 0000000000000000 R09: ffffffff81000000 R10: 0000000000000100 R11: 0000000000000246 R12: 00007feb5d8e131c R13: 00007feb5d8ae074 R14: 000000800000000e R15: 00000000fffffdef
and provided a nice reproducer.
The root cause is the current bad handling of racing disconnect. After the blamed commit below, skwaitdata() can return (with error) with the underlying socket disconnected and a zero rcv_mss.
Catch the error and return without performing any additional operations on the current socket.(CVE-2024-53123)
In the Linux kernel, the following vulnerability has been resolved:
net: fix data-races around sk->skforwardalloc
Syzkaller reported this warning: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 16 at net/ipv4/afinet.c:156 inetsockdestruct+0x1c5/0x1e0 Modules linked in: CPU: 0 UID: 0 PID: 16 Comm: ksoftirqd/0 Not tainted 6.12.0-rc5 #26 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 RIP: 0010:inetsockdestruct+0x1c5/0x1e0 Code: 24 12 4c 89 e2 5b 48 c7 c7 98 ec bb 82 41 5c e9 d1 18 17 ff 4c 89 e6 5b 48 c7 c7 d0 ec bb 82 41 5c e9 bf 18 17 ff 0f 0b eb 83 <0f> 0b eb 97 0f 0b eb 87 0f 0b e9 68 ff ff ff 66 66 2e 0f 1f 84 00 RSP: 0018:ffffc9000008bd90 EFLAGS: 00010206 RAX: 0000000000000300 RBX: ffff88810b172a90 RCX: 0000000000000007 RDX: 0000000000000002 RSI: 0000000000000300 RDI: ffff88810b172a00 RBP: ffff88810b172a00 R08: ffff888104273c00 R09: 0000000000100007 R10: 0000000000020000 R11: 0000000000000006 R12: ffff88810b172a00 R13: 0000000000000004 R14: 0000000000000000 R15: ffff888237c31f78 FS: 0000000000000000(0000) GS:ffff888237c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007ffc63fecac8 CR3: 000000000342e000 CR4: 00000000000006f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> ? _warn+0x88/0x130 ? inetsockdestruct+0x1c5/0x1e0 ? reportbug+0x18e/0x1a0 ? handlebug+0x53/0x90 ? excinvalidop+0x18/0x70 ? asmexcinvalidop+0x1a/0x20 ? inetsockdestruct+0x1c5/0x1e0 _skdestruct+0x2a/0x200 rcudobatch+0x1aa/0x530 ? rcudobatch+0x13b/0x530 rcucore+0x159/0x2f0 handlesoftirqs+0xd3/0x2b0 ? _pfxsmpbootthreadfn+0x10/0x10 runksoftirqd+0x25/0x30 smpbootthreadfn+0xdd/0x1d0 kthread+0xd3/0x100 ? _pfxkthread+0x10/0x10 retfromfork+0x34/0x50 ? _pfxkthread+0x10/0x10 retfromfork_asm+0x1a/0x30 </TASK> ---[ end trace 0000000000000000 ]---
It is possible that two threads call tcpv6dorcv()/skforwardallocadd() concurrently when sk->skstate == TCPLISTEN with sk->sklock unlocked, which triggers a data-race around sk->skforwardalloc: tcpv6rcv tcpv6dorcv skbcloneandcharger skrmemschedule _skmemschedule skforwardallocadd() skbsetownerr skmemcharge skforwardallocadd() _kfreeskb skbreleaseall skbreleaseheadstate sockrfree skmemuncharge skforwardallocadd() skmemreclaim // set local var reclaimable _skmemreclaim skforwardalloc_add()
In this syzkaller testcase, two threads call tcpv6dorcv() with skb->truesize=768, the skforwardalloc changes like this:
 (cpu 1)            | (cpu 2)            | skforwardalloc
 ...                | ...                | 0
 _skmemschedule()   |                    | +4096 = 4096
                    | _skmemschedule()   | +4096 = 8192
 skmemcharge()      |                    | -768 = 7424
                    | skmemcharge()      | -768 = 6656
 ...                | ...                |
 skmemuncharge()    |                    | +768 = 7424
 reclaimable=7424   |                    |
                    | skmemuncharge()    | +768 = 8192
                    | reclaimable=8192   |
 _skmemreclaim()    |                    | -4096 = 4096
                    | _skmem_reclaim()   | -8192 = -4096 != 0
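The arithmetic of this interleaving can be replayed step by step. The sketch below is a plain user-space replay of the numbers given above, with no real concurrency; replay() and reclaim_amount() are names invented for the example. The rule that each reclaim removes only whole 4096-byte pages from its snapshot is an assumption inferred from the figures (7424 leads to -4096, 8192 leads to -8192).

```python
PAGE_SIZE = 4096


def reclaim_amount(reclaimable):
    """Whole pages contained in `reclaimable` (assumed rounding rule)."""
    return (reclaimable // PAGE_SIZE) * PAGE_SIZE


def replay(truesize=768):
    """Replay the two-CPU interleaving and return the final counter value."""
    fwd = 0
    fwd += PAGE_SIZE                      # cpu 1: schedule  -> 4096
    fwd += PAGE_SIZE                      # cpu 2: schedule  -> 8192
    fwd -= truesize                       # cpu 1: charge    -> 7424
    fwd -= truesize                       # cpu 2: charge    -> 6656
    fwd += truesize                       # cpu 1: uncharge  -> 7424
    reclaimable_1 = fwd                   # cpu 1 snapshots 7424
    fwd += truesize                       # cpu 2: uncharge  -> 8192
    reclaimable_2 = fwd                   # cpu 2 snapshots 8192
    fwd -= reclaim_amount(reclaimable_1)  # cpu 1: -4096     -> 4096
    fwd -= reclaim_amount(reclaimable_2)  # cpu 2: -8192     -> -4096
    return fwd
```

Because both CPUs compute their reclaimable amount from a counter the other CPU is still modifying, more is reclaimed in total than was ever scheduled, and replay() ends at -4096 instead of 0.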
skbcloneandcharger() should not be called in tcpv6dorcv() when sk->skstate is TCPLISTEN; the charging happens later in tcpv6synrecvsock(). Fix the same issue in dccpv6dorcv().(CVE-2024-53124)
In the Linux kernel, the following vulnerability has been resolved:
KVM: VMX: Bury Intel PT virtualization (guest/host mode) behind CONFIG_BROKEN
Hide KVM's ptmode module param behind CONFIGBROKEN, i.e. disable support for virtualizing Intel PT via guest/host mode unless BROKEN=y. There are myriad bugs in the implementation, some of which are fatal to the guest, and others which put the stability and health of the host at risk.
For guest fatalities, the most glaring issue is that KVM fails to ensure tracing is disabled, and stays disabled prior to VM-Enter, which is necessary as hardware disallows loading (the guest's) RTIT_CTL if tracing is enabled (enforced via a VMX consistency check). Per the SDM:
If the logical processor is operating with Intel PT enabled (if IA32RTITCTL.TraceEn = 1) at the time of VM entry, the "load IA32RTITCTL" VM-entry control must be 0.
On the host side, KVM doesn't validate the guest CPUID configuration provided by userspace, and even worse, uses the guest configuration to decide what MSRs to save/load at VM-Enter and VM-Exit. E.g. configuring guest CPUID to enumerate more address ranges than are supported in hardware will result in KVM trying to passthrough, save, and load non-existent MSRs, which generates a variety of WARNs, ToPA ERRORs in the host, a potential deadlock, etc.(CVE-2024-53135)
In the Linux kernel, the following vulnerability has been resolved:
net/mlx5e: kTLS, Fix incorrect page refcounting
The kTLS tx handling code is using a mix of getpage() and pagerefinc() APIs to increment the page reference. But on the release path (mlx5ektlstxhandleresyncdumpcomp()), only putpage() is used.
This is an issue when using pages from large folios: the getpage() references are stored on the folio page while the pageref_inc() references are stored directly in the given page. On release the folio page will be dereferenced too many times.
This was found while doing kTLS testing with sendfile() + ZC when the served file was read from NFS on a kernel with NFS large folios support (commit 49b29a573da8 ("nfs: add support for large folios")).(CVE-2024-53138)
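The imbalance described in the kTLS entry above can be shown with a toy model. In the sketch below, the Page class and the three helper functions only borrow the kernel names for readability; they are simplified assumptions, not the kernel implementation. A reference taken through get_page() lands on the folio's head page, one taken through page_ref_inc() lands on the given page, and put_page() always drops from the head page.

```python
class Page:
    """Toy page: a tail page points at its folio's head page."""

    def __init__(self, head=None):
        self.head = head if head is not None else self
        self.refcount = 0


def get_page(page):
    page.head.refcount += 1      # reference is stored on the folio head


def page_ref_inc(page):
    page.refcount += 1           # reference is stored on the page itself


def put_page(page):
    page.head.refcount -= 1      # release always drops from the folio head


def demo():
    head = Page()
    head.refcount = 1            # baseline reference owned by someone else
    tail = Page(head=head)
    get_page(tail)               # tx path, first style of increment
    page_ref_inc(tail)           # tx path, second style of increment
    put_page(tail)               # release path uses put_page() for both
    put_page(tail)
    return head.refcount, tail.refcount
```

In this model demo() leaves the head page at 0, so the baseline reference has been dropped by mistake, while the tail page keeps a stray count of 1.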
In the Linux kernel, the following vulnerability has been resolved:
sctp: fix possible UAF in sctpv6available()
A lockdep report [1] with CONFIGPROVERCULIST=y hints that sctpv6available() is calling devgetbyindexrcu() and ipv6chk_addr() without holding rcu.
[1] ============================= WARNING: suspicious RCU usage 6.12.0-rc5-virtme #1216 Tainted: G W
net/core/dev.c:876 RCU-list traversed in non-reader section!!
other info that might help us debug this:
rcuscheduleractive = 2, debuglocks = 1 1 lock held by sctphello/31495: #0: ffff9f1ebbdb7418 (sklock-AFINET6){+.+.}-{0:0}, at: sctpbind (./arch/x86/include/asm/jumplabel.h:27 net/sctp/socket.c:315) sctp
stack backtrace: CPU: 7 UID: 0 PID: 31495 Comm: sctphello Tainted: G W 6.12.0-rc5-virtme #1216 Tainted: [W]=WARN Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 Call Trace: <TASK> dumpstacklvl (lib/dumpstack.c:123) lockdeprcususpicious (kernel/locking/lockdep.c:6822) devgetbyindexrcu (net/core/dev.c:876 (discriminator 7)) sctpv6available (net/sctp/ipv6.c:701) sctp sctpdobind (net/sctp/socket.c:400 (discriminator 1)) sctp sctpbind (net/sctp/socket.c:320) sctp inet6bindsk (net/ipv6/afinet6.c:465) ? securitysocketbind (security/security.c:4581 (discriminator 1)) _sysbind (net/socket.c:1848 net/socket.c:1869) ? douseraddrfault (./include/linux/rcupdate.h:347 ./include/linux/rcupdate.h:880 ./include/linux/mm.h:729 arch/x86/mm/fault.c:1340) ? douseraddrfault (./arch/x86/include/asm/preempt.h:84 (discriminator 13) ./include/linux/rcupdate.h:98 (discriminator 13) ./include/linux/rcupdate.h:882 (discriminator 13) ./include/linux/mm.h:729 (discriminator 13) arch/x86/mm/fault.c:1340 (discriminator 13)) _x64sysbind (net/socket.c:1877 (discriminator 1) net/socket.c:1875 (discriminator 1) net/socket.c:1875 (discriminator 1)) dosyscall64 (arch/x86/entry/common.c:52 (discriminator 1) arch/x86/entry/common.c:83 (discriminator 1)) entrySYSCALL64afterhwframe (arch/x86/entry/entry64.S:130) RIP: 0033:0x7f59b934a1e7 Code: 44 00 00 48 8b 15 39 8c 0c 00 f7 d8 64 89 02 b8 ff ff ff ff eb bd 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 b8 31 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 09 8c 0c 00 f7 d8 64 89 01 48
0: 44 00 00 add %r8b,(%rax) 3: 48 8b 15 39 8c 0c 00 mov 0xc8c39(%rip),%rdx # 0xc8c43 a: f7 d8 neg %eax c: 64 89 02 mov %eax,%fs:(%rdx) f: b8 ff ff ff ff mov $0xffffffff,%eax 14: eb bd jmp 0xffffffffffffffd3 16: 66 2e 0f 1f 84 00 00 cs nopw 0x0(%rax,%rax,1) 1d: 00 00 00 20: 0f 1f 00 nopl (%rax) 23: b8 31 00 00 00 mov $0x31,%eax 28: 0f 05 syscall 2a:* 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax <-- trapping instruction 30: 73 01 jae 0x33 32: c3 ret 33: 48 8b 0d 09 8c 0c 00 mov 0xc8c09(%rip),%rcx # 0xc8c43 3a: f7 d8 neg %eax 3c: 64 89 01 mov %eax,%fs:(%rcx) 3f: 48 rex.W
0: 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax 6: 73 01 jae 0x9 8: c3 ret 9: 48 8b 0d 09 8c 0c 00 mov 0xc8c09(%rip),%rcx # 0xc8c19 10: f7 d8 neg %eax 12: 64 89 01 mov %eax,%fs:(%rcx) 15: 48 rex.W RSP: 002b:00007ffe2d0ad398 EFLAGS: 00000202 ORIG_RAX: 0000000000000031 RAX: ffffffffffffffda RBX: 00007ffe2d0ad3d0 RCX: 00007f59b934a1e7 RDX: 000000000000001c RSI: 00007ffe2d0ad3d0 RDI: 0000000000000005 RBP: 0000000000000005 R08: 1999999999999999 R09: 0000000000000000 R10: 00007f59b9253298 R11: 000000000000 ---truncated---(CVE-2024-53139)
In the Linux kernel, the following vulnerability has been resolved:
netlink: terminate outstanding dump on socket close
Netlink supports iterative dumping of data. It provides the families the following ops:
- start - (optional) kicks off the dumping process
- dump - actual dump helper, keeps getting called until it returns 0
- done - (optional) pairs with .start, can be used for cleanup
The whole process is asynchronous and the repeated calls to .dump don't actually happen in a tight loop, but rather are triggered in response to recvmsg() on the socket.
This gives the user full control over the dump, but also means that the user can close the socket without getting to the end of the dump. To make sure .start is always paired with .done we check if there is an ongoing dump before freeing the socket, and if so call .done.
The complication is that sockets can get freed from BH and .done is allowed to sleep. So we use a workqueue to defer the call, when needed.
Unfortunately this does not work correctly. What we defer is not the cleanup but rather releasing a reference on the socket. We have no guarantee that we own the last reference, if someone else holds the socket they may release it in BH and we're back to square one.
The whole dance, however, appears to be unnecessary. Only the user can interact with dumps, so we can clean up when the socket is closed. And close always happens in process context. Some async code may still access the socket after close, queue notification skbs to it, etc., but no dumps can start, end or otherwise make progress.
Delete the workqueue and flush the dump state directly from the release handler. Note that further cleanup is possible in -next, for instance we now always call .done before releasing the main module reference, so dump doesn't have to take a reference of its own.(CVE-2024-53140)
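The dump lifecycle in the netlink entry above (start, repeated dump calls driven by recvmsg(), then done) and the idea of flushing an unfinished dump from the close path can be sketched as a small user-space model. DumpSocket and all of its members are hypothetical names for this illustration; the real netlink code is structured differently.

```python
class DumpSocket:
    """Toy model of a socket carrying one iterative dump."""

    def __init__(self, items, chunk=2):
        self.items = list(items)
        self.chunk = chunk
        self.pos = 0
        self.dumping = False
        self.events = []             # records the order of callbacks

    def start(self):
        self.dumping = True
        self.events.append("start")

    def recvmsg(self):
        """Each recvmsg() drives one dump call and returns the next chunk."""
        if not self.dumping:
            return []
        part = self.items[self.pos:self.pos + self.chunk]
        self.pos += len(part)
        self.events.append("dump")
        if not part:                 # dump returned nothing: it is finished
            self._done()
        return part

    def _done(self):
        self.dumping = False
        self.events.append("done")

    def close(self):
        # Close runs in process context, so an outstanding dump can be
        # finished right here instead of being deferred.
        if self.dumping:
            self._done()
        self.events.append("close")
```

A user who reads one chunk and then closes the socket still gets start paired with done: the recorded events are start, dump, done, close.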
In the Linux kernel, the following vulnerability has been resolved:
Bluetooth: hcievent: Align BR/EDR JUSTWORKS pairing with LE
This aligns the BR/EDR JUSTWORKS method with LE, which since 92516cd97fd4 ("Bluetooth: Always request for user confirmation for Just Works") always requests user confirmation with confirmhint set, since the likes of bluetoothd have a dedicated policy around the JUST_WORKS method (e.g. main.conf:JustWorksRepairing).
CVE: CVE-2024-8805(CVE-2024-53144)
In the Linux kernel, the following vulnerability has been resolved:
um: Fix potential integer overflow during physmem setup
This issue happens when the real map size is greater than LONG_MAX, which can be easily triggered on UML/i386.(CVE-2024-53145)
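To see why a size above LONG_MAX is a problem on a 32-bit target, the sketch below reinterprets a value as a 32-bit signed long in user space. The 3 GiB figure in the usage note is a hypothetical example, and as_signed_long() is a helper written for this illustration rather than anything taken from the UML code.

```python
LONG_BITS = 32                        # width of long on i386
LONG_MAX = 2 ** (LONG_BITS - 1) - 1   # 2147483647


def as_signed_long(value, bits=LONG_BITS):
    """Reinterpret `value` as a two's-complement signed integer of `bits` bits."""
    value &= (1 << bits) - 1
    if value >> (bits - 1):           # sign bit set: the value wraps negative
        return value - (1 << bits)
    return value
```

For a hypothetical 3 GiB map size, as_signed_long(3 * 2**30) yields a negative number, so any later comparison or arithmetic done on the signed value goes wrong.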
In the Linux kernel, the following vulnerability has been resolved:
block, bfq: fix bfqq uaf in bfqlimitdepth()
Setting a newly allocated bfqq in the bic and removing a freed bfqq from the bic are both protected by bfqd->lock. However, bfqlimitdepth() dereferences the bfqq from the bic without the lock, which can lead to a UAF if the io_context is shared by multiple tasks.
For example, testing bfq with io_uring can trigger the following UAF in v6.6:
================================================================== BUG: KASAN: slab-use-after-free in bfqq_group+0x15/0x50
Call Trace: <TASK> dumpstacklvl+0x47/0x80 printaddressdescription.constprop.0+0x66/0x300 printreport+0x3e/0x70 kasanreport+0xb4/0xf0 bfqqgroup+0x15/0x50 bfqqrequestoverlimit+0x130/0x9a0 bfqlimitdepth+0x1b5/0x480 _blkmqallocrequests+0x2b5/0xa00 blkmqgetnewrequests+0x11d/0x1d0 blkmqsubmitbio+0x286/0xb00 submitbionoacctnocheck+0x331/0x400 _blockwritefullfolio+0x3d0/0x640 writepagecb+0x3b/0xc0 writecachepages+0x254/0x6c0 writecachepages+0x254/0x6c0 dowritepages+0x192/0x310 filemapfdatawritewbc+0x95/0xc0 _filemapfdatawriterange+0x99/0xd0 filemapwriteandwaitrange.part.0+0x4d/0xa0 blkdevreaditer+0xef/0x1e0 ioread+0x1b6/0x8a0 ioissuesqe+0x87/0x300 iowqsubmitwork+0xeb/0x390 ioworkerhandlework+0x24d/0x550 iowqworker+0x27f/0x6c0 retfromfork_asm+0x1b/0x30 </TASK>
Allocated by task 808602: kasansavestack+0x1e/0x40 kasansettrack+0x21/0x30 _kasanslaballoc+0x83/0x90 kmemcacheallocnode+0x1b1/0x6d0 bfqgetqueue+0x138/0xfa0 bfqgetbfqqhandlesplit+0xe3/0x2c0 bfqinitrq+0x196/0xbb0 bfqinsertrequest.isra.0+0xb5/0x480 bfqinsertrequests+0x156/0x180 blkmqinsertrequest+0x15d/0x440 blkmqsubmitbio+0x8a4/0xb00 submitbionoacctnocheck+0x331/0x400 _blkdevdirectIOasync+0x2dd/0x330 blkdevwriteiter+0x39a/0x450 iowrite+0x22a/0x840 ioissuesqe+0x87/0x300 iowqsubmitwork+0xeb/0x390 ioworkerhandlework+0x24d/0x550 iowqworker+0x27f/0x6c0 retfromfork+0x2d/0x50 retfromfork_asm+0x1b/0x30
Freed by task 808589: kasansavestack+0x1e/0x40 kasansettrack+0x21/0x30 kasansavefreeinfo+0x27/0x40 _kasanslabfree+0x126/0x1b0 kmemcachefree+0x10c/0x750 bfqputqueue+0x2dd/0x770 _bfqinsertrequest.isra.0+0x155/0x7a0 bfqinsertrequest.isra.0+0x122/0x480 bfqinsertrequests+0x156/0x180 blkmqdispatchpluglist+0x528/0x7e0 blkmqflushpluglist.part.0+0xe5/0x590 _blkflushplug+0x3b/0x90 blkfinishplug+0x40/0x60 dowritepages+0x19d/0x310 filemapfdatawritewbc+0x95/0xc0 _filemapfdatawriterange+0x99/0xd0 filemapwriteandwaitrange.part.0+0x4d/0xa0 blkdevreaditer+0xef/0x1e0 ioread+0x1b6/0x8a0 ioissuesqe+0x87/0x300 iowqsubmitwork+0xeb/0x390 ioworkerhandlework+0x24d/0x550 iowqworker+0x27f/0x6c0 retfromfork+0x2d/0x50 retfromforkasm+0x1b/0x30
Fix the problem by protecting bictobfqq() with bfqd->lock.(CVE-2024-53166)
In the Linux kernel, the following vulnerability has been resolved:
drm/amd/display: Fix null check for pipectx->planestate in dcn20programpipe
This commit addresses a null pointer dereference issue in dcn20programpipe(). Previously, commit 8e4ed3cf1642 ("drm/amd/display: Add null check for pipectx->planestate in dcn20programpipe") partially fixed the null pointer dereference issue. However, in dcn20updatedchubpdpp(), the variable pipectx is passed in, and planestate is accessed again through pipectx. Multiple if statements directly call attributes of plane_state, leading to potential null pointer dereference issues. This patch adds necessary null checks to ensure stability.(CVE-2024-53201)
In the Linux kernel, the following vulnerability has been resolved:
tcp: Fix use-after-free of nreq in reqsktimerhandler().
The cited commit replaced inetcskreqskqueuedropandput() with _inetcskreqskqueuedrop() and reqskput() in reqsktimerhandler().
Then, oreq should be passed to reqsk_put() instead of req; otherwise use-after-free of nreq could happen when reqsk is migrated but the retry attempt failed (e.g. due to timeout).
Let's pass oreq to reqsk_put().(CVE-2024-53206)
In the Linux kernel, the following vulnerability has been resolved:
Bluetooth: MGMT: Fix possible deadlocks
This fixes possible deadlocks like the following caused by hcicmdsync_dequeue causing the destroy function to run:
INFO: task kworker/u19:0:143 blocked for more than 120 seconds. Tainted: G W O 6.8.0-2024-03-19-intel-next-iLS-24ww14 #1 "echo 0 > /proc/sys/kernel/hungtasktimeoutsecs" disables this message. task:kworker/u19:0 state:D stack:0 pid:143 tgid:143 ppid:2 flags:0x00004000 Workqueue: hci0 hcicmdsyncwork [bluetooth] Call Trace: <TASK> _schedule+0x374/0xaf0 schedule+0x3c/0xf0 schedulepreemptdisabled+0x1c/0x30 _mutexlock.constprop.0+0x3ef/0x7a0 _mutexlockslowpath+0x13/0x20 mutexlock+0x3c/0x50 mgmtsetconnectablecomplete+0xa4/0x150 [bluetooth] ? kfree+0x211/0x2a0 hcicmdsyncdequeue+0xae/0x130 [bluetooth] ? _pfxcmdcompletersp+0x10/0x10 [bluetooth] cmdcompletersp+0x26/0x80 [bluetooth] mgmtpendingforeach+0x4d/0x70 [bluetooth] _mgmtpoweroff+0x8d/0x180 [bluetooth] ? rawspinunlockirq+0x23/0x40 hcidevclosesync+0x445/0x5b0 [bluetooth] hcisetpoweredsync+0x149/0x250 [bluetooth] setpoweredsync+0x24/0x60 [bluetooth] hcicmdsyncwork+0x90/0x150 [bluetooth] processonework+0x13e/0x300 workerthread+0x2f7/0x420 ? _pfxworkerthread+0x10/0x10 kthread+0x107/0x140 ? _pfxkthread+0x10/0x10 retfromfork+0x3d/0x60 ? _pfxkthread+0x10/0x10 retfromforkasm+0x1b/0x30 </TASK>(CVE-2024-53207)
In the Linux kernel, the following vulnerability has been resolved:
bnxt_en: Fix receive ring space parameters when XDP is active
The MTU setting at the time an XDP multi-buffer is attached determines whether the aggregation ring will be used and the rxskbfunc handler. This is done in bnxtsetrxskbmode().
If the MTU is later changed, the aggregation ring setting may need to be changed and it may become out-of-sync with the settings initially done in bnxtsetrxskbmode(). This may result in random memory corruption and crashes as the HW may DMA data larger than the allocated buffer size, such as:
BUG: kernel NULL pointer dereference, address: 00000000000003c0 PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 17 PID: 0 Comm: swapper/17 Kdump: loaded Tainted: G S OE 6.1.0-226bf9805506 #1 Hardware name: Wiwynn Delta Lake PVT BZA.02601.0150/Delta Lake-Class1, BIOS F0E3A12 08/26/2021 RIP: 0010:bnxtrxpkt+0xe97/0x1ae0 [bnxten] Code: 8b 95 70 ff ff ff 4c 8b 9d 48 ff ff ff 66 41 89 87 b4 00 00 00 e9 0b f7 ff ff 0f b7 43 0a 49 8b 95 a8 04 00 00 25 ff 0f 00 00 <0f> b7 14 42 48 c1 e2 06 49 03 95 a0 04 00 00 0f b6 42 33f RSP: 0018:ffffa19f40cc0d18 EFLAGS: 00010202 RAX: 00000000000001e0 RBX: ffff8e2c805c6100 RCX: 00000000000007ff RDX: 0000000000000000 RSI: ffff8e2c271ab990 RDI: ffff8e2c84f12380 RBP: ffffa19f40cc0e48 R08: 000000000001000d R09: 974ea2fcddfa4cbf R10: 0000000000000000 R11: ffffa19f40cc0ff8 R12: ffff8e2c94b58980 R13: ffff8e2c952d6600 R14: 0000000000000016 R15: ffff8e2c271ab990 FS: 0000000000000000(0000) GS:ffff8e3b3f840000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000000003c0 CR3: 0000000e8580a004 CR4: 00000000007706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <IRQ> _bnxtpollwork+0x1c2/0x3e0 [bnxten]
To address the issue, we now call bnxtsetrxskbmode() within bnxtchangemtu() to properly set the AGG rings configuration and update rxskbfunc based on the new MTU value. Additionally, BNXTFLAGNOAGGRINGS is cleared at the beginning of bnxtsetrxskbmode() to make sure it gets set or cleared based on the current MTU.(CVE-2024-53209)
In the Linux kernel, the following vulnerability has been resolved:
clk: ralink: mtmips: fix clocks probe order in oldest ralink SoCs
Base clocks are the first to be probed and are real dependencies of the remaining fixed, factor and peripheral clocks. For the old ralink SoCs RT2880, RT305x and RT3883, 'xtal' must be defined first; otherwise, the fixed clocks are deferred until 'xtal' is probed and the following warning appears:
WARNING: CPU: 0 PID: 0 at drivers/clk/ralink/clk-mtmips.c:499 rt3883busrecalcrate+0x98/0x138 Modules linked in: CPU: 0 PID: 0 Comm: swapper Not tainted 6.6.43 #0 Stack : 805e58d0 00000000 00000004 8004f950 00000000 00000004 00000000 00000000 80669c54 80830000 80700000 805ae570 80670068 00000001 80669bf8 00000000 00000000 00000000 805ae570 80669b38 00000020 804db7dc 00000000 00000000 203a6d6d 80669b78 80669e48 70617773 00000000 805ae570 00000000 00000009 00000000 00000001 00000004 00000001 00000000 00000000 83fe43b0 00000000 ... Call Trace: [<800065d0>] showstack+0x64/0xf4 [<804bca14>] dumpstacklvl+0x38/0x60 [<800218ac>] _warn+0x94/0xe4 [<8002195c>] warnslowpathfmt+0x60/0x94 [<80259ff8>] rt3883busrecalcrate+0x98/0x138 [<80254530>] _clkregister+0x568/0x688 [<80254838>] ofclkhwregister+0x18/0x2c [<8070b910>] rt2880clkofclkinitdriver+0x18c/0x594 [<8070b628>] ofclkinit+0x1c0/0x23c [<806fc448>] plattimeinit+0x58/0x18c [<806fdaf0>] timeinit+0x10/0x6c [<806f9bc4>] startkernel+0x458/0x67c
---[ end trace 0000000000000000 ]---
When this driver was mainlined we could not find any active users of old ralink SoCs so we cannot perform any real tests for them. Now, one user of a Belkin f9k1109 version 1 device which uses RT3883 SoC appeared and reported some issues in openWRT: - https://github.com/openwrt/openwrt/issues/16054
Thus, define a 'rt2880xtalrecalc_rate()' that just returns the expected frequency of 40 MHz and use it for the old ralink SoCs to get a correct boot trace with no warnings and a working clock plan from the beginning.(CVE-2024-53223)
In the Linux kernel, the following vulnerability has been resolved:
Bluetooth: fix use-after-free in deviceforeach_child()
Syzbot has reported the following KASAN splat:
BUG: KASAN: slab-use-after-free in deviceforeach_child+0x18f/0x1a0 Read of size 8 at addr ffff88801f605308 by task kbnepd bnep0/4980
CPU: 0 UID: 0 PID: 4980 Comm: kbnepd bnep0 Not tainted 6.12.0-rc4-00161-gae90f6a6170d #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-2.fc40 04/01/2014 Call Trace: <TASK> dumpstacklvl+0x100/0x190 ? deviceforeachchild+0x18f/0x1a0 printreport+0x13a/0x4cb ? _virtaddrvalid+0x5e/0x590 ? _physaddr+0xc6/0x150 ? deviceforeachchild+0x18f/0x1a0 kasanreport+0xda/0x110 ? deviceforeachchild+0x18f/0x1a0 ? _pfxdevmemallocnoio+0x10/0x10 deviceforeachchild+0x18f/0x1a0 ? _pfxdeviceforeachchild+0x10/0x10 pmruntimesetmemallocnoio+0xf2/0x180 netdevunregisterkobject+0x1ed/0x270 unregisternetdevicemanynotify+0x123c/0x1d80 ? _mutextrylockcommon+0xde/0x250 ? _pfxunregisternetdevicemanynotify+0x10/0x10 ? tracecontentionend+0xe6/0x140 ? _mutexlock+0x4e7/0x8f0 ? _pfxlockacquire.part.0+0x10/0x10 ? rcuiswatching+0x12/0xc0 ? unregisternetdev+0x12/0x30 unregisternetdevicequeue+0x30d/0x3f0 ? _pfxunregisternetdevicequeue+0x10/0x10 ? _pfxdownwrite+0x10/0x10 unregisternetdev+0x1c/0x30 bnepsession+0x1fb3/0x2ab0 ? _pfxbnepsession+0x10/0x10 ? _pfxlockrelease+0x10/0x10 ? _pfxwokenwakefunction+0x10/0x10 ? _kthreadparkme+0x132/0x200 ? _pfxbnepsession+0x10/0x10 ? kthread+0x13a/0x370 ? _pfxbnepsession+0x10/0x10 kthread+0x2b7/0x370 ? _pfxkthread+0x10/0x10 retfromfork+0x48/0x80 ? _pfxkthread+0x10/0x10 retfromfork_asm+0x1a/0x30 </TASK>
Allocated by task 4974: kasansavestack+0x30/0x50 kasansavetrack+0x14/0x30 _kasankmalloc+0xaa/0xb0 _kmallocnoprof+0x1d1/0x440 hciallocdevpriv+0x1d/0x2820 _vhcicreatedevice+0xef/0x7d0 vhciwrite+0x2c7/0x480 vfswrite+0x6a0/0xfc0 ksyswrite+0x12f/0x260 dosyscall64+0xc7/0x250 entrySYSCALL64after_hwframe+0x77/0x7f
Freed by task 4979: kasansavestack+0x30/0x50 kasansavetrack+0x14/0x30 kasansavefreeinfo+0x3b/0x60 _kasanslabfree+0x4f/0x70 kfree+0x141/0x490 hcireleasedev+0x4d9/0x600 bthostrelease+0x6a/0xb0 devicerelease+0xa4/0x240 kobjectput+0x1ec/0x5a0 putdevice+0x1f/0x30 vhcirelease+0x81/0xf0 _fput+0x3f6/0xb30 taskworkrun+0x151/0x250 doexit+0xa79/0x2c30 dogroupexit+0xd5/0x2a0 getsignal+0x1fcd/0x2210 archdosignalorrestart+0x93/0x780 syscallexittousermode+0x140/0x290 dosyscall64+0xd4/0x250 entrySYSCALL64after_hwframe+0x77/0x7f
In 'hci_conn_del_sysfs()', 'device_unregister()' may be called when an underlying (kobject) reference counter is greater than 1. This means that reparenting (which happens when the device is actually freed) is delayed and, during that delay, the parent controller device (hciX) may be deleted. Since the latter may leave a dangling pointer to the freed parent, avoid that scenario by reparenting to NULL explicitly.(CVE-2024-53237)
In the Linux kernel, the following vulnerability has been resolved:
accel/ivpu: Fix WARN in ivpu_ipc_send_receive_internal()
Move pm_runtime_set_active() to ivpu_pm_init() so that when ivpu_ipc_send_receive_internal() is executed before ivpu_pm_enable(), it already has the correct runtime state, even if the last resume was not successful.(CVE-2024-54193)
In the Linux kernel, the following vulnerability has been resolved:
iio: adc: ad7923: Fix buffer overflow for tx_buf and ring_xfer
The AD7923 driver was updated to support devices with 8 channels, but the size of tx_buf and ring_xfer was not increased accordingly, leading to a potential buffer overflow in ad7923_update_scan_mode().(CVE-2024-56557)
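The overflow pattern described above can be sketched in userspace C. This is a minimal illustration, not the driver's code: `SCAN_MAX_CHANNELS`, `struct scan_state`, `build_scan_cmds()` and the command encoding are hypothetical names chosen for the example. The idea is that the buffer is sized from the largest supported channel count and every write is bounds-checked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: the largest channel count of any supported device. */
#define SCAN_MAX_CHANNELS 8

struct scan_state {
    /* Sized from the maximum channel count, not from an older 4-channel part. */
    uint16_t tx_buf[SCAN_MAX_CHANNELS];
};

/*
 * Fill tx_buf with one command word per channel set in mask.
 * Returns the number of words written, or -1 if the mask names more
 * channels than the buffer can hold.
 */
static int build_scan_cmds(struct scan_state *st, uint32_t mask)
{
    size_t len = 0;

    for (unsigned int ch = 0; ch < 32; ch++) {
        if (!(mask & (UINT32_C(1) << ch)))
            continue;
        if (len >= SCAN_MAX_CHANNELS)
            return -1; /* would write past the end of tx_buf */
        st->tx_buf[len++] = (uint16_t)(ch << 6); /* illustrative encoding only */
    }
    return (int)len;
}
```

With this shape, enabling a ninth channel is rejected instead of silently writing past the array.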
In the Linux kernel, the following vulnerability has been resolved:
ad7780: fix division by zero in ad7780_write_raw()
In ad7780_write_raw(), val2 can be zero, which might lead to a division by zero error in DIV_ROUND_CLOSEST(). ad7780_write_raw() is based on iio_info's write_raw. While val is explicitly documented as possibly being zero (in read mode), val2 is not specified to be non-zero.(CVE-2024-56567)
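A minimal userspace sketch of guarding a rounded division against a zero divisor, assuming a hypothetical handler `scale_to_gain()`. The helper `div_round_closest()` only mimics the spirit of the kernel macro for non-negative operands, and rejecting negative values as well as zero is a choice made for this example.

```c
#include <assert.h>
#include <errno.h>

/* Rounded division for non-negative operands; d must be non-zero. */
static unsigned int div_round_closest(unsigned int x, unsigned int d)
{
    return (x + d / 2) / d;
}

/*
 * Hypothetical write handler: derive a gain from a caller-supplied val2.
 * The divisor is validated before it is used, so a zero (or negative)
 * val2 is reported as an error instead of faulting.
 */
static int scale_to_gain(unsigned int full_scale, int val2, unsigned int *gain)
{
    if (val2 <= 0)
        return -EINVAL;

    *gain = div_round_closest(full_scale, (unsigned int)val2);
    return 0;
}
```

The check sits directly in front of the division, which keeps the validation and the operation it protects together.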
In the Linux kernel, the following vulnerability has been resolved:
scsi: hisi_sas: Create all dump files during debugfs initialization
For the current debugfs of hisi_sas, after the user triggers a dump, the driver allocates memory to save the register information and creates debugfs files to display the saved information. In this process, the debugfs files are created after each dump.
Therefore, when a dump is triggered while the driver is being unbound, the following hang occurs:
[67840.853907] Unable to handle kernel NULL pointer dereference at virtual address 00000000000000a0 [67840.862947] Mem abort info: [67840.865855] ESR = 0x0000000096000004 [67840.869713] EC = 0x25: DABT (current EL), IL = 32 bits [67840.875125] SET = 0, FnV = 0 [67840.878291] EA = 0, S1PTW = 0 [67840.881545] FSC = 0x04: level 0 translation fault [67840.886528] Data abort info: [67840.889524] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000 [67840.895117] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 [67840.900284] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [67840.905709] user pgtable: 4k pages, 48-bit VAs, pgdp=0000002803a1f000 [67840.912263] [00000000000000a0] pgd=0000000000000000, p4d=0000000000000000 [67840.919177] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP [67840.996435] pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [67841.003628] pc : downwrite+0x30/0x98 [67841.007546] lr : startcreating.part.0+0x60/0x198 [67841.012495] sp : ffff8000b979ba20 [67841.016046] x29: ffff8000b979ba20 x28: 0000000000000010 x27: 0000000000024b40 [67841.023412] x26: 0000000000000012 x25: ffff20202b355ae8 x24: ffff20202b35a8c8 [67841.030779] x23: ffffa36877928208 x22: ffffa368b4972240 x21: ffff8000b979bb18 [67841.038147] x20: ffff00281dc1e3c0 x19: fffffffffffffffe x18: 0000000000000020 [67841.045515] x17: 0000000000000000 x16: ffffa368b128a530 x15: ffffffffffffffff [67841.052888] x14: ffff8000b979bc18 x13: ffffffffffffffff x12: ffff8000b979bb18 [67841.060263] x11: 0000000000000000 x10: 0000000000000000 x9 : ffffa368b1289b18 [67841.067640] x8 : 0000000000000012 x7 : 0000000000000000 x6 : 00000000000003a9 [67841.075014] x5 : 0000000000000000 x4 : ffff002818c5cb00 x3 : 0000000000000001 [67841.082388] x2 : 0000000000000000 x1 : ffff002818c5cb00 x0 : 00000000000000a0 [67841.089759] Call trace: [67841.092456] downwrite+0x30/0x98 [67841.096017] startcreating.part.0+0x60/0x198 [67841.100613] debugfscreatedir+0x48/0x1f8 [67841.104950] debugfscreatefilesv3hw+0x88/0x348 
[hisisasv3hw] [67841.111447] debugfssnapshotregsv3hw+0x708/0x798 [hisisasv3hw] [67841.118111] debugfstriggerdumpv3hwwrite+0x9c/0x120 [hisisasv3hw] [67841.125115] fullproxywrite+0x68/0xc8 [67841.129175] vfswrite+0xd8/0x3f0 [67841.132708] ksyswrite+0x70/0x108 [67841.136317] _arm64syswrite+0x24/0x38 [67841.140440] invokesyscall+0x50/0x128 [67841.144385] el0svccommon.constprop.0+0xc8/0xf0 [67841.149273] doel0svc+0x24/0x38 [67841.152773] el0svc+0x38/0xd8 [67841.156009] el0t64synchandler+0xc0/0xc8 [67841.160361] el0t64sync+0x1a4/0x1a8 [67841.164189] Code: b9000882 d2800002 d2800023 f9800011 (c85ffc05) [67841.170443] ---[ end trace 0000000000000000 ]---
To fix this issue, create all directories and files during debugfs initialization. In this way, the driver only needs to allocate memory space to save information each time the user triggers dumping.(CVE-2024-56588)
In the Linux kernel, the following vulnerability has been resolved:
scsi: hisi_sas: Add cond_resched() for no forced preemption model
For a kernel built with the no forced preemption model, in a scenario where the expander is connected to 12 high-performance SAS SSDs, the following call trace may occur:
[ 214.409199][ C240] watchdog: BUG: soft lockup - CPU#240 stuck for 22s! [irq/149-hisisa:3211] [ 214.568533][ C240] pstate: 60400009 (nZCv daif +PAN -UAO -TCO BTYPE=--) [ 214.575224][ C240] pc : fputmany+0x8c/0xdc [ 214.579480][ C240] lr : fput+0x1c/0xf0 [ 214.583302][ C240] sp : ffff80002de2b900 [ 214.587298][ C240] x29: ffff80002de2b900 x28: ffff1082aa412000 [ 214.593291][ C240] x27: ffff3062a0348c08 x26: ffff80003a9f6000 [ 214.599284][ C240] x25: ffff1062bbac5c40 x24: 0000000000001000 [ 214.605277][ C240] x23: 000000000000000a x22: 0000000000000001 [ 214.611270][ C240] x21: 0000000000001000 x20: 0000000000000000 [ 214.617262][ C240] x19: ffff3062a41ae580 x18: 0000000000010000 [ 214.623255][ C240] x17: 0000000000000001 x16: ffffdb3a6efe5fc0 [ 214.629248][ C240] x15: ffffffffffffffff x14: 0000000003ffffff [ 214.635241][ C240] x13: 000000000000ffff x12: 000000000000029c [ 214.641234][ C240] x11: 0000000000000006 x10: ffff80003a9f7fd0 [ 214.647226][ C240] x9 : ffffdb3a6f0482fc x8 : 0000000000000001 [ 214.653219][ C240] x7 : 0000000000000002 x6 : 0000000000000080 [ 214.659212][ C240] x5 : ffff55480ee9b000 x4 : fffffde7f94c6554 [ 214.665205][ C240] x3 : 0000000000000002 x2 : 0000000000000020 [ 214.671198][ C240] x1 : 0000000000000021 x0 : ffff3062a41ae5b8 [ 214.677191][ C240] Call trace: [ 214.680320][ C240] fputmany+0x8c/0xdc [ 214.684230][ C240] fput+0x1c/0xf0 [ 214.687707][ C240] aiocompleterw+0xd8/0x1fc [ 214.692225][ C240] blkdevbioendio+0x98/0x140 [ 214.696917][ C240] bioendio+0x160/0x1bc [ 214.701001][ C240] blkupdaterequest+0x1c8/0x3bc [ 214.705867][ C240] scsiendrequest+0x3c/0x1f0 [ 214.710471][ C240] scsiiocompletion+0x7c/0x1a0 [ 214.715249][ C240] scsifinishcommand+0x104/0x140 [ 214.720200][ C240] scsisoftirqdone+0x90/0x180 [ 214.724892][ C240] blkmqcompleterequest+0x5c/0x70 [ 214.730016][ C240] scsimqdone+0x48/0xac [ 214.734194][ C240] sasscsitaskdone+0xbc/0x16c [libsas] [ 214.739758][ C240] slotcompletev3hw+0x260/0x760 [hisisasv3hw] [ 214.746185][ C240] 
cqthreadv3hw+0xbc/0x190 [hisisasv3hw] [ 214.752179][ C240] irqthreadfn+0x34/0xa4 [ 214.756435][ C240] irqthread+0xc4/0x130 [ 214.760520][ C240] kthread+0x108/0x13c [ 214.764430][ C240] retfromfork+0x10/0x18
This is because in the hisi_sas driver, both the hardware interrupt handler and the interrupt thread are executed on the same CPU. In the performance test scenario, irq_wait_for_interrupt() always returns 0 if lots of interrupts occur, so the CPU is continuously consumed. As a result, the CPU cannot run the watchdog thread. When the watchdog timeout is exceeded, the call trace occurs.
To fix it, add cond_resched() to execute the watchdog thread.(CVE-2024-56589)
In the Linux kernel, the following vulnerability has been resolved:
Bluetooth: hci_core: Fix not checking skb length on hci_acl_data_packet
This fixes not checking whether the skb really contains an ACL header; otherwise the code may attempt to access some uninitialized/invalid memory past the valid skb->data.(CVE-2024-56590)
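The length check described above can be sketched as follows. This is a userspace illustration under assumptions: `parse_acl_hdr()` and `struct acl_hdr` are hypothetical names, and the header is modeled as two 16-bit little-endian fields (handle and data length).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative header size: 16-bit handle plus 16-bit data length. */
#define ACL_HDR_SIZE 4

struct acl_hdr {
    uint16_t handle;
    uint16_t dlen;
};

/*
 * Parse the header only if the buffer really contains one.
 * Returns 0 and fills hdr on success, -1 if the buffer is too short.
 */
static int parse_acl_hdr(const uint8_t *buf, size_t len, struct acl_hdr *hdr)
{
    if (len < ACL_HDR_SIZE)
        return -1; /* too short: never read past the valid data */

    hdr->handle = (uint16_t)(buf[0] | (buf[1] << 8));
    hdr->dlen = (uint16_t)(buf[2] | (buf[3] << 8));
    return 0;
}
```

A truncated packet is dropped before any header field is read, so no bytes beyond the received data are touched.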
In the Linux kernel, the following vulnerability has been resolved:
xsk: fix OOB map writes when deleting elements
Jordy says:
" In the xsk_map_delete_elem function an unsigned integer (map->max_entries) is compared with a user-controlled signed integer (k). Due to implicit type conversion, a large unsigned value for map->max_entries can bypass the intended bounds check:
if (k >= map->max_entries)
return -EINVAL;
This allows k to hold a negative value (between -2147483648 and -2), which is then used as an array index in m->xsk_map[k], which results in an out-of-bounds access.
spin_lock_bh(&m->lock);
map_entry = &m->xsk_map[k]; // Out-of-bounds map_entry
old_xs = unrcu_pointer(xchg(map_entry, NULL)); // Oob write
if (old_xs)
xsk_map_sock_delete(old_xs, map_entry);
spin_unlock_bh(&m->lock);
The xchg operation can then be used to cause an out-of-bounds write. Moreover, the invalid map_entry passed to xsk_map_sock_delete can lead to further memory corruption. "
It indeed results in the following splat:
[76612.897343] BUG: unable to handle page fault for address: ffffc8fc2e461108 [76612.904330] #PF: supervisor write access in kernel mode [76612.909639] #PF: errorcode(0x0002) - not-present page [76612.914855] PGD 0 P4D 0 [76612.917431] Oops: Oops: 0002 [#1] PREEMPT SMP [76612.921859] CPU: 11 UID: 0 PID: 10318 Comm: a.out Not tainted 6.12.0-rc1+ #470 [76612.929189] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0008.031920191559 03/19/2019 [76612.939781] RIP: 0010:xskmapdeleteelem+0x2d/0x60 [76612.944738] Code: 00 00 41 54 55 53 48 63 2e 3b 6f 24 73 38 4c 8d a7 f8 00 00 00 48 89 fb 4c 89 e7 e8 2d bf 05 00 48 8d b4 eb 00 01 00 00 31 ff <48> 87 3e 48 85 ff 74 05 e8 16 ff ff ff 4c 89 e7 e8 3e bc 05 00 31 [76612.963774] RSP: 0018:ffffc9002e407df8 EFLAGS: 00010246 [76612.969079] RAX: 0000000000000000 RBX: ffffc9002e461000 RCX: 0000000000000000 [76612.976323] RDX: 0000000000000001 RSI: ffffc8fc2e461108 RDI: 0000000000000000 [76612.983569] RBP: ffffffff80000001 R08: 0000000000000000 R09: 0000000000000007 [76612.990812] R10: ffffc9002e407e18 R11: ffff888108a38858 R12: ffffc9002e4610f8 [76612.998060] R13: ffff888108a38858 R14: 00007ffd1ae0ac78 R15: ffffc9002e4610c0 [76613.005303] FS: 00007f80b6f59740(0000) GS:ffff8897e0ec0000(0000) knlGS:0000000000000000 [76613.013517] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [76613.019349] CR2: ffffc8fc2e461108 CR3: 000000011e3ef001 CR4: 00000000007726f0 [76613.026595] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [76613.033841] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [76613.041086] PKRU: 55555554 [76613.043842] Call Trace: [76613.046331] <TASK> [76613.048468] ? _die+0x20/0x60 [76613.051581] ? pagefaultoops+0x15a/0x450 [76613.055747] ? searchextable+0x22/0x30 [76613.059649] ? searchbpfextables+0x5f/0x80 [76613.063988] ? excpagefault+0xa9/0x140 [76613.067975] ? asmexcpagefault+0x22/0x30 [76613.072229] ? xskmapdeleteelem+0x2d/0x60 [76613.076573] ? 
xskmapdeleteelem+0x23/0x60 [76613.080914] _sysbpf+0x19b7/0x23c0 [76613.084555] _x64sysbpf+0x1a/0x20 [76613.088194] dosyscall64+0x37/0xb0 [76613.091832] entrySYSCALL64afterhwframe+0x4b/0x53 [76613.096962] RIP: 0033:0x7f80b6d1e88d [76613.100592] Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 73 b5 0f 00 f7 d8 64 89 01 48 [76613.119631] RSP: 002b:00007ffd1ae0ac68 EFLAGS: 00000206 ORIG_RAX: 0000000000000141 [76613.131330] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f80b6d1e88d [76613.142632] RDX: 0000000000000098 RSI: 00007ffd1ae0ad20 RDI: 0000000000000003 [76613.153967] RBP: 00007ffd1ae0adc0 R08: 0000000000000000 R09: 0000000000000000 [76613.166030] R10: 00007f80b6f77040 R11: 0000000000000206 R12: 00007ffd1ae0aed8 [76613.177130] R13: 000055ddf42ce1e9 R14: 000055ddf42d0d98 R15: 00 ---truncated---(CVE-2024-56614)
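The signed/unsigned comparison issue quoted in this entry can be reproduced in a few lines of userspace C. This is a sketch, not the kernel code: `check_signed_key()` and `check_unsigned_key()` are hypothetical helpers, and the cast in the first one spells out what the implicit conversion does on platforms with a 32-bit int.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Buggy pattern: the key is kept as a signed int. For the comparison it is
 * converted to unsigned, so -2 is compared as 0xFFFFFFFE and passes when
 * max_entries is large enough. The caller then still indexes with the
 * negative value.
 */
static int check_signed_key(int k, uint32_t max_entries)
{
    if ((uint32_t)k >= max_entries) /* what the implicit conversion does */
        return -1;
    return 0; /* accepted, although k may be negative */
}

/* Safer pattern: keep the key unsigned from the start, so an accepted
 * key can never turn into a negative array index. */
static int check_unsigned_key(uint32_t k, uint32_t max_entries)
{
    if (k >= max_entries)
        return -1;
    return 0;
}
```

With `max_entries` set to `UINT32_MAX`, `check_signed_key(-2, ...)` accepts the key even though using it as an index would address memory before the array.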
In the Linux kernel, the following vulnerability has been resolved:
scsi: qla2xxx: Fix use after free on unload
A system crash is observed with a stack trace warning of use after free. There are 2 signals that tell dpc_thread to terminate (the UNLOADING flag and kthread_stop).
If dpc_thread happens to be running when the UNLOADING flag is set and sees the flag, dpc_thread exits and cleans up after itself. When kthread_stop is then called for final cleanup, this causes a use after free.
Remove the UNLOADING signal for terminating dpc_thread. Use kthread_stop as the main signal to exit dpc_thread.
[596663.812935] kernel BUG at mm/slub.c:294! [596663.812950] invalid opcode: 0000 [#1] SMP PTI [596663.812957] CPU: 13 PID: 1475935 Comm: rmmod Kdump: loaded Tainted: G IOE --------- - - 4.18.0-240.el8.x8664 #1 [596663.812960] Hardware name: HP ProLiant DL380p Gen8, BIOS P70 08/20/2012 [596663.812974] RIP: 0010:slabfree+0x17d/0x360
... [596663.813008] Call Trace: [596663.813022] ? _dentrykill+0x121/0x170 [596663.813030] ? condresched+0x15/0x30 [596663.813034] ? condresched+0x15/0x30 [596663.813039] ? waitforcompletion+0x35/0x190 [596663.813048] ? trytowakeup+0x63/0x540 [596663.813055] freetask+0x5a/0x60 [596663.813061] kthreadstop+0xf3/0x100 [596663.813103] qla2x00remove_one+0x284/0x440 qla2xxx
In the Linux kernel, the following vulnerability has been resolved:
net/smc: fix LGR and link use-after-free issue
We encountered a LGR/link use-after-free issue, which manifested as the LGR/link refcnt reaching 0 early and entering the clear process, making resource access unsafe.
refcountt: addition on 0; use-after-free. WARNING: CPU: 14 PID: 107447 at lib/refcount.c:25 refcountwarnsaturate+0x9c/0x140 Workqueue: events smclgrterminatework [smc] Call trace: refcountwarnsaturate+0x9c/0x140 _smclgrterminate.part.45+0x2a8/0x370 [smc] smclgrterminatework+0x28/0x30 [smc] processonework+0x1b8/0x420 worker_thread+0x158/0x510 kthread+0x114/0x118
or
refcountt: underflow; use-after-free. WARNING: CPU: 6 PID: 93140 at lib/refcount.c:28 refcountwarnsaturate+0xf0/0x140 Workqueue: smchswq smclistenwork [smc] Call trace: refcountwarnsaturate+0xf0/0x140 smcrlinkput+0x1cc/0x1d8 [smc] smcconnfree+0x110/0x1b0 [smc] smcconnabort+0x50/0x60 [smc] smclistenfinddevice+0x75c/0x790 [smc] smclistenwork+0x368/0x8a0 [smc] processonework+0x1b8/0x420 worker_thread+0x158/0x510 kthread+0x114/0x118
It is caused by repeated release of the LGR/link refcnt. One suspect is that smc_conn_free() is called repeatedly because some smc_conn_free() calls from the server listening path are not protected by the sock lock.
e.g.
lock_sock(sk)               | smc_conn_abort
smc_conn_free               | \- smc_conn_free
\- smcr_link_put            |    \- smcr_link_put (duplicated)
release_sock(sk)
So add sock lock protection in the smc_listen_work() path, making it exclusive with other connection operations.(CVE-2024-56640)
In the Linux kernel, the following vulnerability has been resolved:
net/smc: initialize close_work early to avoid warning
We encountered a warning that close_work was canceled before initialization.
WARNING: CPU: 7 PID: 111103 at kernel/workqueue.c:3047 flushwork+0x19e/0x1b0 Workqueue: events smclgrterminatework [smc] RIP: 0010:flushwork+0x19e/0x1b0 Call Trace: ? _wakeupcommon+0x7a/0x190 ? workbusy+0x80/0x80 _cancelworktimer+0xe3/0x160 smcclosecancelwork+0x1a/0x70 [smc] smccloseactiveabort+0x207/0x360 [smc] _smclgrterminate.part.38+0xc8/0x180 [smc] processonework+0x19e/0x340 workerthread+0x30/0x370 ? processonework+0x340/0x340 kthread+0x117/0x130 ? _kthreadcancelwork+0x50/0x50 retfrom_fork+0x22/0x30
This is because when smc_close_cancel_work is triggered, e.g. the RDMA driver is rmmod and the LGR is terminated, the conn->close_work is flushed before initialization, resulting in WARN_ON(!work->func).
                                | smc_conn_create
                                | \- smc_lgr_register_conn
for conn in lgr->conns_all      |
\- smc_conn_kill                |
   \- smc_close_active_abort    |
      \- smc_close_cancel_work  |
         \- cancel_work_sync    |
            \- __flush_work     |
                 (close_work)   |
                                | smc_close_init
                                | \- INIT_WORK(&close_work)
So fix this by initializing close_work before establishing the connection.(CVE-2024-56641)
In the Linux kernel, the following vulnerability has been resolved:
Bluetooth: btmtk: avoid UAF in btmtk_process_coredump
hci_devcd_append may lead to the release of the skb, so the skb cannot be accessed once it is called.
================================================================== BUG: KASAN: slab-use-after-free in btmtkprocesscoredump+0x2a7/0x2d0 [btmtk] Read of size 4 at addr ffff888033cfabb0 by task kworker/0:3/82
CPU: 0 PID: 82 Comm: kworker/0:3 Tainted: G U 6.6.40-lockdep-03464-g1d8b4eb3060e #1 b0b3c1cc0c842735643fb411799d97921d1f688c Hardware name: Google YaviksUfs/YaviksUfs, BIOS GoogleYaviksUfs.15217.552.0 05/07/2024 Workqueue: events btusbrxwork [btusb] Call Trace: <TASK> dumpstacklvl+0xfd/0x150 printreport+0x131/0x780 kasanreport+0x177/0x1c0 btmtkprocesscoredump+0x2a7/0x2d0 [btmtk 03edd567dd71a65958807c95a65db31d433e1d01] btusbrecvaclmtk+0x11c/0x1a0 [btusb 675430d1e87c4f24d0c1f80efe600757a0f32bec] btusbrxwork+0x9e/0xe0 [btusb 675430d1e87c4f24d0c1f80efe600757a0f32bec] workerthread+0xe44/0x2cc0 kthread+0x2ff/0x3a0 retfromfork+0x51/0x80 retfromfork_asm+0x1b/0x30 </TASK>
Allocated by task 82: stacktracesave+0xdc/0x190 kasansettrack+0x4e/0x80 _kasanslaballoc+0x4e/0x60 kmemcachealloc+0x19f/0x360 skbclone+0x132/0xf70 btusbrecvaclmtk+0x104/0x1a0 [btusb] btusbrxwork+0x9e/0xe0 [btusb] workerthread+0xe44/0x2cc0 kthread+0x2ff/0x3a0 retfromfork+0x51/0x80 retfromfork_asm+0x1b/0x30
Freed by task 1733: stacktracesave+0xdc/0x190 kasansettrack+0x4e/0x80 kasansavefreeinfo+0x28/0xb0 __kasanslabfree+0xfd/0x170 kmemcachefree+0x183/0x3f0 hcidevcdrx+0x91a/0x2060 [bluetooth] workerthread+0xe44/0x2cc0 kthread+0x2ff/0x3a0 retfromfork+0x51/0x80 retfromfork_asm+0x1b/0x30
The buggy address belongs to the object at ffff888033cfab40 which belongs to the cache skbuffheadcache of size 232 The buggy address is located 112 bytes inside of freed 232-byte region [ffff888033cfab40, ffff888033cfac28)
The buggy address belongs to the physical page: page:00000000a174ba93 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x33cfa head:00000000a174ba93 order:1 entiremapcount:0 nrpagesmapped:0 pincount:0 anon flags: 0x4000000000000840(slab|head|zone=1) pagetype: 0xffffffff() raw: 4000000000000840 ffff888100848a00 0000000000000000 0000000000000001 raw: 0000000000000000 0000000080190019 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected
Memory state around the buggy address: ffff888033cfaa80: fb fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc ffff888033cfab00: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb >ffff888033cfab80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff888033cfac00: fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc fc
Check if we need to call hci_devcd_complete before calling hci_devcd_append. That requires that we check data->cd_info.cnt >= MTK_COREDUMP_NUM instead of data->cd_info.cnt > MTK_COREDUMP_NUM, as we increment data->cd_info.cnt only once the call to hci_devcd_append succeeds.(CVE-2024-56653)
In the Linux kernel, the following vulnerability has been resolved:
powerpc/fadump: Move fadump_cma_init to setup_arch() after initmem_init()
During early init CMA_MIN_ALIGNMENT_BYTES can be PAGE_SIZE, since pageblock_order is still zero and it gets initialized later during initmem_init() e.g. setup_arch() -> initmem_init() -> sparse_init() -> set_pageblock_order()
One such use case where this causes an issue is: early_setup() -> early_init_devtree() -> fadump_reserve_mem() -> fadump_cma_init()
This causes the CMA memory alignment check to be bypassed in cma_init_reserved_mem(). Then later cma_activate_area() can hit a VM_BUG_ON_PAGE(pfn & ((1 << order) - 1)) if the reserved memory area was not pageblock_order aligned.
Fix it by moving fadump_cma_init() after initmem_init(), where other such CMA reservations also get called.
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x10010 flags: 0x13ffff800000000(node=1|zone=0|lastcpupid=0x7ffff) CMA raw: 013ffff800000000 5deadbeef0000100 5deadbeef0000122 0000000000000000 raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: VMBUGONPAGE(pfn & ((1 << order) - 1)) ------------[ cut here ]------------ kernel BUG at mm/pagealloc.c:778!
Call Trace: _freeonepage+0x57c/0x7b0 (unreliable) freepcppagesbulk+0x1a8/0x2c8 freeunrefpagecommit+0x3d4/0x4e4 freeunrefpage+0x458/0x6d0 initcmareservedpageblock+0x114/0x198 cmainitreservedareas+0x270/0x3e0 dooneinitcall+0x80/0x2f8 kernelinitfreeable+0x33c/0x530 kernelinit+0x34/0x26c retfromkerneluser_thread+0x14/0x1c(CVE-2024-56677)
In the Linux kernel, the following vulnerability has been resolved:
usb: musb: Fix hardware lockup on first Rx endpoint request
There is a possibility that a request's callback could be invoked from usb_ep_queue() (call trace below, supplemented with missing calls):
req->complete from usb_gadget_giveback_request (drivers/usb/gadget/udc/core.c:999)
usb_gadget_giveback_request from musb_g_giveback (drivers/usb/musb/musb_gadget.c:147)
musb_g_giveback from rxstate (drivers/usb/musb/musb_gadget.c:784)
rxstate from musb_ep_restart (drivers/usb/musb/musb_gadget.c:1169)
musb_ep_restart from musb_ep_restart_resume_work (drivers/usb/musb/musb_gadget.c:1176)
musb_ep_restart_resume_work from musb_queue_resume_work (drivers/usb/musb/musb_core.c:2279)
musb_queue_resume_work from musb_gadget_queue (drivers/usb/musb/musb_gadget.c:1241)
musb_gadget_queue from usb_ep_queue (drivers/usb/gadget/udc/core.c:300)
According to the docstring of usb_ep_queue(), this should not happen:
"Note that @req's ->complete() callback must never be called from within usb_ep_queue() as that can create deadlock situations."
In fact, a hardware lockup might occur in the following sequence:
For this scenario to occur, it is only necessary for IRQs to be enabled at some point during the complete callback. This happens with the USB Ethernet gadget, whose rx_complete() callback calls netif_rx(). If called in the task context, netif_rx() disables the bottom halves (BHs). When the BHs are re-enabled, IRQs are also enabled to allow soft IRQs to be processed. The gadget itself is initialized at module load (or at boot if built-in), but the first request is enqueued when the network interface is brought up, triggering rx_complete() in the task context via ioctl(). If a packet arrives while the interface is down, it can prevent the interface from receiving any further packets from the USB host.
The situation is quite complicated with many parties involved. This particular issue can be resolved in several possible ways:
In the Linux kernel, the following vulnerability has been resolved:
sunrpc: clear XPRT_SOCK_UPD_TIMEOUT when reset transport
Since transport->sock has been set to NULL during reset transport, XPRT_SOCK_UPD_TIMEOUT also needs to be cleared. Otherwise, xs_tcp_set_socket_timeouts() may be triggered in xs_tcp_send_request() and dereference the transport->sock that has been set to NULL.(CVE-2024-56688)
In the Linux kernel, the following vulnerability has been resolved:
powerpc/pseries: Fix dtl_access_lock to be a rw_semaphore
The dtl_access_lock needs to be a rw_semaphore, a sleeping lock, because the code calls kmalloc() while holding it, which can sleep:
# echo 1 > /proc/powerpc/vcpudispatchstats BUG: sleeping function called from invalid context at include/linux/sched/mm.h:337 inatomic(): 1, irqsdisabled(): 0, nonblock: 0, pid: 199, name: sh preemptcount: 1, expected: 0 3 locks held by sh/199: #0: c00000000a0743f8 (sbwriters#3){.+.+}-{0:0}, at: vfswrite+0x324/0x438 #1: c0000000028c7058 (dtlenablemutex){+.+.}-{3:3}, at: vcpudispatchstatswrite+0xd4/0x5f4 #2: c0000000028c70b8 (dtlaccesslock){+.+.}-{2:2}, at: vcpudispatchstatswrite+0x220/0x5f4 CPU: 0 PID: 199 Comm: sh Not tainted 6.10.0-rc4 #152 Hardware name: IBM pSeries (emulated by qemu) POWER9 (raw) 0x4e1202 0xf000005 of:SLOF,HEAD hv:linux,kvm pSeries Call Trace: dumpstacklvl+0x130/0x148 (unreliable) _mightresched+0x174/0x410 kmemcacheallocnoprof+0x340/0x3d0 allocdtlbuffers+0x124/0x1ac vcpudispatchstatswrite+0x2a8/0x5f4 procregwrite+0xf4/0x150 vfswrite+0xfc/0x438 ksyswrite+0x88/0x148 systemcallexception+0x1c4/0x5a0 systemcallcommon+0xf4/0x258(CVE-2024-56701)
In the Linux kernel, the following vulnerability has been resolved:
net/smc: protect link down work from execute after lgr freed
Link down work may be scheduled before the lgr is freed but execute after the lgr is freed, which may result in a crash. So it is necessary to hold a reference before scheduling link down work, and to put the reference after the work is executed or canceled.
The relevant crash call stack as follows: listdel corruption. prev->next should be ffffb638c9c0fe20, but was 0000000000000000 ------------[ cut here ]------------ kernel BUG at lib/listdebug.c:51! invalid opcode: 0000 [#1] SMP NOPTI CPU: 6 PID: 978112 Comm: kworker/6:119 Kdump: loaded Tainted: G #1 Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 2221b89 04/01/2014 Workqueue: events smclinkdownwork [smc] RIP: 0010:listdelentryvalid.cold+0x31/0x47 RSP: 0018:ffffb638c9c0fdd8 EFLAGS: 00010086 RAX: 0000000000000054 RBX: ffff942fb75e5128 RCX: 0000000000000000 RDX: ffff943520930aa0 RSI: ffff94352091fc80 RDI: ffff94352091fc80 RBP: 0000000000000000 R08: 0000000000000000 R09: ffffb638c9c0fc38 R10: ffffb638c9c0fc30 R11: ffffffffa015eb28 R12: 0000000000000002 R13: ffffb638c9c0fe20 R14: 0000000000000001 R15: ffff942f9cd051c0 FS: 0000000000000000(0000) GS:ffff943520900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f4f25214000 CR3: 000000025fbae004 CR4: 00000000007706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: rwsemdownwriteslowpath+0x17e/0x470 smclinkdownwork+0x3c/0x60 [smc] processonework+0x1ac/0x350 workerthread+0x49/0x2f0 ? rescuerthread+0x360/0x360 kthread+0x118/0x140 ? _kthreadbindmask+0x60/0x60 retfrom_fork+0x1f/0x30(CVE-2024-56718)
In the Linux kernel, the following vulnerability has been resolved:
smb: Initialize cfid->tcon before performing network ops
Avoid leaking a tcon ref when a lease break races with opening the cached directory. Processing the lease break might take a reference to the tcon in cached_dir_lease_break() and then fail to release the ref in cached_dir_offload_close, since cfid->tcon is still NULL.(CVE-2024-56729)
In the Linux kernel, the following vulnerability has been resolved:
btrfs: check folio mapping after unlock in relocate_one_folio()
When we call btrfs_read_folio() to bring a folio uptodate, we unlock the folio. The result of that is that a different thread can modify the mapping (like remove it with invalidate) before we call folio_lock(). This results in an invalid page and we need to try again.
In particular, if we are relocating concurrently with aborting a transaction, this can result in a crash like the following:
BUG: kernel NULL pointer dereference, address: 0000000000000000 PGD 0 P4D 0 Oops: 0000 [#1] SMP CPU: 76 PID: 1411631 Comm: kworker/u322:5 Workqueue: eventsunbound btrfsreclaimbgswork RIP: 0010:setpageextentmapped+0x20/0xb0 RSP: 0018:ffffc900516a7be8 EFLAGS: 00010246 RAX: ffffea009e851d08 RBX: ffffea009e0b1880 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffffc900516a7b90 RDI: ffffea009e0b1880 RBP: 0000000003573000 R08: 0000000000000001 R09: ffff88c07fd2f3f0 R10: 0000000000000000 R11: 0000194754b575be R12: 0000000003572000 R13: 0000000003572fff R14: 0000000000100cca R15: 0000000005582fff FS: 0000000000000000(0000) GS:ffff88c07fd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000407d00f002 CR4: 00000000007706f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <TASK> ? _die+0x78/0xc0 ? pagefaultoops+0x2a8/0x3a0 ? _switchto+0x133/0x530 ? wqworkerrunning+0xa/0x40 ? excpagefault+0x63/0x130 ? asmexcpagefault+0x22/0x30 ? setpageextentmapped+0x20/0xb0 relocatefileextentcluster+0x1a7/0x940 relocatedataextent+0xaf/0x120 relocateblockgroup+0x20f/0x480 btrfsrelocateblockgroup+0x152/0x320 btrfsrelocatechunk+0x3d/0x120 btrfsreclaimbgswork+0x2ae/0x4e0 processscheduledworks+0x184/0x370 workerthread+0xc6/0x3e0 ? blkaddtimer+0xb0/0xb0 kthread+0xae/0xe0 ? flushtlbkernelrange+0x90/0x90 retfromfork+0x2f/0x40 ? flushtlbkernelrange+0x90/0x90 retfromfork_asm+0x11/0x20 </TASK>
This occurs because cleanup_one_transaction() calls destroy_delalloc_inodes(), which calls invalidate_inode_pages2(), which takes the folio_lock before setting mapping to NULL. We fail to check this, and subsequently call set_extent_mapping(), which assumes that mapping != NULL (in fact it asserts that in debug mode).
Note that the "fixes" patch here is not the one that introduced the race (the very first iteration of this code from 2009) but a more recent change that made this particular crash happen in practice.(CVE-2024-56758)
In the Linux kernel, the following vulnerability has been resolved:
media: dvb-frontends: dib3000mb: fix uninit-value in dib3000_write_reg
Syzbot reports [1] an uninitialized value issue found by KMSAN in dib3000_read_reg().
Local u8 rb[2] is used in i2c_transfer() as a read buffer; in case that call fails, the buffer may end up with some undefined values.
Since no elaborate error handling is expected in dib3000_write_reg(), simply zero out the rb buffer to mitigate the problem.
[1] Syzkaller report
BUG: KMSAN: uninit-value in dib3000mbattach+0x2d8/0x3c0 drivers/media/dvb-frontends/dib3000mb.c:758 dib3000mbattach+0x2d8/0x3c0 drivers/media/dvb-frontends/dib3000mb.c:758 dibusbdib3000mbfrontendattach+0x155/0x2f0 drivers/media/usb/dvb-usb/dibusb-mb.c:31 dvbusbadapterfrontendinit+0xed/0x9a0 drivers/media/usb/dvb-usb/dvb-usb-dvb.c:290 dvbusbadapterinit drivers/media/usb/dvb-usb/dvb-usb-init.c:90 [inline] dvbusbinit drivers/media/usb/dvb-usb/dvb-usb-init.c:186 [inline] dvbusbdeviceinit+0x25a8/0x3760 drivers/media/usb/dvb-usb/dvb-usb-init.c:310 dibusbprobe+0x46/0x250 drivers/media/usb/dvb-usb/dibusb-mb.c:110 ... Local variable rb created at: dib3000readreg+0x86/0x4e0 drivers/media/dvb-frontends/dib3000mb.c:54 dib3000mb_attach+0x123/0x3c0 drivers/media/dvb-frontends/dib3000mb.c:758 ...(CVE-2024-56769)
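The mitigation described in this entry, zeroing the read buffer so that a failed transfer cannot expose undefined stack contents, can be sketched in userspace C. The transfer callback type `xfer_fn` and the functions `failing_xfer()`, `ok_xfer()` and `read_reg()` are hypothetical stand-ins for the example, not the driver's interfaces.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical transfer callback: returns bytes read, or negative on failure. */
typedef int (*xfer_fn)(uint8_t *rb, int len);

/* Simulates a transfer that fails without touching the buffer. */
static int failing_xfer(uint8_t *rb, int len)
{
    (void)rb;
    (void)len;
    return -1;
}

/* Simulates a successful two-byte read. */
static int ok_xfer(uint8_t *rb, int len)
{
    if (len < 2)
        return -1;
    rb[0] = 0x12;
    rb[1] = 0x34;
    return 2;
}

/*
 * Read a 16-bit register. The buffer is zero-initialized, so if the
 * transfer fails the result is a defined value (0) rather than whatever
 * happened to be on the stack. The error is deliberately not propagated,
 * mirroring the "no elaborate error handling" approach in the text.
 */
static uint16_t read_reg(xfer_fn xfer)
{
    uint8_t rb[2] = { 0 };

    (void)xfer(rb, (int)sizeof(rb));
    return (uint16_t)((rb[0] << 8) | rb[1]);
}
```

Zero-initialization trades error reporting for determinism: callers cannot distinguish a failed read from a register that holds 0, which is acceptable only where the source says no elaborate handling is expected.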
In the Linux kernel, the following vulnerability has been resolved:
nfsd: fix nfs4_openowner leak when concurrent nfsd4_open occur
A forced umount (umount -f) will attempt to kill all rpc_tasks, even though the umount operation may ultimately fail if some files remain open. Consequently, if an action attempts to open a file, it can potentially send two rpc_tasks to the NFS server.
NFS CLIENT
thread1 thread2 open("file") ... nfs4doopen nfs4doopen _nfs4openandgetstate _nfs4procopen nfs4runopentask /* rpctask1 */ rpcruntask rpcwaitforcompletion_task
umount -f
nfs_umount_begin
rpc_killall_tasks
rpc_signal_task
rpc_task1 been wakeup
and return -512
nfs4doopen // while loop ... nfs4runopentask /* rpctask2 */ rpcruntask rpcwaitforcompletion_task
While processing an open request, nfsd will first attempt to find or allocate an nfs4_openowner. If it finds an nfs4_openowner that is not marked as NFS4_OO_CONFIRMED, this nfs4_openowner will be released. Since two rpc_tasks can attempt to open the same file simultaneously from the client to the server, and because two instances of nfsd can run concurrently, this situation can lead to a significant memory leak. Additionally, when we echo 0 to /proc/fs/nfsd/threads, a warning will be triggered.
                                NFS SERVER
nfsd1                  nfsd2       echo 0 > /proc/fs/nfsd/threads

nfsd4_open
 nfsd4_process_open1
  find_or_alloc_open_stateowner
   // alloc oo1, stateid1
                       nfsd4_open
                        nfsd4_process_open1
                         find_or_alloc_open_stateowner
                         // find oo1, without NFS4_OO_CONFIRMED
                          release_openowner
                           unhash_openowner_locked
                           list_del_init(&oo->oo_perclient)
                           // cannot find this oo
                           // from client, LEAK!!!
                          alloc_stateowner // alloc oo2

 nfsd4_process_open2
  init_open_stateid
  // associate oo1
  // with stateid1, stateid1 LEAK!!!
  nfs4_get_vfs_file
  // alloc nfsd_file1 and nfsd_file_mark1
  // all LEAK!!!

                         nfsd4_process_open2
                         ...

                                    write_threads
                                     ...
                                     nfsd_destroy_serv
                                      nfsd_shutdown_net
                                       nfs4_state_shutdown_net
                                        nfs4_state_destroy_net
                                         destroy_client
                                          __destroy_client
                                          // won't find oo1!!!
                                     nfsd_shutdown_generic
                                      nfsd_file_cache_shutdown
                                       kmem_cache_destroy
                                       for nfsd_file_slab
                                       and nfsd_file_mark_slab
                                       // bark since nfsd_file1
                                       // and nfsd_file_mark1
                                       // still alive
=======================================================================
BUG nfsd_file (Not tainted): Objects remaining in nfsd_file on
Slab 0xffd4000004438a80 objects=34 used=1 fp=0xff11000110e2ad28 flags=0x17ffffc0000240(workingset|head|node=0|zone=2|lastcpupid=0x1fffff)
CPU: 4 UID: 0 PID: 757 Comm: sh Not tainted 6.12.0-rc6+ #19
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.1-2.fc37 04/01/2014
Call Trace:
<TASK>
dum ---truncated---(CVE-2024-56779)
In the Linux kernel, the following vulnerability has been resolved:
PCI: imx6: Fix suspend/resume support on i.MX6QDL
The suspend/resume functionality is currently broken on the i.MX6QDL platform, as documented in the NXP errata (ERR005723):
https://www.nxp.com/docs/en/errata/IMX6DQCE.pdf
This patch addresses the issue by sharing most of the suspend/resume sequences used by other i.MX devices, while avoiding modifications to critical registers that disrupt the PCIe functionality. It targets the same problem as the following downstream commit:
https://github.com/nxp-imx/linux-imx/commit/4e92355e1f79d225ea842511fcfd42b343b32995
Unlike the downstream commit, this patch also resets the connected PCIe device if possible. Without this reset, certain drivers, such as ath10k or iwlwifi, will crash on resume. The device reset is also done by the driver on other i.MX platforms, making this patch consistent with existing practices.
Upon resuming, the kernel will hang and display an error. Here's an example of the error encountered with the ath10k driver:
ath10k_pci 0000:01:00.0: Unable to change power state from D3hot to D0, device inaccessible
Unhandled fault: imprecise external abort (0x1406) at 0x0106f944
Without this patch, suspend/resume will fail on i.MX6QDL devices if a PCIe device is connected.
[kwilczynski: commit log, added tag for stable releases]
In the Linux kernel, the following vulnerability has been resolved:
arm64: ptrace: fix partial SETREGSET for NT_ARM_TAGGED_ADDR_CTRL
Currently tagged_addr_ctrl_set() doesn't initialize the temporary 'ctrl' variable, and a SETREGSET call with a length of zero will leave it uninitialized. Consequently tagged_addr_ctrl_set() will consume an arbitrary value, potentially leaking up to 64 bits of memory from the kernel stack. The read is limited to a specific slot on the stack, and the issue does not provide a write mechanism.
As set_tagged_addr_ctrl() only accepts values where bits [63:4] are zero and rejects other values, a partial SETREGSET attempt will randomly succeed or fail depending on the uninitialized value, and the exposure is significantly limited.
Fix this by initializing the temporary value before copying the regset from userspace, as for other regsets (e.g. NT_PRSTATUS, NT_PRFPREG, NT_ARM_SYSTEM_CALL). In the case of a zero-length write, the existing value of the tagged address ctrl will be retained.
The NT_ARM_TAGGED_ADDR_CTRL regset is only visible in the user_aarch64_view used by a native AArch64 task to manipulate another native AArch64 task. As get_tagged_addr_ctrl() only returns an error value when called for a compat task, tagged_addr_ctrl_get() and tagged_addr_ctrl_set() should never observe an error value from get_tagged_addr_ctrl(). Add a WARN_ON_ONCE() to both to indicate that such an error would be unexpected, and that error handling is not missing in either case.(CVE-2024-57874)
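The effect of initializing the temporary before a partial copy can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the kernel code: copy_in() and set_ctrl_from_user() are hypothetical names, and copy_in() merely imitates a copy that may transfer fewer bytes than the destination holds.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for a regset copy-in helper: copies at most
 * `count` bytes and leaves the remaining bytes of `dst` untouched.
 */
static void copy_in(void *dst, size_t dst_size, const void *src, size_t count)
{
    if (count > dst_size)
        count = dst_size;
    memcpy(dst, src, count);
}

/*
 * Hypothetical setter. `current_ctrl` plays the role of the task's
 * existing control value; the returned value is what would be applied.
 */
long set_ctrl_from_user(long current_ctrl, const void *ubuf, size_t count)
{
    /* Start from the existing value so a short or empty write keeps it. */
    long ctrl = current_ctrl;

    copy_in(&ctrl, sizeof(ctrl), ubuf, count);
    return ctrl;
}
```

With this shape, a zero-length write returns the existing value unchanged, while a full-length write replaces it; no stack garbage is ever consumed.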
In the Linux kernel, the following vulnerability has been resolved:
ocfs2: fix slab-use-after-free due to dangling pointer dqi_priv
When mounting ocfs2 and then remounting it as read-only, a slab-use-after-free occurs after the user issues a quota_getnextquota syscall. Specifically, sb_dqinfo(sb, type)->dqi_priv is the dangling pointer.
During the remounting process, the pointer dqi_priv is freed but is never set to NULL, leaving it available to be accessed. Additionally, the read-only option for remounting sets the DQUOT_SUSPENDED flag instead of setting the DQUOT_USAGE_ENABLED flags. Moreover, later in the process of getting the next quota, the function ocfs2_get_next_id is called and only checks the quota usage flags and not the quota suspended flags.
To fix this, set dqi_priv to NULL when it is freed after remounting read-only, and add a check for DQUOT_SUSPENDED in ocfs2_get_next_id.
[akpm@linux-foundation.org: coding-style cleanups]
In the Linux kernel, the following vulnerability has been resolved:
iio: adc: ti-ads8688: fix information leak in triggered buffer
The 'buffer' local array is used to push data to userspace from a triggered buffer, but it does not set values for inactive channels, as it only uses iio_for_each_active_channel() to assign new values.
Initialize the array to zero before using it to avoid pushing uninitialized information to userspace.(CVE-2024-57906)
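A minimal userspace sketch of the pattern, assuming a fixed-size scan buffer and a bitmask of active channels. The names fill_scan() and NUM_CHANNELS are hypothetical and do not come from the driver; the point is only that slots not written by the active-channel loop hold zeros instead of leftover stack contents.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NUM_CHANNELS 8 /* hypothetical channel count for illustration */

/*
 * Pack the samples of the active channels into `out` and return how many
 * slots were used. `out` is cleared first, so unused slots read as zero.
 */
size_t fill_scan(uint16_t out[NUM_CHANNELS], unsigned int active_mask,
                 const uint16_t samples[NUM_CHANNELS])
{
    size_t used = 0;

    /* The fix described above: zero the whole buffer before filling it. */
    memset(out, 0, NUM_CHANNELS * sizeof(out[0]));

    for (unsigned int ch = 0; ch < NUM_CHANNELS; ch++) {
        if (active_mask & (1u << ch))
            out[used++] = samples[ch];
    }
    return used;
}
```

Zeroing the entire buffer costs a few bytes of memset per scan, which is usually preferable to tracking exactly which slots each code path writes.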
In the Linux kernel, the following vulnerability has been resolved:
iio: light: vcnl4035: fix information leak in triggered buffer
The 'buffer' local array is used to push data to userspace from a triggered buffer, but it does not set an initial value for the single data element, which is a u16 aligned to 8 bytes. That leaves at least 4 bytes uninitialized even after writing an integer value with regmap_read().
Initialize the array to zero before using it to avoid pushing uninitialized information to userspace.(CVE-2024-57910)
In the Linux kernel, the following vulnerability has been resolved:
topology: Keep the cpumask unchanged when printing cpumap
During fuzz testing, the following warning was discovered:
different return values (15 and 11) from vsnprintf("%*pbl ", ...)
test:keyward is WARNING in kvasprintf
WARNING: CPU: 55 PID: 1168477 at lib/kasprintf.c:30 kvasprintf+0x121/0x130
Call Trace:
kvasprintf+0x121/0x130
kasprintf+0xa6/0xe0
bitmap_print_to_buf+0x89/0x100
core_siblings_list_read+0x7e/0xb0
kernfs_file_read_iter+0x15b/0x270
new_sync_read+0x153/0x260
vfs_read+0x215/0x290
ksys_read+0xb9/0x160
do_syscall_64+0x56/0x100
entry_SYSCALL_64_after_hwframe+0x78/0xe2
The call trace shows that kvasprintf() reported this warning during the printing of core_siblings_list. kvasprintf() has several steps:
(1) First, calculate the length of the resulting formatted string.
(2) Allocate a buffer based on the returned length.
(3) Then, perform the actual string formatting.
(4) Check whether the lengths of the formatted strings returned in steps (1) and (3) are consistent.
If the core_cpumask is modified between steps (1) and (3), the lengths obtained in these two steps may not match. Indeed, our test includes CPU hotplugging, which can modify core_cpumask while it is being printed.
To fix this issue, cache the cpumask into a temporary variable before calling cpumap_print_{list, cpumask}_to_buf(), to keep it unchanged during the printing process.(CVE-2024-57917)
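The measure, allocate, format, compare sequence and the snapshot fix can be sketched in userspace C with snprintf(). This is an assumption-laden illustration: format_mask() is a hypothetical helper, a single unsigned long stands in for the cpumask, and "%lx" stands in for the kernel's bitmap format specifier.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Format a shared value in two passes. The value is read once into a
 * local snapshot, so both passes see identical input even if the
 * shared variable changes concurrently.
 */
char *format_mask(const volatile unsigned long *shared_mask)
{
    unsigned long snapshot = *shared_mask; /* cache before printing */
    int first, second;
    char *buf;

    /* Step 1: compute the length of the formatted string. */
    first = snprintf(NULL, 0, "%lx", snapshot);
    if (first < 0)
        return NULL;

    /* Step 2: allocate a buffer of that length (plus the terminator). */
    buf = malloc((size_t)first + 1);
    if (buf == NULL)
        return NULL;

    /* Step 3: perform the actual formatting. */
    second = snprintf(buf, (size_t)first + 1, "%lx", snapshot);

    /* Step 4: the two lengths must agree. */
    if (second != first) {
        free(buf);
        return NULL;
    }
    return buf;
}
```

Had both snprintf() calls dereferenced shared_mask directly, a concurrent update between them could produce different lengths, which is the mismatch the warning reports.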
In the Linux kernel, the following vulnerability has been resolved:
drm/amd/display: Add check for granularity in dml ceil/floor helpers
[Why] Wrapper functions for dcn_bw_ceil2() and dcn_bw_floor2() should check that granularity is non-zero to avoid an assert and a divide-by-zero error in the dcn_bw_ functions.
[How] Add check for granularity 0.
(cherry picked from commit f6e09701c3eb2ccb8cb0518e0b67f1c69742a4ec)(CVE-2024-57922)
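A guard of this kind can be sketched with integer helpers. The functions below are hypothetical and are not the display driver's wrappers (which operate on floating-point values); returning 0 for a zero granularity is an assumption made for the example.

```c
/*
 * Round `value` up to the next multiple of `granularity`.
 * A zero granularity returns 0 instead of dividing by zero.
 * Note: the result can wrap if `value` is within `granularity`
 * of UINT_MAX; callers in this sketch are assumed to stay below that.
 */
unsigned int ceil_to_multiple(unsigned int value, unsigned int granularity)
{
    unsigned int rem;

    if (granularity == 0)
        return 0;

    rem = value % granularity;
    return rem ? value + (granularity - rem) : value;
}

/* Round `value` down to a multiple of `granularity`, with the same guard. */
unsigned int floor_to_multiple(unsigned int value, unsigned int granularity)
{
    if (granularity == 0)
        return 0;

    return value - (value % granularity);
}
```

For example, ceil_to_multiple(10, 4) yields 12 and floor_to_multiple(10, 4) yields 8, while either helper called with a granularity of 0 returns 0 rather than faulting.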
In the Linux kernel, the following vulnerability has been resolved:
drm/mediatek: Set private->all_drm_private[i]->drm to NULL if mtk_drm_bind returns err
The pointer needs to be set to NULL, otherwise KASAN complains about a use-after-free, because in mtk_drm_bind the drm pointer of every private is set as follows.
private->all_drm_private[i]->drm = drm;
drm is then released by drm_dev_put if mtk_drm_kms_init returns failure. However, the shutdown path still accesses the previously allocated memory in drm_atomic_helper_shutdown.
[ 84.874820] watchdog: watchdog0: watchdog did not stop!
[ 86.512054] ==================================================================
[ 86.513162] BUG: KASAN: use-after-free in drm_atomic_helper_shutdown+0x33c/0x378
[ 86.514258] Read of size 8 at addr ffff0000d46fc068 by task shutdown/1
[ 86.515213]
[ 86.515455] CPU: 1 UID: 0 PID: 1 Comm: shutdown Not tainted 6.13.0-rc1-mtk+gfa1a78e5d24b-dirty #55
[ 86.516752] Hardware name: Unknown Product/Unknown Product, BIOS 2022.10 10/01/2022
[ 86.517960] Call trace:
[ 86.518333] show_stack+0x20/0x38 (C)
[ 86.518891] dump_stack_lvl+0x90/0xd0
[ 86.519443] print_report+0xf8/0x5b0
[ 86.519985] kasan_report+0xb4/0x100
[ 86.520526] __asan_report_load8_noabort+0x20/0x30
[ 86.521240] drm_atomic_helper_shutdown+0x33c/0x378
[ 86.521966] mtk_drm_shutdown+0x54/0x80
[ 86.522546] platform_shutdown+0x64/0x90
[ 86.523137] device_shutdown+0x260/0x5b8
[ 86.523728] kernel_restart+0x78/0xf0
[ 86.524282] __do_sys_reboot+0x258/0x2f0
[ 86.524871] __arm64_sys_reboot+0x90/0xd8
[ 86.525473] invoke_syscall+0x74/0x268
[ 86.526041] el0_svc_common.constprop.0+0xb0/0x240
[ 86.526751] do_el0_svc+0x4c/0x70
[ 86.527251] el0_svc+0x4c/0xc0
[ 86.527719] el0t_64_sync_handler+0x144/0x168
[ 86.528367] el0t_64_sync+0x198/0x1a0
[ 86.528920]
[ 86.529157] The buggy address belongs to the physical page:
[ 86.529972] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0xffff0000d46fd4d0 pfn:0x1146fc
[ 86.531319] flags: 0xbfffc0000000000(node=0|zone=2|lastcpupid=0xffff)
[ 86.532267] raw: 0bfffc0000000000 0000000000000000 dead000000000122 0000000000000000
[ 86.533390] raw: ffff0000d46fd4d0 0000000000000000 00000000ffffffff 0000000000000000
[ 86.534511] page dumped because: kasan: bad access detected
[ 86.535323]
[ 86.535559] Memory state around the buggy address:
[ 86.536265] ffff0000d46fbf00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 86.537314] ffff0000d46fbf80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 86.538363] >ffff0000d46fc000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 86.544733] ^
[ 86.551057] ffff0000d46fc080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 86.557510] ffff0000d46fc100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 86.563928] ==================================================================
[ 86.571093] Disabling lock debugging due to kernel taint
[ 86.577642] Unable to handle kernel paging request at virtual address e0e9c0920000000b
[ 86.581834] KASAN: maybe wild-memory-access in range [0x0752049000000058-0x075204900000005f]
...(CVE-2024-57926)
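The clear-on-failure pattern described for this fix can be sketched as follows. Everything here is hypothetical scaffolding (struct drm_like, struct priv, bind_all(), shutdown_one()) meant to show why a later shutdown path is safe once the stale pointers are reset; it is not the driver's code.

```c
#include <stddef.h>
#include <stdlib.h>

struct drm_like { int dummy; };        /* hypothetical device object */
struct priv { struct drm_like *drm; }; /* hypothetical per-component state */

/*
 * Share one device object across all components. If initialization
 * fails, clear every back-pointer before releasing the object so no
 * component is left holding a dangling pointer.
 */
int bind_all(struct priv *privs, size_t n, int init_fails)
{
    struct drm_like *drm = calloc(1, sizeof(*drm));
    size_t i;

    if (drm == NULL)
        return -1;

    for (i = 0; i < n; i++)
        privs[i].drm = drm;

    if (init_fails) {
        for (i = 0; i < n; i++)
            privs[i].drm = NULL; /* the step the fix adds */
        free(drm);
        return -1;
    }
    return 0;
}

/* Returns 1 if the device would be touched, 0 if shutdown is skipped. */
int shutdown_one(const struct priv *p)
{
    if (p->drm == NULL)
        return 0;
    return 1;
}
```

Without the clearing loop, shutdown_one() would see a non-NULL pointer to freed memory and proceed, which corresponds to the use-after-free in the report above.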
In the Linux kernel, the following vulnerability has been resolved:
x86/fpu: Ensure shadow stack is active before "getting" registers
The x86 shadow stack support has its own set of registers. Those registers are XSAVE-managed, but they are "supervisor state components" which means that userspace can not touch them with XSAVE/XRSTOR. It also means that they are not accessible from the existing ptrace ABI for XSAVE state. Thus, there is a new ptrace get/set interface for it.
The regset code that ptrace uses provides an ->active() handler in addition to the get/set ones. For shadow stack this ->active() handler verifies that shadow stack is enabled via the ARCH_SHSTK_SHSTK bit in the thread struct. The ->active() handler is checked from some call sites of the regset get/set handlers, but not the ptrace ones. This was not understood when shadow stack support was put in place.
As a result, both the set/get handlers can be called with XFEATURE_CET_USER in its init state, which would cause get_xsave_addr() to return NULL and trigger a WARN_ON(). The ssp_set() handler luckily has an ssp_active() check to avoid surprising the kernel with shadow stack behavior when the kernel is not ready for it (ARCH_SHSTK_SHSTK==0). That check just happened to avoid the warning.
But the ->get() side wasn't so lucky. It can be called with shadow stacks disabled, triggering the warning in practice, as reported by Christina Schimpe:
WARNING: CPU: 5 PID: 1773 at arch/x86/kernel/fpu/regset.c:198 ssp_get+0x89/0xa0
[...]
Call Trace:
<TASK>
? show_regs+0x6e/0x80
? ssp_get+0x89/0xa0
? __warn+0x91/0x150
? ssp_get+0x89/0xa0
? report_bug+0x19d/0x1b0
? handle_bug+0x46/0x80
? exc_invalid_op+0x1d/0x80
? asm_exc_invalid_op+0x1f/0x30
? __pfx_ssp_get+0x10/0x10
? ssp_get+0x89/0xa0
? ssp_get+0x52/0xa0
__regset_get+0xad/0xf0
copy_regset_to_user+0x52/0xc0
ptrace_regset+0x119/0x140
ptrace_request+0x13c/0x850
? wait_task_inactive+0x142/0x1d0
? do_syscall_64+0x6d/0x90
arch_ptrace+0x102/0x300
[...]
Ensure that shadow stacks are active in a thread before looking them up in the XSAVE buffer. Since ARCH_SHSTK_SHSTK and user_ssp[SHSTK_EN] are set at the same time, the active check ensures that there will be something to find in the XSAVE buffer.
[dhansen: changelog/subject tweaks]
In the Linux kernel, the following vulnerability has been resolved:
btrfs: avoid NULL pointer dereference if no valid extent tree
[BUG] Syzbot reported a crash with the following call trace:
BTRFS info (device loop0): scrub: started on devid 1
BUG: kernel NULL pointer dereference, address: 0000000000000208
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 106e70067 P4D 106e70067 PUD 107143067 PMD 0
Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 1 UID: 0 PID: 689 Comm: repro Kdump: loaded Tainted: G O 6.13.0-rc4-custom+ #206
Tainted: [O]=OOT_MODULE
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS unknown 02/02/2022
RIP: 0010:find_first_extent_item+0x26/0x1f0 [btrfs]
Call Trace:
<TASK>
scrub_find_fill_first_stripe+0x13d/0x3b0 [btrfs]
scrub_simple_mirror+0x175/0x260 [btrfs]
scrub_stripe+0x5d4/0x6c0 [btrfs]
scrub_chunk+0xbb/0x170 [btrfs]
scrub_enumerate_chunks+0x2f4/0x5f0 [btrfs]
btrfs_scrub_dev+0x240/0x600 [btrfs]
btrfs_ioctl+0x1dc8/0x2fa0 [btrfs]
? do_sys_openat2+0xa5/0xf0
__x64_sys_ioctl+0x97/0xc0
do_syscall_64+0x4f/0x120
entry_SYSCALL_64_after_hwframe+0x76/0x7e
</TASK>
[CAUSE] The reproducer uses a corrupted image in which the extent tree root is corrupted, which forces the use of the "rescue=all,ro" mount option to mount the image.
It then triggers a scrub, but since scrub relies on the extent tree to find where the data/metadata extents are, scrub_find_fill_first_stripe() relies on a non-empty extent root.
Unfortunately scrub_find_fill_first_stripe() doesn't expect a NULL pointer for the extent root; it uses extent_root to grab fs_info, which triggers a NULL pointer dereference.
[FIX] Add an extra check for a valid extent root at the beginning of scrub_find_fill_first_stripe().
The new error path is introduced by 42437a6386ff ("btrfs: introduce mount option rescue=ignorebadroots"), but that's pretty old, and later commit b979547513ff ("btrfs: scrub: introduce helper to find and fill sector info for a scrub_stripe") changed how we do scrub.
So for kernels older than 6.6, the fix will need manual backport.(CVE-2025-21658)
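An early validity check of the kind described in [FIX] can be sketched like this. The types and the function find_first_stripe() are hypothetical simplifications, and the choice of -EINVAL as the error code is an assumption made for the example rather than the value used by the actual patch.

```c
#include <errno.h>
#include <stddef.h>

struct fs_info { int id; };              /* hypothetical, heavily simplified */
struct root { struct fs_info *fs_info; };

/*
 * Validate the root pointer before the first dereference. On a NULL
 * root the function reports an error and leaves *fs_id_out untouched.
 */
int find_first_stripe(const struct root *extent_root, int *fs_id_out)
{
    if (extent_root == NULL)
        return -EINVAL; /* assumption: error code chosen for illustration */

    /* Safe only because of the check above. */
    *fs_id_out = extent_root->fs_info->id;
    return 0;
}
```

Placing the check at the top of the function means every later use of the pointer in that function is covered, which matters when the NULL case arises only under rarely used mount options.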
In the Linux kernel, the following vulnerability has been resolved:
vsock/bpf: return early if transport is not assigned
Some of the core functions can only be called if the transport has been assigned.
As Michal reported, a socket might have the transport at NULL, for example after a failed connect(), causing the following trace:
BUG: kernel NULL pointer dereference, address: 00000000000000a0
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 12faf8067 P4D 12faf8067 PUD 113670067 PMD 0
Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 15 UID: 0 PID: 1198 Comm: a.out Not tainted 6.13.0-rc2+
RIP: 0010:vsock_connectible_has_data+0x1f/0x40
Call Trace:
vsock_bpf_recvmsg+0xca/0x5e0
sock_recvmsg+0xb9/0xc0
__sys_recvfrom+0xb3/0x130
__x64_sys_recvfrom+0x20/0x30
do_syscall_64+0x93/0x180
entry_SYSCALL_64_after_hwframe+0x76/0x7e
So we need to check the vsk->transport in vsock_bpf_recvmsg(), especially for connected sockets (stream/seqpacket) as we already do in __vsock_connectible_recvmsg().(CVE-2025-21670)
{ "severity": "High" }
{ "x86_64": [ "bpftool-6.6.0-76.0.0.69.oe2403.x86_64.rpm", "bpftool-debuginfo-6.6.0-76.0.0.69.oe2403.x86_64.rpm", "kernel-6.6.0-76.0.0.69.oe2403.x86_64.rpm", "kernel-debuginfo-6.6.0-76.0.0.69.oe2403.x86_64.rpm", "kernel-debugsource-6.6.0-76.0.0.69.oe2403.x86_64.rpm", "kernel-devel-6.6.0-76.0.0.69.oe2403.x86_64.rpm", "kernel-headers-6.6.0-76.0.0.69.oe2403.x86_64.rpm", "kernel-source-6.6.0-76.0.0.69.oe2403.x86_64.rpm", "kernel-tools-6.6.0-76.0.0.69.oe2403.x86_64.rpm", "kernel-tools-debuginfo-6.6.0-76.0.0.69.oe2403.x86_64.rpm", "kernel-tools-devel-6.6.0-76.0.0.69.oe2403.x86_64.rpm", "perf-6.6.0-76.0.0.69.oe2403.x86_64.rpm", "perf-debuginfo-6.6.0-76.0.0.69.oe2403.x86_64.rpm", "python3-perf-6.6.0-76.0.0.69.oe2403.x86_64.rpm", "python3-perf-debuginfo-6.6.0-76.0.0.69.oe2403.x86_64.rpm" ], "src": [ "kernel-6.6.0-76.0.0.69.oe2403.src.rpm" ], "aarch64": [ "bpftool-6.6.0-76.0.0.69.oe2403.aarch64.rpm", "bpftool-debuginfo-6.6.0-76.0.0.69.oe2403.aarch64.rpm", "kernel-6.6.0-76.0.0.69.oe2403.aarch64.rpm", "kernel-debuginfo-6.6.0-76.0.0.69.oe2403.aarch64.rpm", "kernel-debugsource-6.6.0-76.0.0.69.oe2403.aarch64.rpm", "kernel-devel-6.6.0-76.0.0.69.oe2403.aarch64.rpm", "kernel-headers-6.6.0-76.0.0.69.oe2403.aarch64.rpm", "kernel-source-6.6.0-76.0.0.69.oe2403.aarch64.rpm", "kernel-tools-6.6.0-76.0.0.69.oe2403.aarch64.rpm", "kernel-tools-debuginfo-6.6.0-76.0.0.69.oe2403.aarch64.rpm", "kernel-tools-devel-6.6.0-76.0.0.69.oe2403.aarch64.rpm", "perf-6.6.0-76.0.0.69.oe2403.aarch64.rpm", "perf-debuginfo-6.6.0-76.0.0.69.oe2403.aarch64.rpm", "python3-perf-6.6.0-76.0.0.69.oe2403.aarch64.rpm", "python3-perf-debuginfo-6.6.0-76.0.0.69.oe2403.aarch64.rpm" ] }