CVE-2024-56559

Source
https://nvd.nist.gov/vuln/detail/CVE-2024-56559
Import Source
https://storage.googleapis.com/osv-test-cve-osv-conversion/osv-output/CVE-2024-56559.json
JSON Data
https://api.test.osv.dev/v1/vulns/CVE-2024-56559
Related
Published
2024-12-27T15:15:14Z
Modified
2025-01-08T09:58:03.004029Z
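For programmatic use, this record is also available as OSV JSON from the API endpoint listed under "JSON Data" above. A minimal fetch sketch in C using libcurl (assuming a system libcurl is installed; an illustration, not an official OSV client):

    /*
     * Hypothetical fetch of this record's OSV JSON via libcurl.
     * Build with: cc fetch_osv.c -lcurl
     */
    #include <stdio.h>
    #include <curl/curl.h>

    int main(void)
    {
        CURL *h;
        CURLcode rc;

        curl_global_init(CURL_GLOBAL_DEFAULT);
        h = curl_easy_init();
        if (!h)
            return 1;

        curl_easy_setopt(h, CURLOPT_URL,
                         "https://api.test.osv.dev/v1/vulns/CVE-2024-56559");
        curl_easy_setopt(h, CURLOPT_FOLLOWLOCATION, 1L);

        /* With no write callback set, libcurl writes the response
         * body to stdout, which is all this sketch needs. */
        rc = curl_easy_perform(h);
        if (rc != CURLE_OK)
            fprintf(stderr, "fetch failed: %s\n", curl_easy_strerror(rc));

        curl_easy_cleanup(h);
        curl_global_cleanup();
        return rc == CURLE_OK ? 0 : 1;
    }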
Summary
[none]
Details

In the Linux kernel, the following vulnerability has been resolved:

mm/vmalloc: combine all TLB flush operations of KASAN shadow virtual address into one operation

When compiling the kernel source with 'make -j $(nproc)' on a 256-core machine running a KASAN-enabled kernel, the following soft lockup is shown:

watchdog: BUG: soft lockup - CPU#28 stuck for 22s! [kworker/28:1:1760]
CPU: 28 PID: 1760 Comm: kworker/28:1 Kdump: loaded Not tainted 6.10.0-rc5 #95
Workqueue: events drain_vmap_area_work
RIP: 0010:smp_call_function_many_cond+0x1d8/0xbb0
Code: 38 c8 7c 08 84 c9 0f 85 49 08 00 00 8b 45 08 a8 01 74 2e 48 89 f1 49 89 f7 48 c1 e9 03 41 83 e7 07 4c 01 e9 41 83 c7 03 f3 90 <0f> b6 01 41 38 c7 7c 08 84 c0 0f 85 d4 06 00 00 8b 45 08 a8 01 75
RSP: 0018:ffffc9000cb3fb60 EFLAGS: 00000202
RAX: 0000000000000011 RBX: ffff8883bc4469c0 RCX: ffffed10776e9949
RDX: 0000000000000002 RSI: ffff8883bb74ca48 RDI: ffffffff8434dc50
RBP: ffff8883bb74ca40 R08: ffff888103585dc0 R09: ffff8884533a1800
R10: 0000000000000004 R11: ffffffffffffffff R12: ffffed1077888d39
R13: dffffc0000000000 R14: ffffed1077888d38 R15: 0000000000000003
FS:  0000000000000000(0000) GS:ffff8883bc400000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00005577b5c8d158 CR3: 0000000004850000 CR4: 0000000000350ef0
Call Trace:
 <IRQ>
 ? watchdog_timer_fn+0x2cd/0x390
 ? __pfx_watchdog_timer_fn+0x10/0x10
 ? __hrtimer_run_queues+0x300/0x6d0
 ? sched_clock_cpu+0x69/0x4e0
 ? __pfx___hrtimer_run_queues+0x10/0x10
 ? srso_return_thunk+0x5/0x5f
 ? ktime_get_update_offsets_now+0x7f/0x2a0
 ? srso_return_thunk+0x5/0x5f
 ? srso_return_thunk+0x5/0x5f
 ? hrtimer_interrupt+0x2ca/0x760
 ? __sysvec_apic_timer_interrupt+0x8c/0x2b0
 ? sysvec_apic_timer_interrupt+0x6a/0x90
 </IRQ>
 <TASK>
 ? asm_sysvec_apic_timer_interrupt+0x16/0x20
 ? smp_call_function_many_cond+0x1d8/0xbb0
 ? __pfx_do_kernel_range_flush+0x10/0x10
 on_each_cpu_cond_mask+0x20/0x40
 flush_tlb_kernel_range+0x19b/0x250
 ? srso_return_thunk+0x5/0x5f
 ? kasan_release_vmalloc+0xa7/0xc0
 purge_vmap_node+0x357/0x820
 ? __pfx_purge_vmap_node+0x10/0x10
 __purge_vmap_area_lazy+0x5b8/0xa10
 drain_vmap_area_work+0x21/0x30
 process_one_work+0x661/0x10b0
 worker_thread+0x844/0x10e0
 ? srso_return_thunk+0x5/0x5f
 ? __kthread_parkme+0x82/0x140
 ? __pfx_worker_thread+0x10/0x10
 kthread+0x2a5/0x370
 ? __pfx_kthread+0x10/0x10
 ret_from_fork+0x30/0x70
 ? __pfx_kthread+0x10/0x10
 ret_from_fork_asm+0x1a/0x30
 </TASK>

Debugging Analysis:

  1. The following ftrace log shows that the lockup CPU spends too much time iterating vmap_nodes and flushing TLB when purging vm_area structures. (Some info is trimmed).

    kworker: funcgraph_entry:              |  drain_vmap_area_work() {
    kworker: funcgraph_entry:              |    mutex_lock() {
    kworker: funcgraph_entry:  1.092 us    |      __cond_resched();
    kworker: funcgraph_exit:   3.306 us    |    }
    ... ...
    kworker: funcgraph_entry:              |    flush_tlb_kernel_range() {
    ... ...
    kworker: funcgraph_exit: # 7533.649 us |    }
    ... ...
    kworker: funcgraph_entry:  2.344 us    |    mutex_unlock();
    kworker: funcgraph_exit: $ 23871554 us |  }

    The drain_vmap_area_work() spends over 23 seconds.

    There are 2805 flush_tlb_kernel_range() calls in the ftrace log.

    • One is called in __purge_vmap_area_lazy().
    • Others are called by purge_vmap_node() -> kasan_release_vmalloc(). purge_vmap_node() iteratively releases kasan vmalloc allocations and flushes TLB for each vmap_area.
      • [Rough calculation] Each flush_tlb_kernel_range() runs about 7.5ms. -- 2804 * 7.5ms = 21.03 seconds. -- That's why a soft lockup is triggered.
  2. Extending the soft lockup time can work around the issue (For example, # echo ---truncated---
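Conceptually, the fix named in the title replaces the per-vmap_area TLB flush performed via purge_vmap_node() -> kasan_release_vmalloc() with a single flush covering the union of all purged ranges, turning roughly 2805 expensive cross-CPU flushes into one. Below is a minimal, self-contained C sketch of that pattern only; the struct fields and the flush_tlb_kernel_range() stub are simplified stand-ins, not the actual kernel patch:

    #include <stdio.h>

    struct vmap_area {                  /* simplified stand-in */
        unsigned long va_start;
        unsigned long va_end;
        struct vmap_area *next;
    };

    /* Stub: in the kernel this flushes on other CPUs too; the report
     * above measures roughly 7.5 ms per call on a 256-core machine. */
    static void flush_tlb_kernel_range(unsigned long start, unsigned long end)
    {
        printf("flush TLB [%#lx, %#lx)\n", start, end);
    }

    /* Old pattern: one TLB flush per purged area (2804 flushes above). */
    static void purge_flush_per_area(struct vmap_area *list)
    {
        for (struct vmap_area *va = list; va; va = va->next)
            flush_tlb_kernel_range(va->va_start, va->va_end);
    }

    /* Fixed pattern: accumulate the union of all ranges, flush once. */
    static void purge_flush_combined(struct vmap_area *list)
    {
        unsigned long start = ~0UL, end = 0;

        for (struct vmap_area *va = list; va; va = va->next) {
            if (va->va_start < start)
                start = va->va_start;
            if (va->va_end > end)
                end = va->va_end;
        }
        if (end > start)
            flush_tlb_kernel_range(start, end);
    }

    int main(void)
    {
        struct vmap_area b = { 0xffff8000UL, 0xffff9000UL, NULL };
        struct vmap_area a = { 0xffff1000UL, 0xffff2000UL, &b };

        purge_flush_per_area(&a);       /* two separate flushes */
        purge_flush_combined(&a);       /* one covering flush */
        return 0;
    }

The single covering flush may invalidate a wider span than strictly necessary, but it avoids thousands of serialized cross-CPU flush operations, which is what kept the worker CPU busy past the watchdog threshold.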

References

Affected packages

Debian:13 / linux

Package

Name
linux
Purl
pkg:deb/debian/linux?arch=source

Affected ranges

Type
ECOSYSTEM
Events
Introduced
0 (unknown introduced version / all previous versions are affected)
Fixed
6.12.5-1

Affected versions

6.*

6.1.27-1
6.1.37-1
6.1.38-1
6.1.38-2~bpo11+1
6.1.38-2
6.1.38-3
6.1.38-4~bpo11+1
6.1.38-4
6.1.52-1
6.1.55-1~bpo11+1
6.1.55-1
6.1.64-1
6.1.66-1
6.1.67-1
6.1.69-1~bpo11+1
6.1.69-1
6.1.76-1~bpo11+1
6.1.76-1
6.1.82-1
6.1.85-1
6.1.90-1~bpo11+1
6.1.90-1
6.1.94-1~bpo11+1
6.1.94-1
6.1.98-1
6.1.99-1
6.1.106-1
6.1.106-2
6.1.106-3
6.1.112-1
6.1.115-1
6.1.119-1
6.1.123-1
6.3.1-1~exp1
6.3.2-1~exp1
6.3.4-1~exp1
6.3.5-1~exp1
6.3.7-1~bpo12+1
6.3.7-1
6.3.11-1
6.4~rc6-1~exp1
6.4~rc7-1~exp1
6.4.1-1~exp1
6.4.4-1~bpo12+1
6.4.4-1
6.4.4-2
6.4.4-3~bpo12+1
6.4.4-3
6.4.11-1
6.4.13-1
6.5~rc4-1~exp1
6.5~rc6-1~exp1
6.5~rc7-1~exp1
6.5.1-1~exp1
6.5.3-1~bpo12+1
6.5.3-1
6.5.6-1
6.5.8-1
6.5.10-1~bpo12+1
6.5.10-1
6.5.13-1
6.6.3-1~exp1
6.6.4-1~exp1
6.6.7-1~exp1
6.6.8-1
6.6.9-1
6.6.11-1
6.6.13-1~bpo12+1
6.6.13-1
6.6.15-1
6.6.15-2
6.7-1~exp1
6.7.1-1~exp1
6.7.4-1~exp1
6.7.7-1
6.7.9-1
6.7.9-2
6.7.12-1~bpo12+1
6.7.12-1
6.8.9-1
6.8.11-1
6.8.12-1~bpo12+1
6.8.12-1
6.9.2-1~exp1
6.9.7-1~bpo12+1
6.9.7-1
6.9.8-1
6.9.9-1
6.9.10-1~bpo12+1
6.9.10-1
6.9.11-1
6.9.12-1
6.10-1~exp1
6.10.1-1~exp1
6.10.3-1
6.10.4-1
6.10.6-1~bpo12+1
6.10.6-1
6.10.7-1
6.10.9-1
6.10.11-1~bpo12+1
6.10.11-1
6.10.12-1
6.11~rc4-1~exp1
6.11~rc5-1~exp1
6.11-1~exp1
6.11.2-1
6.11.4-1
6.11.5-1~bpo12+1
6.11.5-1
6.11.6-1
6.11.7-1
6.11.9-1
6.11.10-1~bpo12+1
6.11.10-1
6.12~rc6-1~exp1
6.12.3-1

Ecosystem specific

{
    "urgency": "not yet assigned"
}