In the Linux kernel, the following vulnerability has been resolved:
mm/hugetlb: fix DEBUG_LOCKS_WARN_ON(1) when dissolve_free_hugetlb_folio()
When I did memory failure tests recently, the warning below occurred:
DEBUG_LOCKS_WARN_ON(1)
WARNING: CPU: 8 PID: 1011 at kernel/locking/lockdep.c:232 __lock_acquire+0xccb/0x1ca0
Modules linked in: mce_inject hwpoison_inject
CPU: 8 PID: 1011 Comm: bash Kdump: loaded Not tainted 6.9.0-rc3-next-20240410-00012-gdb69f219f4be #3
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
RIP: 0010:__lock_acquire+0xccb/0x1ca0
RSP: 0018:ffffa7a1c7fe3bd0 EFLAGS: 00000082
RAX: 0000000000000000 RBX: eb851eb853975fcf RCX: ffffa1ce5fc1c9c8
RDX: 00000000ffffffd8 RSI: 0000000000000027 RDI: ffffa1ce5fc1c9c0
RBP: ffffa1c6865d3280 R08: ffffffffb0f570a8 R09: 0000000000009ffb
R10: 0000000000000286 R11: ffffffffb0f2ad50 R12: ffffa1c6865d3d10
R13: ffffa1c6865d3c70 R14: 0000000000000000 R15: 0000000000000004
FS: 00007ff9f32aa740(0000) GS:ffffa1ce5fc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ff9f3134ba0 CR3: 00000008484e4000 CR4: 00000000000006f0
Call Trace:
 <TASK>
 lock_acquire+0xbe/0x2d0
 _raw_spin_lock_irqsave+0x3a/0x60
 hugepage_subpool_put_pages.part.0+0xe/0xc0
 free_huge_folio+0x253/0x3f0
 dissolve_free_huge_page+0x147/0x210
 __page_handle_poison+0x9/0x70
 memory_failure+0x4e6/0x8c0
 hard_offline_page_store+0x55/0xa0
 kernfs_fop_write_iter+0x12c/0x1d0
 vfs_write+0x380/0x540
 ksys_write+0x64/0xe0
 do_syscall_64+0xbc/0x1d0
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7ff9f3114887
RSP: 002b:00007ffecbacb458 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 000000000000000c RCX: 00007ff9f3114887
RDX: 000000000000000c RSI: 0000564494164e10 RDI: 0000000000000001
RBP: 0000564494164e10 R08: 00007ff9f31d1460 R09: 000000007fffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000000c
R13: 00007ff9f321b780 R14: 00007ff9f3217600 R15: 00007ff9f3216a00
 </TASK>
Kernel panic - not syncing: kernel: panic_on_warn set ...
CPU: 8 PID: 1011 Comm: bash Kdump: loaded Not tainted 6.9.0-rc3-next-20240410-00012-gdb69f219f4be #3
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
Call Trace:
 <TASK>
 panic+0x326/0x350
 check_panic_on_warn+0x4f/0x50
 __warn+0x98/0x190
 report_bug+0x18e/0x1a0
 handle_bug+0x3d/0x70
 exc_invalid_op+0x18/0x70
 asm_exc_invalid_op+0x1a/0x20
RIP: 0010:__lock_acquire+0xccb/0x1ca0
RSP: 0018:ffffa7a1c7fe3bd0 EFLAGS: 00000082
RAX: 0000000000000000 RBX: eb851eb853975fcf RCX: ffffa1ce5fc1c9c8
RDX: 00000000ffffffd8 RSI: 0000000000000027 RDI: ffffa1ce5fc1c9c0
RBP: ffffa1c6865d3280 R08: ffffffffb0f570a8 R09: 0000000000009ffb
R10: 0000000000000286 R11: ffffffffb0f2ad50 R12: ffffa1c6865d3d10
R13: ffffa1c6865d3c70 R14: 0000000000000000 R15: 0000000000000004
 lock_acquire+0xbe/0x2d0
 _raw_spin_lock_irqsave+0x3a/0x60
 hugepage_subpool_put_pages.part.0+0xe/0xc0
 free_huge_folio+0x253/0x3f0
 dissolve_free_huge_page+0x147/0x210
 __page_handle_poison+0x9/0x70
 memory_failure+0x4e6/0x8c0
 hard_offline_page_store+0x55/0xa0
 kernfs_fop_write_iter+0x12c/0x1d0
 vfs_write+0x380/0x540
 ksys_write+0x64/0xe0
 do_syscall_64+0xbc/0x1d0
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7ff9f3114887
RSP: 002b:00007ffecbacb458 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 000000000000000c RCX: 00007ff9f3114887
RDX: 000000000000000c RSI: 0000564494164e10 RDI: 0000000000000001
RBP: 0000564494164e10 R08: 00007ff9f31d1460 R09: 000000007fffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000000c
R13: 00007ff9f321b780 R14: 00007ff9f3217600 R15: 00007ff9f3216a00
 </TASK>
After git bisecting and digging into the code, I believe the root cause is that the _deferred_list field of folio is unioned with the _hugetlb_subpool field. In __update_and_free_hugetlb_folio(), folio->_deferred_ ---truncated---
{ "vanir_signatures": [ { "signature_type": "Function", "target": { "file": "mm/hugetlb.c", "function": "__update_and_free_hugetlb_folio" }, "signature_version": "v1", "digest": { "length": 712.0, "function_hash": "333025752880396055842890017321120024593" }, "deprecated": false, "source": "https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git@52ccdde16b6540abe43b6f8d8e1e1ec90b0983af", "id": "CVE-2024-36028-53391516" }, { "signature_type": "Line", "target": { "file": "mm/hugetlb.c" }, "signature_version": "v1", "digest": { "threshold": 0.9, "line_hashes": [ "146018048214871711061162910528092636583", "6607988696898792625685197412827385831", "89300256452544786786881335850661562399", "260675871965569803042309147070522461305" ] }, "deprecated": false, "source": "https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git@52ccdde16b6540abe43b6f8d8e1e1ec90b0983af", "id": "CVE-2024-36028-622b385e" }, { "signature_type": "Function", "target": { "file": "mm/hugetlb.c", "function": "__update_and_free_page" }, "signature_version": "v1", "digest": { "length": 957.0, "function_hash": "108603060366271541734070848448006371208" }, "deprecated": false, "source": "https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git@2effe407f7563add41750fd7e03da4ea44b98099", "id": "CVE-2024-36028-6fcecc6f" }, { "signature_type": "Line", "target": { "file": "mm/hugetlb.c" }, "signature_version": "v1", "digest": { "threshold": 0.9, "line_hashes": [ "325441486145398697534790675620706498282", "67754028127153476470690840261070525068", "233667196348677319187630982973628860206", "215026488886735010636164881449839862839", "146018048214871711061162910528092636583", "6607988696898792625685197412827385831", "89300256452544786786881335850661562399", "260675871965569803042309147070522461305" ] }, "deprecated": false, "source": "https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git@7e0a322877416e8c648819a8e441cf8c790b2cce", "id": "CVE-2024-36028-83f6b7ff" }, { 
"signature_type": "Line", "target": { "file": "mm/hugetlb.c" }, "signature_version": "v1", "digest": { "threshold": 0.9, "line_hashes": [ "107310032164695039228799023936850195519", "325807769359430528168241852701049719784", "154122901995912813967294461790983492187", "67632216492141358830718313462474466296", "287505325476905360547226762678386779385", "141888473021907286922153021787905640669", "110298229464436256543570345242582717788", "8319857141546755278024379513635229560" ] }, "deprecated": false, "source": "https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git@2effe407f7563add41750fd7e03da4ea44b98099", "id": "CVE-2024-36028-8a39c199" }, { "signature_type": "Function", "target": { "file": "mm/hugetlb.c", "function": "__update_and_free_hugetlb_folio" }, "signature_version": "v1", "digest": { "length": 712.0, "function_hash": "333025752880396055842890017321120024593" }, "deprecated": false, "source": "https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git@9c9b32d46afab2d911897914181c488954012300", "id": "CVE-2024-36028-b900f9c2" }, { "signature_type": "Function", "target": { "file": "mm/hugetlb.c", "function": "__update_and_free_hugetlb_folio" }, "signature_version": "v1", "digest": { "length": 715.0, "function_hash": "55237077046568452668053090346120615537" }, "deprecated": false, "source": "https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git@7e0a322877416e8c648819a8e441cf8c790b2cce", "id": "CVE-2024-36028-d6b84ecc" }, { "signature_type": "Line", "target": { "file": "mm/hugetlb.c" }, "signature_version": "v1", "digest": { "threshold": 0.9, "line_hashes": [ "146018048214871711061162910528092636583", "6607988696898792625685197412827385831", "89300256452544786786881335850661562399", "260675871965569803042309147070522461305" ] }, "deprecated": false, "source": "https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git@9c9b32d46afab2d911897914181c488954012300", "id": "CVE-2024-36028-fcf64f09" } ] }