In the Linux kernel, the following vulnerability has been resolved:
KVM: SVM: Reject SEV{-ES} intra host migration if vCPU creation is in-flight
Reject migration of SEV{-ES} state if either the source or destination VM is actively creating a vCPU, i.e. if kvm_vm_ioctl_create_vcpu() is in the section between incrementing created_vcpus and online_vcpus. The bulk of vCPU creation runs outside of kvm->lock to allow creating multiple vCPUs in parallel, and so sev_info.es_active can get toggled from false=>true in the destination VM after (or during) svm_vcpu_create(), resulting in an SEV{-ES} VM effectively having a non-SEV{-ES} vCPU.
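The check described above can be illustrated with a minimal userspace sketch. This is not the kernel patch: struct vm_model and the helper names are hypothetical, the counters are plain ints rather than the kernel's real types, and returning -EBUSY is an assumption of this sketch. It models only the idea that a VM whose created and online vCPU counts differ has a vCPU creation in flight, and that migration should be refused if that is true of either VM.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical userspace model of the two counters named in the text. */
struct vm_model {
	int created_vcpus; /* incremented early, before the bulk of vCPU creation */
	int online_vcpus;  /* incremented only once the vCPU is fully created */
};

/* Nonzero while at least one vCPU creation sits between the two increments. */
static int vcpu_creation_in_flight(const struct vm_model *vm)
{
	return vm->created_vcpus != vm->online_vcpus;
}

/*
 * Sketch of the migration pre-check: refuse if either side is mid-creation.
 * The -EBUSY return value is an assumption of this sketch.
 */
static int check_migration_allowed(const struct vm_model *src,
				   const struct vm_model *dst)
{
	if (vcpu_creation_in_flight(src) || vcpu_creation_in_flight(dst))
		return -EBUSY;
	return 0;
}

/* Small self-check covering the quiescent and in-flight cases. */
static void check_migration_examples(void)
{
	struct vm_model idle = { .created_vcpus = 2, .online_vcpus = 2 };
	struct vm_model busy = { .created_vcpus = 3, .online_vcpus = 2 };

	assert(check_migration_allowed(&idle, &idle) == 0);
	assert(check_migration_allowed(&idle, &busy) == -EBUSY);
	assert(check_migration_allowed(&busy, &idle) == -EBUSY);
}
```

In the kernel such a comparison is only meaningful while the relevant locks are held, so that the counters cannot change underneath the check. The sketch leaves locking out entirely.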
The issue manifests most visibly as a crash when trying to free a vCPU's NULL VMSA page in an SEV-ES VM, but any number of things can go wrong.
BUG: unable to handle page fault for address: ffffebde00000000
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: Oops: 0000 [#1] SMP KASAN NOPTI
CPU: 227 UID: 0 PID: 64063 Comm: syz.5.60023 Tainted: G U O 6.15.0-smp-DEV #2 NONE
Tainted: [U]=USER, [O]=OOT_MODULE
Hardware name: Google, Inc. ArcadiaIT80/ArcadiaIT80, BIOS 12.52.0-0 10/28/2024
RIP: 0010:constant_test_bit arch/x86/include/asm/bitops.h:206 [inline]
RIP: 0010:arch_test_bit arch/x86/include/asm/bitops.h:238 [inline]
RIP: 0010:test_bit include/asm-generic/bitops/instrumented-non-atomic.h:142 [inline]
RIP: 0010:PageHead include/linux/page-flags.h:866 [inline]
RIP: 0010:___free_pages+0x3e/0x120 mm/page_alloc.c:5067
Code: <49> f7 06 40 00 00 00 75 05 45 31 ff eb 0c 66 90 4c 89 f0 4c 39 f0
RSP: 0018:ffff8984551978d0 EFLAGS: 00010246
RAX: 0000777f80000001 RBX: 0000000000000000 RCX: ffffffff918aeb98
RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffffebde00000000
RBP: 0000000000000000 R08: ffffebde00000007 R09: 1ffffd7bc0000000
R10: dffffc0000000000 R11: fffff97bc0000001 R12: dffffc0000000000
R13: ffff8983e19751a8 R14: ffffebde00000000 R15: 1ffffd7bc0000000
FS: 0000000000000000(0000) GS:ffff89ee661d3000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffebde00000000 CR3: 000000793ceaa000 CR4: 0000000000350ef0
DR0: 0000000000000000 DR1: 0000000000000b5f DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
sev_free_vcpu+0x413/0x630 arch/x86/kvm/svm/sev.c:3169
svm_vcpu_free+0x13a/0x2a0 arch/x86/kvm/svm/svm.c:1515
kvm_arch_vcpu_destroy+0x6a/0x1d0 arch/x86/kvm/x86.c:12396
kvm_vcpu_destroy virt/kvm/kvm_main.c:470 [inline]
kvm_destroy_vcpus+0xd1/0x300 virt/kvm/kvm_main.c:490
kvm_arch_destroy_vm+0x636/0x820 arch/x86/kvm/x86.c:12895
kvm_put_kvm+0xb8e/0xfb0 virt/kvm/kvm_main.c:1310
kvm_vm_release+0x48/0x60 virt/kvm/kvm_main.c:1369
__fput+0x3e4/0x9e0 fs/file_table.c:465
task_work_run+0x1a9/0x220 kernel/task_work.c:227
exit_task_work include/linux/task_work.h:40 [inline]
do_exit+0x7f0/0x25b0 kernel/exit.c:953
do_group_exit+0x203/0x2d0 kernel/exit.c:1102
get_signal+0x1357/0x1480 kernel/signal.c:3034
arch_do_signal_or_restart+0x40/0x690 arch/x86/kernel/signal.c:337
exit_to_user_mode_loop kernel/entry/common.c:111 [inline]
exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline]
__syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline]
syscall_exit_to_user_mode+0x67/0xb0 kernel/entry/common.c:218
do_syscall_64+0x7c/0x150 arch/x86/entry/syscall_64.c:100
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7f87a898e969
</TASK>
Modules linked in: gq(O)
gsmi: Log Shutdown Reason 0x03
CR2: ffffebde00000000
---[ end trace 0000000000000000 ]---
Deliberately don't check for a NULL VMSA when freeing the vCPU, as crashing the host is likely desirable due to the VMSA being consumed by hardware. E.g. if KVM manages to allow VMRUN on the vCPU, hardware may read/write a bogus VMSA page. Accessing P ---truncated---