CVE-2025-38520

Source
https://nvd.nist.gov/vuln/detail/CVE-2025-38520
Import Source
https://storage.googleapis.com/osv-test-cve-osv-conversion/osv-output/CVE-2025-38520.json
JSON Data
https://api.test.osv.dev/v1/vulns/CVE-2025-38520
Downstream
Related
Published
2025-08-16T11:15:45Z
Modified
2025-09-06T13:01:25Z
Summary
[none]
Details

In the Linux kernel, the following vulnerability has been resolved:

drm/amdkfd: Don't call mmput from MMU notifier callback

If the process is exiting, an mmput() inside the MMU notifier callback, invoked from compaction, fork, or NUMA balancing, can release the last reference to the mm struct and so call exit_mmap() and free_pgtables(). This triggers a deadlock, shown in the backtrace below.

The deadlock leaks the KFD process, because the MMU notifier release callback is never called, and as a result also leaks VRAM.

The fix is to take an mm reference with mmget_not_zero() when adding a prange to the deferred list, pairing it with the mmput() in the deferred list work.

If a prange is split and added to the pchild list, the pchild's work_item.mm is not used, so the mm parameter is removed from svm_range_unmap_split() and svm_range_add_child().
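
The reference pairing described in the fix can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the amdgpu code: every model_* name is hypothetical, the mm is reduced to a user count and a flag, and the "teardown" that would be exit_mmap() in the kernel is just a boolean. It shows the queueing side pinning the mm only while its count is nonzero, and the deferred worker dropping that reference, so the final drop never happens on the notifier path.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mm_struct: only the user count is modeled. */
struct model_mm {
    atomic_int users;   /* plays the role of the mm user count */
    bool torn_down;     /* set when the last reference is dropped */
};

/* Hypothetical stand-in for the deferred-list work item. */
struct model_work_item {
    struct model_mm *mm;
    bool queued;
};

/* Models mmget_not_zero(): take a reference only if the count is still nonzero. */
static bool model_mmget_not_zero(struct model_mm *mm)
{
    int old = atomic_load(&mm->users);

    while (old != 0) {
        /* On failure, 'old' is reloaded with the current value and we retry. */
        if (atomic_compare_exchange_weak(&mm->users, &old, old + 1))
            return true;
    }
    return false;
}

/* Models mmput(): drop a reference; the last drop performs the teardown. */
static void model_mmput(struct model_mm *mm)
{
    if (atomic_fetch_sub(&mm->users, 1) == 1)
        mm->torn_down = true;   /* the kernel would tear down the address space here */
}

/* Notifier side: pin the mm and queue the work. No put happens on this path. */
static bool model_notifier_queue(struct model_work_item *w, struct model_mm *mm)
{
    if (!model_mmget_not_zero(mm))
        return false;           /* mm already has no users; nothing to defer */
    w->mm = mm;
    w->queued = true;
    return true;
}

/* Worker side: handle the item, then drop the reference taken at queue time. */
static void model_deferred_work(struct model_work_item *w)
{
    if (!w->queued)
        return;
    assert(w->mm != NULL);
    model_mmput(w->mm);         /* a final drop, if any, happens in worker context */
    w->mm = NULL;
    w->queued = false;
}
```

In this model, if the owning process drops its own reference between model_notifier_queue() and model_deferred_work(), the teardown is performed by the worker rather than by the notifier path.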

The backtrace of hung task:

INFO: task python:348105 blocked for more than 64512 seconds.
Call Trace:
 __schedule+0x1c3/0x550
 schedule+0x46/0xb0
 rwsem_down_write_slowpath+0x24b/0x4c0
 unlink_anon_vmas+0xb1/0x1c0
 free_pgtables+0xa9/0x130
 exit_mmap+0xbc/0x1a0
 mmput+0x5a/0x140
 svm_range_cpu_invalidate_pagetables+0x2b/0x40 [amdgpu]
 mn_itree_invalidate+0x72/0xc0
 __mmu_notifier_invalidate_range_start+0x48/0x60
 try_to_unmap_one+0x10fa/0x1400
 rmap_walk_anon+0x196/0x460
 try_to_unmap+0xbb/0x210
 migrate_page_unmap+0x54d/0x7e0
 migrate_pages_batch+0x1c3/0xae0
 migrate_pages_sync+0x98/0x240
 migrate_pages+0x25c/0x520
 compact_zone+0x29d/0x590
 compact_zone_order+0xb6/0xf0
 try_to_compact_pages+0xbe/0x220
 __alloc_pages_direct_compact+0x96/0x1a0
 __alloc_pages_slowpath+0x410/0x930
 __alloc_pages_nodemask+0x3a9/0x3e0
 do_huge_pmd_anonymous_page+0xd7/0x3e0
 __handle_mm_fault+0x5e3/0x5f0
 handle_mm_fault+0xf7/0x2e0
 hmm_vma_fault.isra.0+0x4d/0xa0
 walk_pmd_range.isra.0+0xa8/0x310
 walk_pud_range+0x167/0x240
 walk_pgd_range+0x55/0x100
 __walk_page_range+0x87/0x90
 walk_page_range+0xf6/0x160
 hmm_range_fault+0x4f/0x90
 amdgpu_hmm_range_get_pages+0x123/0x230 [amdgpu]
 amdgpu_ttm_tt_get_user_pages+0xb1/0x150 [amdgpu]
 init_user_pages+0xb1/0x2a0 [amdgpu]
 amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x543/0x7d0 [amdgpu]
 kfd_ioctl_alloc_memory_of_gpu+0x24c/0x4e0 [amdgpu]
 kfd_ioctl+0x29d/0x500 [amdgpu]

(cherry picked from commit a29e067bd38946f752b0ef855f3dfff87e77bec7)

References

Affected packages