CVE-2024-57975

BTRFS error (device dm-3): cow_file_range failed, start 1146880 end 1253375 len 106496 ret -28
BTRFS error (device dm-3): run_delalloc_nocow failed, start 1146880 end 1253375 len 106496 ret -28
page: refcount:4 mapcount:0 mapping:00000000592787cc index:0x12 pfn:0x10664
aops:btrfs_aops [btrfs] ino:101 dentry name(?):"f1774"
flags: 0x2fffff80004028(uptodate|lru|private|node=0|zone=2|lastcpupid=0xfffff)
page dumped because: VM_BUG_ON_FOLIO(!folio_test_locked(folio))
------------[ cut here ]------------
kernel BUG at mm/page-writeback.c:2992!
Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
CPU: 2 UID: 0 PID: 3943513 Comm: kworker/u24:15 Tainted: G OE 6.12.0-rc7-custom+ #87
Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
Hardware name: QEMU KVM Virtual Machine, BIOS unknown 2/2/2022
Workqueue: events_unbound btrfs_async_reclaim_data_space [btrfs]
pc : folio_clear_dirty_for_io+0x128/0x258
lr : folio_clear_dirty_for_io+0x128/0x258
Call trace:
 folio_clear_dirty_for_io+0x128/0x258
 btrfs_folio_clamp_clear_dirty+0x80/0xd0 [btrfs]
 __process_folios_contig+0x154/0x268 [btrfs]
 extent_clear_unlock_delalloc+0x5c/0x80 [btrfs]
 run_delalloc_nocow+0x5f8/0x760 [btrfs]
 btrfs_run_delalloc_range+0xa8/0x220 [btrfs]
 writepage_delalloc+0x230/0x4c8 [btrfs]
 extent_writepage+0xb8/0x358 [btrfs]
 extent_write_cache_pages+0x21c/0x4e8 [btrfs]
 btrfs_writepages+0x94/0x150 [btrfs]
 do_writepages+0x74/0x190
 filemap_fdatawrite_wbc+0x88/0xc8
 start_delalloc_inodes+0x178/0x3a8 [btrfs]
 btrfs_start_delalloc_roots+0x174/0x280 [btrfs]
 shrink_delalloc+0x114/0x280 [btrfs]
 flush_space+0x250/0x2f8 [btrfs]
 btrfs_async_reclaim_data_space+0x180/0x228 [btrfs]
 process_one_work+0x164/0x408
 worker_thread+0x25c/0x388
 kthread+0x100/0x118
 ret_from_fork+0x10/0x20
Code: 910a8021 a90363f7 a9046bf9 94012379 (d4210000)
---[ end trace 0000000000000000 ]---
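The two BTRFS error lines at the top of the log carry the failing byte range and a negative errno. As a reading aid, here is a minimal Python sketch that decodes one such line. The regular expression and the `decode` helper are illustrative assumptions, not part of any btrfs tooling, and the function name in the sample line is spelled as in the kernel source.

```python
import errno
import re

# First error line from the log above.
LOG_LINE = ("BTRFS error (device dm-3): cow_file_range failed, "
            "start 1146880 end 1253375 len 106496 ret -28")

# Hypothetical pattern for this one message shape only.
PATTERN = re.compile(
    r"(?P<func>\w+) failed, start (?P<start>\d+) end (?P<end>\d+) "
    r"len (?P<len>\d+) ret (?P<ret>-?\d+)"
)

def decode(line):
    """Extract the function name, byte range and errno name from a failure line."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("not a recognised failure line")
    start, end, length, ret = (int(m[k]) for k in ("start", "end", "len", "ret"))
    return {
        "func": m["func"],
        "start": start,
        "end": end,
        "len": length,
        # The logged end is inclusive, so len should equal end - start + 1.
        "inclusive_end_matches": end - start + 1 == length,
        "blocks_4k": length // 4096,
        # The kernel logs -errno; look up the symbolic name.
        "errno_name": errno.errorcode.get(-ret, "unknown"),
    }

info = decode(LOG_LINE)
print(info)
```

Read this way, `ret -28` is `-ENOSPC`: the COW allocation ran out of space, which is what sends `run_delalloc_nocow()` into its error handling path.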

[CAUSE]
The first two lines of extra debug messages show the problem is caused by the error handling of run_delalloc_nocow().

E.g. we have the following dirtied range (4K block size, 4K page size):

0                 16K                  32K
|//////////////////////////////////////|
|  Pre-allocated  |

And the range [0, 16K) has a preallocated extent.

1) Enter run_delalloc_nocow() for range [0, 16K)
   It finds that range [0, 16K) is preallocated, so it can do a proper NOCOW write.

2) Enter fallback_to_cow() for range [16K, 32K)
   Since the range [16K, 32K) is not backed by a preallocated extent, we have to go COW.

3) cow_file_range() failed for range [16K, 32K)
   So cow_file_range() does the cleanup by clearing folio dirty and unlocking the folios.

   Now the folios in range [16K, 32K) are unlocked.

4) Enter extent_clear_unlock_delalloc() from run_delalloc_nocow()
   It is called with PAGE_START_WRITEBACK to start page writeback. But a folio can only be marked writeback while it is properly locked, so this triggers the VM_BUG_ON_FOLIO().
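The sequence above can be replayed in a toy Python model. This is only a sketch of the lock-state reasoning, not kernel code: the `Folio` class and the `toy_*` functions are hypothetical stand-ins, and the model assumes every folio in [0, 32K) starts out locked and dirty.

```python
BLOCK = 4096

class Folio:
    def __init__(self, index):
        self.index = index
        self.locked = True    # assumption: the writeback path holds the lock
        self.dirty = True
        self.writeback = False

    def start_writeback(self):
        # Stand-in for the VM_BUG_ON_FOLIO(!folio_test_locked(folio)) check.
        if not self.locked:
            raise AssertionError(f"folio {self.index} not locked")
        self.writeback = True

def toy_cow_file_range_fail(folios, start, end):
    """Failed COW: its own cleanup clears dirty and unlocks the folios."""
    for f in folios[start // BLOCK:end // BLOCK]:
        f.dirty = False
        f.locked = False

def toy_error_path(folios, start, end):
    """Buggy caller cleanup: starts writeback on the whole range."""
    for f in folios[start // BLOCK:end // BLOCK]:
        f.start_writeback()

folios = [Folio(i) for i in range(32 * 1024 // BLOCK)]   # covers [0, 32K)

toy_cow_file_range_fail(folios, 16 * 1024, 32 * 1024)    # step 3
try:
    toy_error_path(folios, 0, 32 * 1024)                  # step 4
    crashed_at = None
except AssertionError as exc:
    crashed_at = str(exc)

print(crashed_at)
```

In this model the first four folios, covering [0, 16K), enter writeback without trouble, and the check fires on the first folio of the range that the failed COW step had already unlocked.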

Furthermore, there is another hidden but common bug: run_delalloc_nocow() does not clear the folio dirty flags in its error handling path. This bug is shared between run_delalloc_nocow() and cow_file_range().

[FIX]
- Clear folio dirty for range [@start, @cur_offset)
  Introduce a helper, cleanup_dirty_folios(), which will find and lock the folios in the range, clear the dirty flag and start/end the writeback, with extra handling for the @locked_folio.

- Introduce a helper to clear folio dirty, start and end writeback

- Introduce a helper to record the last failed COW range end
  This is to track which range we should skip, to avoid double unlocking.

- Skip the failed COW range for the e ---truncated---
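The fully stated parts of the fix can be sketched in a toy Python model: clean up [@start, @cur_offset) under the folio lock, and skip the recorded failed COW range. Everything here is a hypothetical illustration rather than the kernel implementation. In particular, it is an assumption of this sketch that the caller's already-locked folio stays locked, and handling of anything past the failed range is left out.

```python
from dataclasses import dataclass

BLOCK = 4096

@dataclass
class Folio:
    index: int
    locked: bool = False
    dirty: bool = True
    writeback: bool = False

def finish_io(folio):
    """Clear dirty, then start and immediately end writeback (needs the lock)."""
    assert folio.locked, f"folio {folio.index} not locked"
    folio.dirty = False
    folio.writeback = True
    folio.writeback = False

def toy_cleanup_dirty_folios(folios, locked_folio, start, end):
    """Toy cleanup for the half-open byte range [start, end)."""
    for folio in folios[start // BLOCK:end // BLOCK]:
        if folio is locked_folio:
            # Assumption: the caller already holds this lock and keeps it.
            finish_io(folio)
            continue
        folio.locked = True       # find and lock
        finish_io(folio)
        folio.locked = False      # unlock again

folios = [Folio(i) for i in range(8)]   # covers [0, 32K)
locked_folio = folios[0]
locked_folio.locked = True

start, cur_offset = 0, 16 * 1024
last_failed_cow_end = 32 * 1024         # recorded when the toy COW step failed

# The failed COW range already cleaned itself up: not dirty, unlocked.
for f in folios[cur_offset // BLOCK:last_failed_cow_end // BLOCK]:
    f.dirty = False

toy_cleanup_dirty_folios(folios, locked_folio, start, cur_offset)

# Skip [cur_offset, last_failed_cow_end) so those folios are not unlocked twice.
resume_from = max(cur_offset, last_failed_cow_end)
print(resume_from)
```

The point of the model is that no folio is touched without holding its lock, and the range that the failed COW step already unlocked is never unlocked a second time.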

Database specific

{
    "osv_generated_from": "https://github.com/CVEProject/cvelistV5/tree/main/cves/2024/57xxx/CVE-2024-57975.json",
    "cna_assigner": "Linux"
}

References

Affected packages

Git / git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git

Affected ranges

Type: GIT
Repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
Events:
Introduced: 17ca04aff7e6171df684b7b65804df8830eb8c15
Fixed: 5ae72abbf91eb172ce3a838a4dc34be3c9707296
Fixed: 2434533f1c963e7317c45880c98287e5bed98325
Fixed: c2b47df81c8e20a8e8cd94f0d7df211137ae94ed

Database specific

source: "https://storage.googleapis.com/osv-test-cve-osv-conversion/osv-output/CVE-2024-57975.json"

Linux / Kernel

Package

Name: Kernel

Affected ranges

Type: ECOSYSTEM
Events:
Introduced: 3.5.0
Fixed: 6.12.13

Type: ECOSYSTEM
Events:
Introduced: 6.13.0
Fixed: 6.13.2
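The two ECOSYSTEM ranges above can be read as half-open intervals: a version is affected from "Introduced" up to, but not including, "Fixed". A minimal Python sketch of that check follows. The `parse` and `is_affected` helpers are hypothetical, they handle only plain dotted numeric versions (no `-rc` or distribution suffixes), and a vendor kernel with a backported fix may be safe even when its version string falls inside a range.

```python
# Affected ranges as listed above: (introduced, fixed), fixed is exclusive.
AFFECTED = [("3.5.0", "6.12.13"), ("6.13.0", "6.13.2")]

def parse(version):
    """Turn '6.12.13' into (6, 12, 13) for tuple comparison."""
    return tuple(int(part) for part in version.split("."))

def is_affected(version):
    """True if the version falls inside any listed [introduced, fixed) range."""
    v = parse(version)
    return any(parse(lo) <= v < parse(hi) for lo, hi in AFFECTED)

for candidate in ("6.12.12", "6.12.13", "6.13.1", "6.13.2"):
    print(candidate, is_affected(candidate))
```

Tuple comparison keeps `6.12.13` ordered after `6.12.2`, which a plain string comparison would get wrong.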

Database specific

source: "https://storage.googleapis.com/osv-test-cve-osv-conversion/osv-output/CVE-2024-57975.json"