In the Linux kernel, the following vulnerability has been resolved:
btrfs: do not clear page dirty inside extent_write_locked_range()
[BUG] In the subpage + zoned case, the following workload can lead to a data rsv (reserved space) leak at unmount time:
  # mkfs.btrfs -f -s 4k $dev
  # mount $dev $mnt
  # fsstress -w -n 8 -d $mnt -s 1709539240
  0/0: fiemap - no filename
  0/1: copyrange read - no filename
  0/2: write - no filename
  0/3: rename - no source filename
  0/4: creat f0 x:0 0 0
  0/4: creat add id=0,parent=-1
  0/5: writev f0[259 1 0 0 0 0] [778052,113,965] 0
  0/6: ioctl(FIEMAP) f0[259 1 0 0 224 887097] [1294220,2291618343991484791,0x10000] -1
  0/7: dwrite - xfsctl(XFS_IOC_DIOINFO) f0[259 1 0 0 224 887097] return 25, fallback to stat()
  0/7: dwrite f0[259 1 0 0 224 887097] [696320,102400] 0
  # umount $mnt
The dmesg includes the following rsv leak detection warnings (all call traces skipped):
  ------------[ cut here ]------------
  WARNING: CPU: 2 PID: 4528 at fs/btrfs/inode.c:8653 btrfs_destroy_inode+0x1e0/0x200 [btrfs]
  ---[ end trace 0000000000000000 ]---
  ------------[ cut here ]------------
  WARNING: CPU: 2 PID: 4528 at fs/btrfs/inode.c:8654 btrfs_destroy_inode+0x1a8/0x200 [btrfs]
  ---[ end trace 0000000000000000 ]---
  ------------[ cut here ]------------
  WARNING: CPU: 2 PID: 4528 at fs/btrfs/inode.c:8660 btrfs_destroy_inode+0x1a0/0x200 [btrfs]
  ---[ end trace 0000000000000000 ]---
  BTRFS info (device sda): last unmount of filesystem 1b4abba9-de34-4f07-9e7f-157cf12a18d6
  ------------[ cut here ]------------
  WARNING: CPU: 3 PID: 4528 at fs/btrfs/block-group.c:4434 btrfs_free_block_groups+0x338/0x500 [btrfs]
  ---[ end trace 0000000000000000 ]---
  BTRFS info (device sda): space_info DATA has 268218368 free, is not full
  BTRFS info (device sda): space_info total=268435456, used=204800, pinned=0, reserved=0, may_use=12288, readonly=0 zone_unusable=0
  BTRFS info (device sda): global_block_rsv: size 0 reserved 0
  BTRFS info (device sda): trans_block_rsv: size 0 reserved 0
  BTRFS info (device sda): chunk_block_rsv: size 0 reserved 0
  BTRFS info (device sda): delayed_block_rsv: size 0 reserved 0
  BTRFS info (device sda): delayed_refs_rsv: size 0 reserved 0
  ------------[ cut here ]------------
  WARNING: CPU: 3 PID: 4528 at fs/btrfs/block-group.c:4434 btrfs_free_block_groups+0x338/0x500 [btrfs]
  ---[ end trace 0000000000000000 ]---
  BTRFS info (device sda): space_info METADATA has 267796480 free, is not full
  BTRFS info (device sda): space_info total=268435456, used=131072, pinned=0, reserved=0, may_use=262144, readonly=0 zone_unusable=245760
  BTRFS info (device sda): global_block_rsv: size 0 reserved 0
  BTRFS info (device sda): trans_block_rsv: size 0 reserved 0
  BTRFS info (device sda): chunk_block_rsv: size 0 reserved 0
  BTRFS info (device sda): delayed_block_rsv: size 0 reserved 0
  BTRFS info (device sda): delayed_refs_rsv: size 0 reserved 0
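The space_info lines in the dump above still report a non-zero may_use value at unmount, which appears to be what the warning flags as leaked reservation. A minimal sketch that pulls those values out of the two space_info lines and expresses the DATA value in 4K sectors (the helper name is hypothetical and not part of btrfs; the 4K sector size is taken from the mkfs.btrfs -s 4k option in the reproducer):

```python
import re

# The two space_info summary lines from the dmesg dump (DATA first, then METADATA).
LOG = """\
BTRFS info (device sda): space_info total=268435456, used=204800, pinned=0, reserved=0, may_use=12288, readonly=0 zone_unusable=0
BTRFS info (device sda): space_info total=268435456, used=131072, pinned=0, reserved=0, may_use=262144, readonly=0 zone_unusable=245760
"""

SECTOR_SIZE = 4096  # assumption: sector size from "mkfs.btrfs -s 4k"


def may_use_values(log):
    """Hypothetical helper: return every may_use value found in the log text."""
    # Accept "mayuse" as well, in case underscores were lost in transcription.
    return [int(m.group(1)) for m in re.finditer(r"may_?use=(\d+)", log)]


data_may_use, meta_may_use = may_use_values(LOG)
data_sectors = data_may_use // SECTOR_SIZE

print("DATA may_use:", data_may_use, "bytes =", data_sectors, "sectors")
print("METADATA may_use:", meta_may_use, "bytes")
```

This only reads the numbers already printed by the kernel; it does not model how btrfs accounts for reservations.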
Here $dev is a tcmu-runner emulated zoned HDD with a max zone append size of 64K, and the system uses a 64K page size.
[CAUSE] I have added several trace_printk() calls to show the events (header skipped):
  btrfs_dirty_pages: r/i=5/259 dirty start=774144 len=114688
  btrfs_dirty_pages: r/i=5/259 dirty part of page=720896 off_in_page=53248 len_in_page=12288
  btrfs_dirty_pages: r/i=5/259 dirty part of page=786432 off_in_page=0 len_in_page=65536
  btrfs_dirty_pages: r/i=5/259 dirty part of page=851968 off_in_page=0 len_in_page=36864
The above lines show our buffered write has dirtied 3 pages of inode 259 of root 5:
  704K              768K              832K              896K
  I            |////I/////////////////I///////////|     I
               756K                               868K

'|///|' is the dirtied range using subpage bitmaps, and 'I' is the page boundary.
Meanwhile all three pages (704K, 768K, 832K) have their PageDirty flag set.
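The per-page split shown in the trace can be reproduced with plain arithmetic. The following is a sketch only, assuming the 64K page size stated for the test system; the function name is hypothetical and this is not the kernel's implementation:

```python
PAGE_SIZE = 64 * 1024  # 64K pages, as stated for the test system


def split_by_page(start, length, page_size=PAGE_SIZE):
    """Yield (page_start, off_in_page, len_in_page) for each page a byte range touches."""
    end = start + length
    pos = start
    while pos < end:
        page_start = pos - pos % page_size           # round down to the page boundary
        chunk_end = min(end, page_start + page_size)  # stop at the page end or range end
        yield page_start, pos - page_start, chunk_end - pos
        pos = chunk_end


# Buffered write range from the first trace line: start=774144 len=114688
parts = list(split_by_page(774144, 114688))
for page, off, length in parts:
    print(f"page={page} off_in_page={off} len_in_page={length}")
```

The three tuples correspond to the pages at 704K, 768K and 832K, with the dirtied range running from 756K (774144) to 868K (888832).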
  btrfs_direct_write: r/i=5/259 start dio filepos=696320 len=102400
Then direct IO writ ---truncated---
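For orientation, the direct IO range from the trace (filepos=696320, len=102400) can be compared with the buffered dirty range (start=774144, len=114688). The sketch below is plain interval arithmetic on the traced values, with a hypothetical helper name; it makes no claim about how btrfs processes the overlapping region:

```python
def overlap(a_start, a_len, b_start, b_len):
    """Return (start, length) of the intersection of two byte ranges, or None if disjoint."""
    start = max(a_start, b_start)
    end = min(a_start + a_len, b_start + b_len)
    return (start, end - start) if start < end else None


buffered = (774144, 114688)  # from the dirty-pages trace
direct = (696320, 102400)    # from the direct-write trace

common = overlap(*buffered, *direct)
direct_end_k = (direct[0] + direct[1]) // 1024

print("overlap:", common)
print("direct write ends at", direct_end_k, "K")
```

The direct write covers 680K to 780K, so it shares the 24K region from 756K to 780K with the buffered dirty range.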