In the Linux kernel, the following vulnerability has been resolved:
fs: dlm: fix invalid derefence of sb_lvbptr
I hit a crash when putting a lksb (struct dlm_lksb) on the stack and leaving its sb_lvbptr field set to a dangling pointer while not using DLM_LKF_VALBLK. The kernel crashes with the following message; the dangling pointer here is 0xdeadbeef as an example:
[ 102.749317] BUG: unable to handle page fault for address: 00000000deadbeef
[ 102.749320] #PF: supervisor read access in kernel mode
[ 102.749323] #PF: error_code(0x0000) - not-present page
[ 102.749325] PGD 0 P4D 0
[ 102.749332] Oops: 0000 [#1] PREEMPT SMP PTI
[ 102.749336] CPU: 0 PID: 1567 Comm: lock_torture_wr Tainted: G W 5.19.0-rc3+ #1565
[ 102.749343] Hardware name: Red Hat KVM/RHEL-AV, BIOS 1.16.0-2.module+el8.7.0+15506+033991b0 04/01/2014
[ 102.749344] RIP: 0010:memcpy_erms+0x6/0x10
[ 102.749353] Code: cc cc cc cc eb 1e 0f 1f 00 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48 89 d1 <f3> a4 c3 0f 1f 80 00 00 00 00 48 89 f8 48 83 fa 20 72 7e 40 38 fe
[ 102.749355] RSP: 0018:ffff97a58145fd08 EFLAGS: 00010202
[ 102.749358] RAX: ffff901778b77070 RBX: 0000000000000000 RCX: 0000000000000040
[ 102.749360] RDX: 0000000000000040 RSI: 00000000deadbeef RDI: ffff901778b77070
[ 102.749362] RBP: ffff97a58145fd10 R08: ffff901760b67a70 R09: 0000000000000001
[ 102.749364] R10: ffff9017008e2cb8 R11: 0000000000000001 R12: ffff901760b67a70
[ 102.749366] R13: ffff901760b78f00 R14: 0000000000000003 R15: 0000000000000001
[ 102.749368] FS: 0000000000000000(0000) GS:ffff901876e00000(0000) knlGS:0000000000000000
[ 102.749372] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 102.749374] CR2: 00000000deadbeef CR3: 000000017c49a004 CR4: 0000000000770ef0
[ 102.749376] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 102.749378] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 102.749379] PKRU: 55555554
[ 102.749381] Call Trace:
[ 102.749382] <TASK>
[ 102.749383] ? send_args+0xb2/0xd0
[ 102.749389] send_common+0xb7/0xd0
[ 102.749395] _unlock_lock+0x2c/0x90
[ 102.749400] unlock_lock.isra.56+0x62/0xa0
[ 102.749405] dlm_unlock+0x21e/0x330
[ 102.749411] ? lock_torture_stats+0x80/0x80 [dlm_locktorture]
[ 102.749416] torture_unlock+0x5a/0x90 [dlm_locktorture]
[ 102.749419] ? preempt_count_sub+0xba/0x100
[ 102.749427] lock_torture_writer+0xbd/0x150 [dlm_locktorture]
[ 102.786186] kthread+0x10a/0x130
[ 102.786581] ? kthread_complete_and_exit+0x20/0x20
[ 102.787156] ret_from_fork+0x22/0x30
[ 102.787588] </TASK>
[ 102.787855] Modules linked in: dlm_locktorture torture rpcsec_gss_krb5 intel_rapl_msr intel_rapl_common kvm_intel iTCO_wdt iTCO_vendor_support kvm vmw_vsock_virtio_transport qxl irqbypass vmw_vsock_virtio_transport_common drm_ttm_helper crc32_pclmul joydev crc32c_intel ttm vsock virtio_scsi virtio_balloon snd_pcm drm_kms_helper virtio_console snd_timer snd drm soundcore syscopyarea i2c_i801 sysfillrect sysimgblt i2c_smbus pcspkr fb_sys_fops lpc_ich serio_raw
[ 102.792536] CR2: 00000000deadbeef
[ 102.792930] ---[ end trace 0000000000000000 ]---
This patch fixes the issue by also checking that DLM_LKF_VALBLK is set in exflags before copying the lvbptr array, instead of only checking that the pointer is non-NULL. This fixes the issue for me.
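The check described above can be modelled in user space roughly as follows. This is only a sketch of the idea, not the kernel code: struct sketch_lkb, sketch_copy_lvb() and SKETCH_LKF_VALBLK are hypothetical names invented for illustration, and the flag value is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative flag value for this sketch; the kernel defines the real one. */
#define SKETCH_LKF_VALBLK 0x00000008u

/* Hypothetical stand-in for the lock state consulted when building a message. */
struct sketch_lkb {
	uint32_t exflags;   /* flags the caller passed with the request */
	const char *lvbptr; /* caller-supplied LVB pointer, possibly stale */
};

/*
 * Copy the LVB into the outgoing buffer only if the caller asked for it.
 * Returns the number of bytes copied (0 if the LVB was skipped).
 */
static size_t sketch_copy_lvb(const struct sketch_lkb *lkb,
			      char *out, size_t out_len, size_t lvblen)
{
	assert(out_len >= lvblen);

	/*
	 * A non-NULL test alone would dereference a stale pointer here.
	 * Requiring the VALBLK flag as well means the pointer is only
	 * touched when the caller declared that it is in use.
	 */
	if (lkb->lvbptr && (lkb->exflags & SKETCH_LKF_VALBLK)) {
		memcpy(out, lkb->lvbptr, lvblen);
		return lvblen;
	}
	return 0;
}
```

With this shape, a request that carries a garbage pointer such as 0xdeadbeef but does not set the flag copies nothing, while a request that sets the flag and supplies a valid buffer is copied as before.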
I think this patch can fix other dlm users as well, depending on how they handle initialization and freeing of the memory behind sb_lvbptr, and on whether they leave DLM_LKF_VALBLK unset for some dlm_lock() calls. There may have been a hidden issue all along. However, with the check on DLM_LKF_VALBLK, a user who sets that flag always needs to provide a non-NULL sb_lvbptr. There could be more intelligent handling that combines the per-lockspace lvblen, DLM_LKF_VALBLK and the non-NULL check to report to the user that the DLM API is being used incorrectly, but that can be added later; this patch only fixes the current behaviour.