In the Linux kernel, the following vulnerability has been resolved:
block: fix deadlock between sd_remove & sd_release
Our test reports the following hung tasks:
[ 2538.459400] INFO: task "kworker/0:0":7 blocked for more than 188 seconds.
[ 2538.459427] Call trace:
[ 2538.459430]  __switch_to+0x174/0x338
[ 2538.459436]  __schedule+0x628/0x9c4
[ 2538.459442]  schedule+0x7c/0xe8
[ 2538.459447]  schedule_preempt_disabled+0x24/0x40
[ 2538.459453]  __mutex_lock+0x3ec/0xf04
[ 2538.459456]  __mutex_lock_slowpath+0x14/0x24
[ 2538.459459]  mutex_lock+0x30/0xd8
[ 2538.459462]  del_gendisk+0xdc/0x350
[ 2538.459466]  sd_remove+0x30/0x60
[ 2538.459470]  device_release_driver_internal+0x1c4/0x2c4
[ 2538.459474]  device_release_driver+0x18/0x28
[ 2538.459478]  bus_remove_device+0x15c/0x174
[ 2538.459483]  device_del+0x1d0/0x358
[ 2538.459488]  __scsi_remove_device+0xa8/0x198
[ 2538.459493]  scsi_forget_host+0x50/0x70
[ 2538.459497]  scsi_remove_host+0x80/0x180
[ 2538.459502]  usb_stor_disconnect+0x68/0xf4
[ 2538.459506]  usb_unbind_interface+0xd4/0x280
[ 2538.459510]  device_release_driver_internal+0x1c4/0x2c4
[ 2538.459514]  device_release_driver+0x18/0x28
[ 2538.459518]  bus_remove_device+0x15c/0x174
[ 2538.459523]  device_del+0x1d0/0x358
[ 2538.459528]  usb_disable_device+0x84/0x194
[ 2538.459532]  usb_disconnect+0xec/0x300
[ 2538.459537]  hub_event+0xb80/0x1870
[ 2538.459541]  process_scheduled_works+0x248/0x4dc
[ 2538.459545]  worker_thread+0x244/0x334
[ 2538.459549]  kthread+0x114/0x1bc
[ 2538.461001] INFO: task "fsck.":15415 blocked for more than 188 seconds.
[ 2538.461014] Call trace:
[ 2538.461016]  __switch_to+0x174/0x338
[ 2538.461021]  __schedule+0x628/0x9c4
[ 2538.461025]  schedule+0x7c/0xe8
[ 2538.461030]  blk_queue_enter+0xc4/0x160
[ 2538.461034]  blk_mq_alloc_request+0x120/0x1d4
[ 2538.461037]  scsi_execute_cmd+0x7c/0x23c
[ 2538.461040]  ioctl_internal_command+0x5c/0x164
[ 2538.461046]  scsi_set_medium_removal+0x5c/0xb0
[ 2538.461051]  sd_release+0x50/0x94
[ 2538.461054]  blkdev_put+0x190/0x28c
[ 2538.461058]  blkdev_release+0x28/0x40
[ 2538.461063]  __fput+0xf8/0x2a8
[ 2538.461066]  __fput_sync+0x28/0x5c
[ 2538.461070]  __arm64_sys_close+0x84/0xe8
[ 2538.461073]  invoke_syscall+0x58/0x114
[ 2538.461078]  el0_svc_common+0xac/0xe0
[ 2538.461082]  do_el0_svc+0x1c/0x28
[ 2538.461087]  el0_svc+0x38/0x68
[ 2538.461090]  el0t_64_sync_handler+0x68/0xbc
[ 2538.461093]  el0t_64_sync+0x1a8/0x1ac
  T1:                               T2:
  sd_remove
  del_gendisk
  __blk_mark_disk_dead
  blk_freeze_queue_start
  ++q->mq_freeze_depth
                                    bdev_release
                                    mutex_lock(&disk->open_mutex)
                                    sd_release
                                    scsi_execute_cmd
                                    blk_queue_enter
                                    wait_event(!q->mq_freeze_depth)
  mutex_lock(&disk->open_mutex)
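SCSI does not set GD_OWNS_QUEUE, so QUEUE_FLAG_DYING is not set in this scenario. This is a classic ABBA deadlock: T1 has frozen the queue and waits for disk->open_mutex, while T2 holds disk->open_mutex and waits for the queue to be unfrozen. To fix the deadlock, make sure we don't try to acquire disk->open_mutex after freezing the queue.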
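The ordering rule above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel patch: struct model_disk, model_freeze(), model_unfreeze(), model_queue_enter(), model_remove(), model_release() and run_model() are hypothetical names, pthread primitives stand in for the kernel mutex and wait queue, and a plain counter stands in for q->mq_freeze_depth. The removal path takes the open mutex before it freezes the queue, and never takes it while the queue is frozen by itself, so neither thread can end up waiting on the other.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for a disk; not the kernel's struct gendisk. */
struct model_disk {
    pthread_mutex_t open_mutex;   /* models disk->open_mutex */
    pthread_mutex_t freeze_lock;  /* protects freeze_depth */
    pthread_cond_t  unfrozen;     /* broadcast when freeze_depth drops to 0 */
    int freeze_depth;             /* models q->mq_freeze_depth */
    int completed;                /* paths that finished; protected by open_mutex */
};

static void model_freeze(struct model_disk *d)
{
    pthread_mutex_lock(&d->freeze_lock);
    d->freeze_depth++;
    pthread_mutex_unlock(&d->freeze_lock);
}

static void model_unfreeze(struct model_disk *d)
{
    pthread_mutex_lock(&d->freeze_lock);
    if (--d->freeze_depth == 0)
        pthread_cond_broadcast(&d->unfrozen);
    pthread_mutex_unlock(&d->freeze_lock);
}

/* Models the wait in blk_queue_enter(): block while the queue is frozen. */
static void model_queue_enter(struct model_disk *d)
{
    pthread_mutex_lock(&d->freeze_lock);
    while (d->freeze_depth != 0)
        pthread_cond_wait(&d->unfrozen, &d->freeze_lock);
    pthread_mutex_unlock(&d->freeze_lock);
}

/* Removal path (T1): open_mutex is taken BEFORE the freeze, never after it. */
static void *model_remove(void *arg)
{
    struct model_disk *d = arg;

    pthread_mutex_lock(&d->open_mutex);
    model_freeze(d);
    d->completed++;
    pthread_mutex_unlock(&d->open_mutex);
    model_unfreeze(d);            /* does not need open_mutex */
    return NULL;
}

/* Release path (T2): holds open_mutex while waiting to enter the queue. */
static void *model_release(void *arg)
{
    struct model_disk *d = arg;

    pthread_mutex_lock(&d->open_mutex);
    model_queue_enter(d);
    d->completed++;
    pthread_mutex_unlock(&d->open_mutex);
    return NULL;
}

/* Runs both paths concurrently and returns how many of them finished. */
int run_model(void)
{
    struct model_disk d = { .freeze_depth = 0, .completed = 0 };
    pthread_t t1, t2;
    int rc;

    pthread_mutex_init(&d.open_mutex, NULL);
    pthread_mutex_init(&d.freeze_lock, NULL);
    pthread_cond_init(&d.unfrozen, NULL);

    rc = pthread_create(&t1, NULL, model_remove, &d);
    assert(rc == 0);
    rc = pthread_create(&t2, NULL, model_release, &d);
    assert(rc == 0);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    pthread_cond_destroy(&d.unfrozen);
    pthread_mutex_destroy(&d.freeze_lock);
    pthread_mutex_destroy(&d.open_mutex);
    return d.completed;
}
```

Built with -pthread and called from a main function, run_model() returns 2 for every interleaving, because the removal path can always reach model_unfreeze() without needing the open mutex. Moving the model_freeze() call in model_remove() to before the pthread_mutex_lock() call would recreate the T1/T2 ordering from the report and allow the model to hang.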