In the Linux kernel, the following vulnerability has been resolved:
wifi: rtlwifi: Drastically reduce the attempts to read efuse in case of failures
Syzkaller reported a hung task with uevent_show() in the stack trace. That specific issue was addressed by another commit [0], but even with that fix applied (for example, running v6.12-rc5) we face another type of hung task that comes from the same reproducer [1]. Investigating that, we narrowed it down to the following path:
(a) Syzkaller emulates a Realtek USB WiFi adapter using raw-gadget and dummy_hcd infrastructure.
(b) During the probe of rtl8192cu, the driver ends up performing an efuse read procedure (which is related to EEPROM load IIUC), and here lies the issue: the function read_efuse() calls read_efuse_byte() many times, with the number of loop iterations depending on the efuse size (in our example, 512 in total).
This procedure for reading efuse bytes relies on a loop that performs an I/O read up to 10k times in case of failures. We measured the time of the loop inside read_efuse_byte() alone, and in this reproducer (which involves the dummy_hcd emulation layer), it takes 15 seconds each. As a consequence, the driver is stuck in its probe routine for a long time, exposing a stack trace like the one below if we attempt to reboot the system, for example:
task:kworker/0:3 state:D stack:0 pid:662 tgid:662 ppid:2 flags:0x00004000
Workqueue: usb_hub_wq hub_event
Call Trace:
 __schedule+0xe22/0xeb6
 schedule_timeout+0xe7/0x132
 __wait_for_common+0xb5/0x12e
 usb_start_wait_urb+0xc5/0x1ef
 ? usb_alloc_urb+0x95/0xa4
 usb_control_msg+0xff/0x184
 _usbctrl_vendorreq_sync+0xa0/0x161
 _usb_read_sync+0xb3/0xc5
 read_efuse_byte+0x13c/0x146
 read_efuse+0x351/0x5f0
 efuse_read_all_map+0x42/0x52
 rtl_efuse_shadow_map_update+0x60/0xef
 rtl_get_hwinfo+0x5d/0x1c2
 rtl92cu_read_eeprom_info+0x10a/0x8d5
 ? rtl92c_read_chip_version+0x14f/0x17e
 rtl_usb_probe+0x323/0x851
 usb_probe_interface+0x278/0x34b
 really_probe+0x202/0x4a4
 __driver_probe_device+0x166/0x1b2
 driver_probe_device+0x2f/0xd8
 [...]
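To give a feel for the scale of the stall, here is a rough back-of-the-envelope sketch. It assumes that every one of the 512 byte reads fails and exhausts its retry loop, and that the time per attempt is constant (15 s spread over 10000 attempts); neither is stated above as a general fact, so treat the result as an upper-bound estimate for this reproducer only. The helper name and the macros are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Figures quoted in the description; everything else here is an assumption. */
#define EFUSE_BYTE_READS       512u  /* byte reads in the example */
#define LOOP_TIME_US      15000000u  /* 15 s measured for one exhausted retry loop */
#define MEASURED_ATTEMPTS    10000u  /* attempts made during that measurement */

/*
 * Hypothetical helper: estimated total stall, in microseconds, if every byte
 * read exhausts max_attempts retries and each attempt costs the same time.
 */
static uint64_t worst_case_stall_us(uint64_t reads, uint64_t loop_time_us,
				    uint64_t measured_attempts,
				    uint64_t max_attempts)
{
	uint64_t per_attempt_us;

	assert(measured_attempts > 0);  /* avoid dividing by zero */
	per_attempt_us = loop_time_us / measured_attempts;
	return reads * per_attempt_us * max_attempts;
}
```

Under these assumptions, a cap of 10000 attempts gives 7680 seconds (about 128 minutes) in the worst case, while a cap of 10 attempts gives about 7.7 seconds.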
We hereby propose to drastically reduce the number of I/O read attempts in case of failures, restricted to USB devices (given that they're inherently slower than PCIe ones). By retrying up to 10 times (instead of 10000), we got responsiveness in the reproducer, and it seems reasonable to believe that there's no sane USB device implementation in the field requiring this many retries at every I/O read in order to work properly. Based on that assumption, it'd be good to have it backported to stable, though maybe not all the way back to the initial driver implementation (the 10k number comes from day 0); perhaps up to the 6.x series makes sense.
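The idea of a bus-dependent retry cap can be illustrated with the minimal user-space sketch below. It is not the driver code: the enum, the function names and the callback type are hypothetical, and the polling callback merely stands in for the I/O read that the real loop performs. Only the two limits (10000 and 10) come from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bus identifiers, for illustration only. */
enum bus_type { BUS_PCI, BUS_USB };

#define MAX_ATTEMPTS_DEFAULT 10000u  /* original limit */
#define MAX_ATTEMPTS_USB        10u  /* reduced limit for USB devices */

/* Hypothetical callback: returns true once the device reports "ready". */
typedef bool (*poll_ready_fn)(void *ctx);

/* Pick the retry cap from the bus type. */
static uint32_t max_read_attempts(enum bus_type bus)
{
	return bus == BUS_USB ? MAX_ATTEMPTS_USB : MAX_ATTEMPTS_DEFAULT;
}

/*
 * Poll until the device is ready or the cap is reached.
 * Returns the number of attempts made; *ok tells whether it succeeded.
 */
static uint32_t poll_with_cap(enum bus_type bus, poll_ready_fn poll,
			      void *ctx, bool *ok)
{
	uint32_t limit = max_read_attempts(bus);
	uint32_t attempts = 0;

	assert(poll != NULL && ok != NULL);
	*ok = false;
	while (attempts < limit) {
		attempts++;
		if (poll(ctx)) {
			*ok = true;
			break;
		}
	}
	return attempts;
}

/* Stub standing in for a device that never reports ready. */
static bool never_ready(void *ctx)
{
	(void)ctx;
	return false;
}
```

With a device that never becomes ready, the USB path gives up after 10 polls while the default path still performs all 10000, which mirrors the difference in probe latency described above.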
[0] Commit 15fffc6a5624 ("driver core: Fix uevent_show() vs driver detach race")
[1] A note about that: this syzkaller report presents multiple reproducers that differ by the type of emulated USB device. For this specific case, check the entry from 2024/08/08 06:23 in the list of crashes; the C repro is available at https://syzkaller.appspot.com/text?tag=ReproC&x=1521fc83980000.