In the Linux kernel, the following vulnerability has been resolved:
workqueue: Do not warn when cancelling WQ_MEM_RECLAIM work from !WQ_MEM_RECLAIM worker
After commit 746ae46c1113 ("drm/sched: Mark scheduler work queues with WQ_MEM_RECLAIM"), amdgpu started seeing the following warning:
 [ ] workqueue: WQ_MEM_RECLAIM sdma0:drm_sched_run_job_work [gpu_sched] is flushing !WQ_MEM_RECLAIM events:amdgpu_device_delay_enable_gfx_off [amdgpu]
 ...
 [ ] Workqueue: sdma0 drm_sched_run_job_work [gpu_sched]
 ...
 [ ] Call Trace:
 [ ] <TASK>
 ...
 [ ] ? check_flush_dependency+0xf5/0x110
 ...
 [ ] cancel_delayed_work_sync+0x6e/0x80
 [ ] amdgpu_gfx_off_ctrl+0xab/0x140 [amdgpu]
 [ ] amdgpu_ring_alloc+0x40/0x50 [amdgpu]
 [ ] amdgpu_ib_schedule+0xf4/0x810 [amdgpu]
 [ ] ? drm_sched_run_job_work+0x22c/0x430 [gpu_sched]
 [ ] amdgpu_job_run+0xaa/0x1f0 [amdgpu]
 [ ] drm_sched_run_job_work+0x257/0x430 [gpu_sched]
 [ ] process_one_work+0x217/0x720
 ...
 [ ] </TASK>
The intent of the verification done in check_flush_dependency() is to ensure forward progress during memory reclaim, by flagging cases where either a memory reclaim process or a memory reclaim work item flushes work that belongs to a workqueue not marked as memory reclaim safe.
This is correct when flushing, but when called from the cancel_(delayed_)work_sync() paths it is a false positive, because the work is either already running or will not run at all. Cancelling it is therefore safe, and the warning criteria can be relaxed by letting the helper know of the calling context.
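The relaxed check can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel code: struct wq_model, struct caller_ctx, flush_dependency_warns() and the from_cancel parameter are hypothetical names introduced here to show how passing the calling context into the helper suppresses the warning on the cancel paths.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a workqueue; not the kernel structure. */
struct wq_model {
    const char *name;
    bool mem_reclaim;              /* models the WQ_MEM_RECLAIM flag */
};

/* Hypothetical description of the context that flushes or cancels. */
struct caller_ctx {
    bool pf_memalloc;              /* caller is a memory reclaim process */
    const struct wq_model *cur_wq; /* workqueue the caller runs on, or NULL */
};

/*
 * Returns true when this model of the dependency check would warn.
 * from_cancel is the assumed extra argument that tells the helper it
 * was reached from a cancel path rather than from a flush.
 */
bool flush_dependency_warns(const struct caller_ctx *ctx,
                            const struct wq_model *target,
                            bool from_cancel)
{
    /* Cancel paths and reclaim-safe targets never warn. */
    if (from_cancel || target->mem_reclaim)
        return false;

    /* A memory reclaim process flushing a non-reclaim-safe target. */
    if (ctx->pf_memalloc)
        return true;

    /* A reclaim-safe work item flushing a non-reclaim-safe target. */
    return ctx->cur_wq != NULL && ctx->cur_wq->mem_reclaim;
}
```

In this model, a worker running on a reclaim-safe queue such as sdma0 that targets work on the non-reclaim-safe events queue still triggers the warning when from_cancel is false, and no longer does when from_cancel is true, which corresponds to the amdgpu case in the trace above.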
References: 746ae46c1113 ("drm/sched: Mark scheduler work queues with WQ_MEM_RECLAIM")