In the Linux kernel, the following vulnerability has been resolved:
sched/core: Prevent rescheduling when interrupts are disabled
David reported a warning observed while loop testing kexec jump:
  Interrupts enabled after irqrouter_resume+0x0/0x50
  WARNING: CPU: 0 PID: 560 at drivers/base/syscore.c:103 syscore_resume+0x18a/0x220
   kernel_kexec+0xf6/0x180
   __do_sys_reboot+0x206/0x250
   do_syscall_64+0x95/0x180
The corresponding interrupt flag trace:
  hardirqs last  enabled at (15573): [<ffffffffa8281b8e>] __up_console_sem+0x7e/0x90
  hardirqs last disabled at (15580): [<ffffffffa8281b73>] __up_console_sem+0x63/0x90
That means __up_console_sem() was invoked with interrupts enabled. Further instrumentation revealed that in the interrupt disabled section of kexec jump one of the syscore_suspend() callbacks woke up a task, which set the NEED_RESCHED flag. A later callback in the resume path invoked cond_resched() which in turn led to the invocation of the scheduler:
  __cond_resched+0x21/0x60
  down_timeout+0x18/0x60
  acpi_os_wait_semaphore+0x4c/0x80
  acpi_ut_acquire_mutex+0x3d/0x100
  acpi_ns_get_node+0x27/0x60
  acpi_ns_evaluate+0x1cb/0x2d0
  acpi_rs_set_srs_method_data+0x156/0x190
  acpi_pci_link_set+0x11c/0x290
  irqrouter_resume+0x54/0x60
  syscore_resume+0x6a/0x200
  kernel_kexec+0x145/0x1c0
  __do_sys_reboot+0xeb/0x240
  do_syscall_64+0x95/0x180
This is a long-standing problem, which probably became more visible with the recent printk changes. Something does a task wakeup and the scheduler sets the NEED_RESCHED flag. cond_resched() sees it set and invokes schedule() from a completely bogus context. The scheduler enables interrupts after context switching, which causes the above warning at the end.
Quite a few of the code paths in syscore_suspend()/resume() can result in triggering a wakeup with exactly the same consequences. They might not have done so yet, but as they share a lot of code with normal operations it's just a question of time.
The problem only affects the PREEMPT_NONE and PREEMPT_VOLUNTARY scheduling models. Full preemption is not affected as cond_resched() is disabled and the preemption check preemptible() takes the interrupt disabled flag into account.
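To illustrate why the full preemption model is not exposed, the following is a minimal userspace sketch of a preemptible()-style check. The function name model_preemptible() and its parameters are hypothetical and exist only for this illustration; the sketch assumes the check requires both a zero preemption count and enabled interrupts.

```c
#include <stdbool.h>

/*
 * Hypothetical userspace model, not kernel code: preemption is only
 * permitted when the preemption count is zero AND interrupts are enabled.
 */
static bool model_preemptible(int preempt_count, bool irqs_disabled)
{
    return preempt_count == 0 && !irqs_disabled;
}
```

With interrupts disabled the model returns false regardless of the preemption count, so a pending reschedule request is not acted upon in that context.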
Cure the problem by adding a corresponding check into cond_resched().
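The effect of such a check can be sketched with a small userspace model. This is not the kernel implementation: struct cpu_model, model_schedule() and both model_cond_resched_*() helpers are hypothetical names introduced here, and the model assumes only what the description states, namely that the scheduler returns with interrupts enabled and that the cure is to skip rescheduling while interrupts are disabled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the per-CPU state involved in the bug. */
struct cpu_model {
    bool need_resched;   /* models the NEED_RESCHED flag */
    bool irqs_disabled;  /* models the interrupt-disabled state */
    int  schedule_calls; /* how often the modelled scheduler ran */
};

/* Models schedule(): after the context switch, interrupts are enabled. */
static void model_schedule(struct cpu_model *cpu)
{
    cpu->need_resched = false;
    cpu->irqs_disabled = false;
    cpu->schedule_calls++;
}

/* Problematic behaviour: only the NEED_RESCHED flag is consulted. */
static int model_cond_resched_unguarded(struct cpu_model *cpu)
{
    if (cpu->need_resched) {
        model_schedule(cpu);
        return 1;
    }
    return 0;
}

/* Guarded behaviour: never reschedule while interrupts are disabled. */
static int model_cond_resched_guarded(struct cpu_model *cpu)
{
    if (cpu->need_resched && !cpu->irqs_disabled) {
        model_schedule(cpu);
        return 1;
    }
    return 0;
}

/* Walks through the scenario: a wakeup happened with interrupts off. */
static void model_demo(void)
{
    struct cpu_model buggy = { true, true, 0 };
    struct cpu_model fixed = { true, true, 0 };

    /* Unguarded: the scheduler runs and leaves interrupts enabled. */
    assert(model_cond_resched_unguarded(&buggy) == 1);
    assert(!buggy.irqs_disabled);

    /* Guarded: nothing happens, interrupts stay disabled. */
    assert(model_cond_resched_guarded(&fixed) == 0);
    assert(fixed.irqs_disabled);
}
```

In the guarded variant the NEED_RESCHED request stays pending and is handled at a later point where interrupts are enabled, so the interrupt-disabled section runs to completion without the scheduler being entered.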