In the Linux kernel, the following vulnerability has been resolved:
rcu: Fix rcu_read_unlock() deadloop due to IRQ work
If rcu_read_unlock_special() runs during irq_exit(), the CPU can lock up if an IPI is issued. This is because the IPI itself triggers the irq_exit() path again, causing a recursive lockup.
This is precisely what Xiongfeng found when invoking a BPF program on the trace_tick_stop() tracepoint, as shown in the trace below. Fix by managing the irq_work state correctly.
irq_exit()
  __irq_exit_rcu()
    /* in_hardirq() returns false after this */
    preempt_count_sub(HARDIRQ_OFFSET)
    tick_irq_exit()
      tick_nohz_irq_exit()
        tick_nohz_stop_sched_tick()
          trace_tick_stop()  /* a bpf prog is hooked on this trace point */
            __bpf_trace_tick_stop()
              bpf_trace_run2()
                rcu_read_unlock_special()
                  /* will send a IPI to itself */
                  irq_work_queue_on(&rdp->defer_qs_iw, rdp->cpu);
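The loop in the trace above can be illustrated with a small user-space model. This is only a sketch and not the kernel patch: the function simulate_irq_exit_loop(), the constant MAX_MODEL_IRQS, and the local flags are hypothetical names invented for illustration, and the model assumes that "managing the irq_work state" amounts to not queueing the irq_work again while one is already pending. It never clears the pending flag, which the real code has to do at some point.

```c
#include <stdbool.h>

/* Hypothetical cap so the unguarded case terminates; a real CPU would hang. */
#define MAX_MODEL_IRQS 8

/*
 * User-space model of the irq_exit() loop (not kernel code).
 * Each loop iteration stands for one pass through the IRQ exit path,
 * where a tracepoint hook ends up in rcu_read_unlock_special().
 * Returns the number of self-IPIs that were issued.
 */
int simulate_irq_exit_loop(bool use_guard)
{
    bool need_qs = true;     /* a deferred quiescent state is needed */
    bool iw_pending = false; /* models "irq_work already queued" */
    int ipis_sent = 0;

    for (int irq = 0; irq < MAX_MODEL_IRQS; irq++) {
        bool ipi_raised = false;

        /* Models the decision to queue irq_work on the local CPU. */
        if (need_qs && !(use_guard && iw_pending)) {
            iw_pending = true;
            ipi_raised = true;
            ipis_sent++;
        }

        /* No new self-IPI means no further IRQ exit pass: the loop ends. */
        if (!ipi_raised)
            break;
    }
    return ipis_sent;
}
```

In this model, calling simulate_irq_exit_loop(true) issues a single self-IPI and stops, while simulate_irq_exit_loop(false) keeps issuing one per pass until the artificial cap is reached.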
A simple reproducer can also be obtained by doing the following in tick_irq_exit(). It will hang on boot without the patch:
  static inline void tick_irq_exit(void)
  {
 +	rcu_read_lock();
 +	WRITE_ONCE(current->rcu_read_unlock_special.b.need_qs, true);
 +	rcu_read_unlock();
 +
[neeraj: Apply Frederic's suggested fix for PREEMPT_RT]