In the Linux kernel, the following vulnerability has been resolved:
dmaengine: hisilicon: Add multi-thread support for a DMA channel
When we get a DMA channel and try to use it in multiple threads it will cause oops and hanging the system.
% echo 100 > /sys/module/dmatest/parameters/threadsperchan % echo 100 > /sys/module/dmatest/parameters/iterations % echo 1 > /sys/module/dmatest/parameters/run [383493.327077] Unable to handle kernel paging request at virtual address dead000000000108 [383493.335103] Mem abort info: [383493.335103] ESR = 0x96000044 [383493.335105] EC = 0x25: DABT (current EL), IL = 32 bits [383493.335107] SET = 0, FnV = 0 [383493.335108] EA = 0, S1PTW = 0 [383493.335109] FSC = 0x04: level 0 translation fault [383493.335110] Data abort info: [383493.335111] ISV = 0, ISS = 0x00000044 [383493.364739] CM = 0, WnR = 1 [383493.367793] [dead000000000108] address between user and kernel address ranges [383493.375021] Internal error: Oops: 96000044 [#1] PREEMPT SMP [383493.437574] CPU: 63 PID: 27895 Comm: dma0chan0-copy2 Kdump: loaded Tainted: GO 5.17.0-rc4+ #2 [383493.457851] pstate: 204000c9 (nzCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [383493.465331] pc : vchantxsubmit+0x64/0xa0 [383493.469957] lr : vchantxsubmit+0x34/0xa0
This occurs because the transmission timed out, and that's due to data race. Each thread rewrite channels's descriptor as soon as deviceissuepending is called. It leads to the situation that the driver thinks that it uses the right descriptor in interrupt handler while channels's descriptor has been changed by other thread. The descriptor which in fact reported interrupt will not be handled any more, as well as its tx->callback. That's why timeout reports.
With current fixes channels' descriptor changes it's value only when it has been used. A new descriptor is acquired from vc->descissued queue that is already filled with descriptors that are ready to be sent. Threads have no direct access to DMA channel descriptor. In case of channel's descriptor is busy, try to submit to HW again when a descriptor is completed. In this case, vc->descissued may be empty when hisidmastart_transfer is called, so delete error reporting on this. Now it is just possible to queue a descriptor for further processing.