In the Linux kernel, the following vulnerability has been resolved:
bpf, sockmap: Avoid using sk_socket after free when sending
The sk->sk_socket is not locked or referenced in the backlog thread, and during the call to skb_send_sock() there is a race condition with the release of sk_socket. All types of sockets (tcp/udp/unix/vsock) are affected.
Race conditions:
'''
CPU0                               CPU1

backlog::skb_send_sock
  sendmsg_unlocked
    sock_sendmsg
      sock_sendmsg_nosec
                                   close(fd):
                                     ...
                                     ops->release() -> sock_map_close()
                                     sk_socket->ops = NULL
                                     free(socket)
      sock->ops->sendmsg
            ^
            panic here
'''
The psock reference count becomes 0 after sock_map_close() has executed.
'''
void sock_map_close()
{
    ...
    if (likely(psock)) {
    ...
    // !! here we remove psock and the ref of psock becomes 0
    sock_map_remove_links(sk, psock)
    psock = sk_psock_get(sk);
    if (unlikely(!psock))
        goto no_psock; <=== Control jumps here via goto
        ...
        cancel_delayed_work_sync(&psock->work); <=== not executed
        sk_psock_put(sk, psock);
        ...
}
'''
Based on the fact that we already wait for the workqueue to finish in sock_map_close() if psock is held, we simply increase the psock reference count to avoid race conditions.
With this patch, if the backlog thread is running, sock_map_close() will wait for the backlog thread to complete and cancel all pending work.
If no backlog thread is running, any pending work that has not started by then will bail out when its sk_psock_get() call fails, since the psock reference count has already dropped to zero, and sk_psock_drop() will cancel all jobs via cancel_delayed_work_sync().
In summary, we require synchronization to coordinate the backlog thread and close() thread.
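The coordination described above can be illustrated with a small userspace model. This is only a sketch, not the kernel patch: `model_psock`, `model_psock_get()`, `model_psock_put()` and `model_backlog()` are hypothetical names, and the model assumes that the get operation refuses to take a reference once the count has reached zero, which is the behavior the text attributes to sk_psock_get().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the psock; not the kernel's struct sk_psock. */
struct model_psock {
    atomic_int refcnt;  /* 0 means teardown has already begun */
    int sends;          /* counts simulated sends */
};

/* Take a reference only if the count is still non-zero. */
static bool model_psock_get(struct model_psock *p)
{
    int old = atomic_load(&p->refcnt);

    while (old != 0) {
        /* On failure 'old' is reloaded, so the zero check is repeated. */
        if (atomic_compare_exchange_weak(&p->refcnt, &old, old + 1))
            return true;
    }
    return false;
}

/* Drop a reference; returns true if this was the last one. */
static bool model_psock_put(struct model_psock *p)
{
    int old = atomic_fetch_sub(&p->refcnt, 1);

    assert(old > 0);    /* dropping below zero would be a bug */
    return old == 1;
}

/* Model of the backlog work: touch the socket only while holding a ref. */
static bool model_backlog(struct model_psock *p)
{
    if (!model_psock_get(p))
        return false;   /* close() already dropped the last reference */
    p->sends++;         /* stands in for the send on the socket */
    model_psock_put(p);
    return true;
}
```

In this model, a backlog run that starts before the last put holds the count above zero for the duration of the send, while a run that starts afterwards returns without touching the socket. The waiting done by close() through cancel_delayed_work_sync() is not modeled here.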
The panic I caught:
'''
Workqueue: events sk_psock_backlog
RIP: 0010:sock_sendmsg+0x21d/0x440
RAX: 0000000000000000 RBX: ffffc9000521fad8 RCX: 0000000000000001
...
Call Trace:
 <TASK>
 ? die_addr+0x40/0xa0
 ? exc_general_protection+0x14c/0x230
 ? asm_exc_general_protection+0x26/0x30
 ? sock_sendmsg+0x21d/0x440
 ? sock_sendmsg+0x3e0/0x440
 ? __pfx_sock_sendmsg+0x10/0x10
 __skb_send_sock+0x543/0xb70
 sk_psock_backlog+0x247/0xb80
...
'''