In the Linux kernel, the following vulnerability has been resolved:
bpf: Fix UAF due to race between btf_try_get_module and load_module
While working on code to populate kfunc BTF ID sets for module BTF from its initcall, I noticed that by the time the initcall is invoked, the module BTF can already be seen by userspace (and the BPF verifier). The existing btf_try_get_module calls try_module_get, which only fails if mod->state == MODULE_STATE_GOING, i.e. it can increment the module reference while the module initcall is running in parallel.
Currently, BTF parsing happens from the MODULE_STATE_COMING notifier callback. At this point, the module initcalls have not been invoked. The notifier callback parses and prepares the module BTF, allocates an ID, which publishes it to userspace, and then adds it to the btf_modules list, allowing the kernel to invoke btf_try_get_module for the BTF.
However, at this point, the module has not been fully initialized (i.e. its initcalls have not finished). The code in module.c can still fail and free the module, without caring for other users. Yet nothing stops btf_try_get_module from succeeding during the state transition from MODULE_STATE_COMING to MODULE_STATE_LIVE.
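The window described above can be sketched with a small user-space model. This is only an illustration, not kernel code: the `model_` names are hypothetical stand-ins, and the only behavior modeled is the one stated in the text, namely that the reference grab is refused solely for a module in the GOING state.

```c
#include <stdbool.h>

/* Hypothetical user-space stand-ins for the kernel's module states. */
enum model_state {
    MODEL_STATE_LIVE,
    MODEL_STATE_COMING,
    MODEL_STATE_GOING,
};

struct model_module {
    enum model_state state;
    int refcnt;
};

/*
 * Models the check described in the text: the reference is refused
 * only when the module is GOING, so a COMING module (initcall still
 * running) is accepted just like a LIVE one.
 */
static bool model_try_module_get(struct model_module *mod)
{
    if (mod->state == MODEL_STATE_GOING)
        return false;
    mod->refcnt++;
    return true;
}
```

In this model, a module in MODEL_STATE_COMING hands out a reference even though its init may still fail, which is the precondition for the use-after-free.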
This leads to a use-after-free when a BPF program loads successfully during the state transition, load_module's do_init_module call then fails and frees the module, and the BPF program fd on close calls module_put for the freed module. A future patch has a test case to verify we don't regress in this area.
There are multiple points after prepare_coming_module (in load_module) where failure can occur and module loading can return an error. We illustrate and test for the race using the last point where it can practically occur (in the module __init function).
An illustration of the race:
CPU 0                           CPU 1
                                load_module
                                  notifier_call(MODULE_STATE_COMING)
                                    btf_parse_module
                                    btf_alloc_id    // Published to userspace
                                    list_add(&btf_mod->list, btf_modules)
                                  mod->init(...)
...                               ^
bpf_check                         |
check_pseudo_btf_id               |
  btf_try_get_module              |
    returns true                  |  ...
...                               |  module __init in progress
return prog_fd                    |  ...
...                               V
                                  if (ret < 0)
                                    free_module(mod)
                                  ...
close(prog_fd)
 ...
 bpf_prog_free_deferred
  module_put(used_btf.mod) // use-after-free
We fix this issue by setting a flag BTF_MODULE_F_LIVE from the notifier callback when the MODULE_STATE_LIVE state is reached for the module, so that we return NULL from btf_try_get_module for modules that are not fully formed. Since try_module_get already checks that the module is not in MODULE_STATE_GOING state, and that is the only transition a live module can make before being removed from the btf_modules list, this is enough to close the race and prevent the bug.
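The gating described above can be sketched as a user-space model. It is a simplified illustration under stated assumptions, not the kernel implementation: all `model_` and `MODEL_` names are hypothetical, and `model_state_transition` folds the state change and the notifier callback into one step for brevity.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins for the kernel's module states. */
enum model_state {
    MODEL_STATE_LIVE,
    MODEL_STATE_COMING,
    MODEL_STATE_GOING,
};

/* Stand-in for the BTF_MODULE_F_LIVE flag named in the text. */
#define MODEL_BTF_MODULE_F_LIVE (1u << 0)

struct model_module {
    enum model_state state;
    int refcnt;
};

/* Stand-in for an entry on the btf_modules list. */
struct model_btf_module {
    struct model_module *module;
    unsigned int flags;
};

/* Simplification: change the state and run the "notifier" in one step. */
static void model_state_transition(struct model_btf_module *bm,
                                   enum model_state new_state)
{
    bm->module->state = new_state;
    if (new_state == MODEL_STATE_LIVE)
        bm->flags |= MODEL_BTF_MODULE_F_LIVE;
}

/* As in the text: the reference grab is refused only for GOING. */
static bool model_try_module_get(struct model_module *mod)
{
    if (mod->state == MODEL_STATE_GOING)
        return false;
    mod->refcnt++;
    return true;
}

/*
 * The fix, modeled: return NULL unless the module has reached LIVE
 * (flag set) and the reference grab itself succeeds.
 */
static struct model_module *
model_btf_try_get_module(struct model_btf_module *bm)
{
    if (!(bm->flags & MODEL_BTF_MODULE_F_LIVE))
        return NULL;
    if (!model_try_module_get(bm->module))
        return NULL;
    return bm->module;
}
```

In this model, a lookup made while the module is still COMING returns NULL and takes no reference, so a later init failure cannot leave a stale reference behind. The flag is never cleared because, as the text notes, GOING is the only transition left for a live module and the existing state check already covers it.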
A later selftest patch crafts the race condition artificially to verify that it has been fixed, and that the verifier fails to load the program (with ENXIO).
Lastly, a couple of comments:
 1. Even if this race didn't exist, it seems more appropriate to only access resources (ksyms and kfuncs) of a fully formed module which has been initialized completely.
 2. This patch was born out of the need for synchronization against the module initcall for the next patch, so it is needed for correctness even without the aforementioned race condition. The BTF resources initialized by the module initcall are set up once and then only looked up, so just waiting until the initcall has finished ensures correct behavior.