Security-sensitive parts of the Python HTTP parser retained minor differences in allowable character sets, that must trigger error handling to robustly match frame boundaries of proxies in order to protect against injection of additional requests. Additionally, validation could trigger exceptions that were not handled consistently with processing of other malformed input.
These problems are rooted in pattern matching protocol elements, previously improved by PR #3235 and GHSA-gfw2-4jvh-wgfg:
The expression HTTP/(\d).(\d)
lacked another backslash to clarify that the separator should be a literal dot, not just any Unicode code point (result: HTTP/(\d)\.(\d)
).
The HTTP version was permitting Unicode digits, where only ASCII digits are standards-compliant.
Distinct regular expressions for validating HTTP Method and Header field names were used - though both should (at least) apply the common restrictions of rfc9110 token
.
GET / HTTP/1ö1
GET / HTTP/1.𝟙
GET/: HTTP/1.1
Content-Encoding?: chunked
Primarily concerns running an aiohttp server without llhttp: 1. behind a proxy: Being more lenient than internet standards require could, depending on deployment environment, assist in request smuggling. 2. directly accessible or exposed behind proxies relaying malformed input: the unhandled exception could cause excessive resource consumption on the application server and/or its logging facilities.
Patch: https://github.com/aio-libs/aiohttp/pull/8074/files