GHSA-gpfh-jvf9-7wg5

Source
https://github.com/advisories/GHSA-gpfh-jvf9-7wg5
Import Source
https://github.com/github/advisory-database/blob/main/advisories/github-reviewed/2021/11/GHSA-gpfh-jvf9-7wg5/GHSA-gpfh-jvf9-7wg5.json
JSON Data
https://api.osv.dev/v1/vulns/GHSA-gpfh-jvf9-7wg5
Aliases
Published
2021-11-10T18:51:21Z
Modified
2023-12-06T00:46:33.275821Z
Severity
  • 7.8 (High) CVSS_V3 - CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H
Summary
Use after free / memory leak in `CollectiveReduceV2`
Details

Impact

The async implementation of `CollectiveReduceV2` has a memory leak and a use-after-free, both of which the following call triggers:

import tensorflow as tf

tf.raw_ops.CollectiveReduceV2(
  input=[],
  group_size=[-10, -10, -10],
  group_key=[-10, -10],
  instance_key=[-10],
  ordering_token=[],
  merge_op='Mul',
  final_op='Div')

This occurs because the computation is asynchronous and an object is still accessed after it has been moved from with `std::move()`:

auto done_with_cleanup = [col_params, done = std::move(done)]() {
  done();
  col_params->Unref();
};
OP_REQUIRES_OK_ASYNC(c,
                     FillCollectiveParams(col_params, REDUCTION_COLLECTIVE,
                                          /*group_size*/ c->input(1),
                                          /*group_key*/ c->input(2),
                                          /*instance_key*/ c->input(3)),
                     done);

Here, `done` has already been moved from by the time the `OP_REQUIRES_OK_ASYNC` macro needs to invoke it on error. Invoking it in that state is undefined behavior, which can manifest as crashes, thrown `std::bad_alloc` exceptions, or memory leaks.
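
The hazard can be reproduced outside TensorFlow with a small standalone sketch. Everything below is illustrative: `Params`, `Outcome`, and `RunErrorPath` are hypothetical names, `DoneCallback` is assumed to be a `std::function<void()>`, and the "safe" branch (routing the error through `done_with_cleanup`) is an assumption about the general shape of a fix, not a quotation of the patch. The buggy branch tests the moved-from callback before calling it, so the sketch itself stays free of undefined behavior.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical stand-in for a ref-counted parameter object (not TensorFlow's class).
class Params {
 public:
  explicit Params(int* live) : live_(live) { ++*live_; }
  void Unref() {
    assert(refs_ > 0);
    if (--refs_ == 0) delete this;
  }

 private:
  ~Params() { --*live_; }
  int refs_ = 1;
  int* live_;  // external counter of live objects, so a leak is observable
};

using DoneCallback = std::function<void()>;  // assumed callback type

struct Outcome {
  int done_calls = 0;          // times the completion callback actually ran
  bool params_leaked = false;  // true if the error path never released Params
};

// Simulates the error path of an async kernel.
// use_wrapper == false mirrors the buggy pattern: the error path reaches for
// `done` after it has been moved into `done_with_cleanup`.
Outcome RunErrorPath(bool use_wrapper) {
  int live = 0;
  int done_calls = 0;
  Params* params = new Params(&live);
  DoneCallback done = [&done_calls] { ++done_calls; };

  // After this line `done` is in a valid but unspecified (moved-from) state.
  DoneCallback done_with_cleanup = [params, done = std::move(done)]() {
    done();
    params->Unref();
  };

  if (use_wrapper) {
    done_with_cleanup();  // signals completion and releases params
  } else {
    // operator bool is safe on a moved-from std::function; calling an empty
    // one would throw std::bad_function_call.
    if (done) done();
    // params->Unref() is never reached on this path.
  }

  Outcome out;
  out.done_calls = done_calls;
  out.params_leaked = (live != 0);
  if (out.params_leaked) params->Unref();  // tidy up so the demo does not leak
  return out;
}
```

Calling `RunErrorPath(true)` runs the callback once and releases the parameters, while `RunErrorPath(false)` always reports a leaked `Params`; whether the moved-from callback still runs there depends on the standard library, which is exactly why the pattern is unsafe.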

Patches

We have patched the issue in GitHub commit ca38dab9d3ee66c5de06f11af9a4b1200da5ef75.

The fix will be included in TensorFlow 2.7.0. We will also cherry-pick this commit onto TensorFlow 2.6.1, as the 2.6 release line is the only other one affected.

For more information

Please consult our security guide for more information regarding the security model and how to contact us with issues and questions.

Attribution

This vulnerability was reported by members of the Aivul Team from Qihoo 360.

References

Affected packages

PyPI / tensorflow

Package

Affected ranges

Type
ECOSYSTEM
Events
Introduced
2.6.0
Fixed
2.6.1

Affected versions

2.*

2.6.0

PyPI / tensorflow-cpu

Package

Affected ranges

Type
ECOSYSTEM
Events
Introduced
2.6.0
Fixed
2.6.1

Affected versions

2.*

2.6.0

PyPI / tensorflow-gpu

Package

Affected ranges

Type
ECOSYSTEM
Events
Introduced
2.6.0
Fixed
2.6.1

Affected versions

2.*

2.6.0