GHSA-p7j4-jwjf-5x9w

Source
https://github.com/advisories/GHSA-p7j4-jwjf-5x9w
Import Source
https://github.com/github/advisory-database/blob/main/advisories/github-reviewed/2025/07/GHSA-p7j4-jwjf-5x9w/GHSA-p7j4-jwjf-5x9w.json
JSON Data
https://api.test.osv.dev/v1/vulns/GHSA-p7j4-jwjf-5x9w
Aliases
Published
2025-07-07T12:30:22Z
Modified
2025-07-08T00:27:14.951786Z
Severity
  • 5.3 (Medium) CVSS_V3 - CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:L/A:N
Summary
LlamaIndex vulnerability in ArxivReader class can cause MD5 hash collisions
Details

The ArxivReader class in the run-llama/llama_index repository generates filenames for downloaded papers from an MD5 hash of the paper title, so two papers with identical titles but different contents receive the same filename. The later download overwrites the earlier one, which causes data loss and prevents some papers from being processed for AI model training. The issue is resolved in llama-index-readers-papers version 0.3.1 (in llama-index 0.12.28).
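
The collision described above can be reproduced in a few lines. This is a minimal sketch, not the library's actual code: the helpers `filename_from_title` and `filename_from_id`, the paper dictionaries, and the identifiers in them are hypothetical and exist only for illustration. It assumes that filenames are keyed on the title alone, and shows one possible mitigation, which is to key them on an identifier that is unique per paper.

```python
import hashlib


def filename_from_title(title: str) -> str:
    """Hypothetical helper: name a download after the MD5 of its title."""
    return hashlib.md5(title.encode("utf-8")).hexdigest() + ".pdf"


def filename_from_id(paper_id: str) -> str:
    """Hypothetical alternative: key the filename on a unique paper identifier."""
    return hashlib.sha256(paper_id.encode("utf-8")).hexdigest() + ".pdf"


# Two distinct papers that happen to share a title (made-up identifiers).
paper_a = {"id": "example-id-0001", "title": "A Survey of Transformers"}
paper_b = {"id": "example-id-0002", "title": "A Survey of Transformers"}

# Title-based names collide, so the second download would overwrite the first.
same_name = filename_from_title(paper_a["title"]) == filename_from_title(paper_b["title"])

# Identifier-based names stay distinct even though the titles match.
distinct_names = filename_from_id(paper_a["id"]) != filename_from_id(paper_b["id"])

print(same_name, distinct_names)
```

Note that the collision here does not depend on any cryptographic weakness of MD5: identical input always yields identical output, so any hash of the title alone would behave the same way.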

Database specific
{
    "github_reviewed": true,
    "github_reviewed_at": "2025-07-08T00:03:54Z",
    "cwe_ids": [
        "CWE-440"
    ],
    "nvd_published_at": "2025-07-07T10:15:26Z",
    "severity": "MODERATE"
}
References

Affected packages

PyPI / llama-index-readers-papers

Package

Name
llama-index-readers-papers
Purl
pkg:pypi/llama-index-readers-papers

Affected ranges

Type
ECOSYSTEM
Events
Introduced
0 (unknown introduced version; all previous versions are affected)
Fixed
0.3.1

Affected versions

0.*

0.0.1
0.1.0
0.1.1
0.1.2
0.1.3
0.1.4
0.1.5
0.1.6
0.2.0
0.3.0
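
To check whether an environment falls inside the affected range listed above, the installed version can be compared with the fixed release, 0.3.1. The following is a rough sketch under stated assumptions: `parse_release`, `is_affected`, and `installed_is_affected` are hypothetical helper names, and the parser handles only plain dotted numeric versions like those in the list. Only `importlib.metadata.version` is a real standard-library call.

```python
from importlib.metadata import PackageNotFoundError, version

PACKAGE = "llama-index-readers-papers"
FIXED = (0, 3, 1)  # first fixed release according to this advisory


def parse_release(v: str) -> tuple:
    """Parse 'X.Y.Z' into a tuple of ints (hypothetical, simplified parser).

    Only the leading digits of each component are used, so pre-release
    suffixes such as 'rc1' are ignored rather than ordered correctly.
    """
    parts = []
    for piece in v.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_affected(v: str) -> bool:
    """Return True if version string v is older than the fixed release."""
    return parse_release(v) < FIXED


def installed_is_affected():
    """Return True/False for the installed package, or None if not installed."""
    try:
        return is_affected(version(PACKAGE))
    except PackageNotFoundError:
        return None
```

For example, `is_affected("0.3.0")` is true and `is_affected("0.3.1")` is false, matching the range above.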