LeftoverLocals vulnerability leaks LLM responses to other users on the same GPUs
Take action: The exploit is very specific and requires the attacker to run their own code on the same GPU as the victim. On shared GPU services, that is not far-fetched. If you architect shared GPU services, check whether your GPUs are affected by LeftoverLocals and apply the available fixes. If you use such services, confirm with your GPU provider that they protect you from LeftoverLocals.
A security vulnerability named 'LeftoverLocals' has been identified in graphics processing units (GPUs) from AMD, Apple, Qualcomm, and Imagination Technologies. It allows data to be retrieved from GPU local memory.
Tracked as CVE-2023-4969, the flaw is of particular concern for large language models (LLMs) and other machine learning (ML) workloads, because data they process on an affected GPU can be recovered. Trail of Bits researchers Tyler Sorensen and Heidy Khlaaf discovered the issue and reported it to the vendors before releasing a technical summary.
The vulnerability arises due to some GPU frameworks not fully isolating memory, enabling a kernel to read data from local memory left by another kernel. Attackers can exploit this by using GPU compute applications to access and dump data from uninitialized local memory. Specifically, an attacker's 'listener' kernel can read data left behind by another user's 'writer' kernel in the GPU's local memory, potentially revealing sensitive computational details such as model inputs, outputs, and intermediate computations.
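The writer/listener interaction described above can be illustrated with a toy simulation. This is a sketch only, not the Trail of Bits proof of concept: a Python list stands in for one compute unit's local memory, and the names `SimulatedComputeUnit`, `writer_kernel`, `listener_kernel`, and `CANARY` are hypothetical, chosen for illustration.

```python
# Toy simulation: a Python list stands in for a compute unit's local memory.
# All names here are hypothetical; no real GPU API is used.

LOCAL_MEM_WORDS = 16
CANARY = 0xC0FFEE


class SimulatedComputeUnit:
    """Models a compute unit whose local memory is NOT cleared between kernels."""

    def __init__(self, size=LOCAL_MEM_WORDS):
        self.local_mem = [0] * size

    def launch(self, kernel):
        # Vulnerable behaviour: each kernel sees whatever the previous one left.
        return kernel(self.local_mem)


def writer_kernel(local_mem):
    """Victim: uses local memory as scratch space and leaves its data behind."""
    for i in range(len(local_mem)):
        local_mem[i] = CANARY + i


def listener_kernel(local_mem):
    """Attacker: never initializes local memory, just dumps what it finds."""
    return list(local_mem)


cu = SimulatedComputeUnit()
cu.launch(writer_kernel)             # victim runs first
leaked = cu.launch(listener_kernel)  # attacker runs next on the same unit

leaked_words = sum(1 for i, w in enumerate(leaked) if w == CANARY + i)
print(f"listener recovered {leaked_words} of {LOCAL_MEM_WORDS} words")
```

On real hardware the listener would be a GPU compute kernel that copies uninitialized local memory out to global memory; the simulation only captures the ordering and the absence of a clearing step.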
Through a proof of concept, Trail of Bits demonstrated that an adversary could recover substantial amounts of data. For instance, on an AMD Radeon RX 7900 XT, about 5.5 MB could be leaked per GPU invocation, adding up to roughly 181 MB per LLM query when running a 7B model on llama.cpp, enough to reconstruct LLM responses accurately.
Trail of Bits reported the vulnerability to CERT/CC in September 2023 for coordinated disclosure and patching. Vendors have responded differently: some have issued fixes, while others, such as AMD and Qualcomm, are still working on effective mitigations. Imagination Technologies released a fix, but some of its GPUs remain vulnerable, as noted by Google in January 2024. By contrast, Intel, NVIDIA, and ARM reported that their GPUs are unaffected.
Trail of Bits suggests that GPU vendors should implement an automatic local memory clearing mechanism between kernel calls to ensure data isolation. Although this may cause some performance reduction, the security benefit is significant. Additional mitigations include avoiding multi-tenant GPU environments in sensitive scenarios and applying user-level safeguards.
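The effect of clearing local memory between kernel calls can be sketched with a small simulation. It assumes a single local memory region modeled as a Python list; `run_pair`, `SECRET`, and the `clear_between_kernels` flag are hypothetical names for illustration, not part of any vendor driver or API.

```python
# Toy simulation: compares a launch sequence that clears local memory
# between kernels with one that does not. All names are hypothetical.

SECRET = 0x5EC2E7
SIZE = 16


def run_pair(clear_between_kernels, size=SIZE, secret=SECRET):
    """Run a victim kernel, then a listener; return how many words leaked."""
    local_mem = [0] * size

    # Victim kernel: leaves sensitive values in local memory.
    for i in range(size):
        local_mem[i] = secret

    if clear_between_kernels:
        # Mitigation: zero local memory before the next kernel starts.
        for i in range(size):
            local_mem[i] = 0

    # Listener kernel: reads local memory without initializing it.
    observed = list(local_mem)
    return sum(1 for word in observed if word == secret)


unmitigated = run_pair(clear_between_kernels=False)
mitigated = run_pair(clear_between_kernels=True)
print(f"leaked words without clearing: {unmitigated}")
print(f"leaked words with clearing:    {mitigated}")
```

The extra clearing loop is also where the performance cost mentioned above comes from: every kernel boundary pays for a pass over local memory, whether or not the previous kernel stored anything sensitive.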