AMD, Apple, Qualcomm GPUs leak AI data in LeftoverLocals attacks

A new vulnerability dubbed ‘LeftoverLocals’ affecting graphics processing units from AMD, Apple, Qualcomm, and Imagination Technologies allows retrieving data from the local memory space.

Tracked as CVE-2023-4969, the security issue enables data recovery from vulnerable GPUs, especially in the context of large language models (LLMs) and machine learning (ML) processes.

LeftoverLocals was discovered by Trail of Bits researchers Tyler Sorensen and Heidy Khlaaf, who reported it privately to the vendors before publishing a technical overview.

LeftoverLocals details

The security flaw stems from the fact that some GPU frameworks do not isolate memory completely, so one kernel running on the machine can read values in local memory written by another kernel.

The Trail of Bits researchers explain that an adversary only needs to run a GPU compute application (e.g. OpenCL, Vulkan, Metal) to read data a user left in the GPU local memory.

“Using these, the attacker can read data that the victim has left in the GPU local memory simply by writing a GPU kernel that dumps uninitialized local memory” – Trail of Bits

LeftoverLocals lets attackers launch a ‘listener’, a GPU kernel that reads from uninitialized local memory and can dump the data in a persistent location, such as the global memory.

If the local memory is not cleared, the attacker can use the listener to read values left behind by the ‘writer’, a program that stores values to local memory.
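
The writer and listener roles can be illustrated with a toy simulation. The sketch below runs entirely on the CPU and is not GPU code or the researchers' proof of concept. It assumes a hypothetical `SimulatedGPU` class whose local memory is a byte buffer that persists across kernel launches, and the names `writer_kernel` and `listener_kernel` are invented for illustration.

```python
# Toy CPU-side simulation of the writer/listener pattern; this is not GPU code.
# All names here (SimulatedGPU, writer_kernel, listener_kernel) are hypothetical.

LOCAL_MEMORY_SIZE = 16  # assumed size, in bytes, of the simulated local memory


class SimulatedGPU:
    """Models a device whose local memory persists across kernel launches."""

    def __init__(self):
        self.local_memory = bytearray(LOCAL_MEMORY_SIZE)
        self.global_memory = {}  # persistent storage, keyed by a label

    def launch(self, kernel, *args):
        # The flaw being modeled: local memory is NOT cleared before the launch.
        return kernel(self, *args)


def writer_kernel(gpu, secret):
    """Victim: stores values in local memory and leaves them there."""
    data = secret[:LOCAL_MEMORY_SIZE]
    gpu.local_memory[: len(data)] = data


def listener_kernel(gpu):
    """Attacker: copies local memory it never wrote into global memory."""
    gpu.global_memory["dump"] = bytes(gpu.local_memory)


gpu = SimulatedGPU()
gpu.launch(writer_kernel, b"model weights")  # victim runs first
gpu.launch(listener_kernel)                  # attacker runs afterwards
leaked = gpu.global_memory["dump"]
print(leaked.rstrip(b"\x00"))
```

In this model the listener never receives the secret as an argument. It recovers it only because the buffer still holds what the writer left behind.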

The animation below shows how the writer and listener programs interact and how the latter can retrieve data from the former on affected GPUs.

[Animation: writer and listener processes]

The recovered data can reveal sensitive information about the victim’s computations, including model inputs, outputs, weights, and intermediate computations.

In a multi-tenant GPU context that runs LLMs, LeftoverLocals can be used to listen in on other users’ interactive sessions and recover the data left in the GPU’s local memory by the victim’s “writer” process.

The Trail of Bits researchers have created a proof of concept (PoC) to demonstrate LeftoverLocals and showed that an adversary can recover 5.5 MB of data per GPU invocation, depending on the GPU framework.

On an AMD Radeon RX 7900 XT running an open-source LLM through llama.cpp, an attacker can get as much as 181 MB per query, which is sufficient to reconstruct the LLM’s responses with high accuracy.

Impact and remediation

Trail of Bits researchers discovered CVE-2023-4969 in September 2023 and informed CERT/CC to help coordinate the disclosure and patching efforts.

Mitigation efforts are underway: some vendors have already fixed the issue, while others are still working on a way to develop and implement a defense mechanism.

In the case of Apple, the latest iPhone 15 is unaffected and fixes became available for A17 and M3 processors, but the issue persists on M2-powered computers.

AMD said that the following GPU models remain vulnerable as its engineers investigate effective mitigation strategies.

Qualcomm has released a patch via firmware v2.0.7 that fixes LeftoverLocals in some chips, but others remain vulnerable.

Imagination released a fix in DDK v23.3 in December 2023. However, Google warned in January 2024 that some of the vendor’s GPUs are still impacted.

Intel, NVIDIA, and ARM have reported that the data leak problem does not affect their devices.

Trail of Bits suggests that GPU vendors implement an automatic local memory clearing mechanism between kernel calls, ensuring isolation of sensitive data written by one process.

While this approach might introduce some performance overhead, the researchers suggest that the trade-off is justified given the severity of the security implications.
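
The idea of clearing local memory between kernel calls, and the extra work it implies, can be sketched with a small CPU-side model. This is only an illustration under stated assumptions, not driver code or any vendor's actual fix. The `ClearingGPU` class, its `bytes_cleared` counter (a crude stand-in for overhead), and the two kernel functions are all hypothetical.

```python
# Toy CPU-side simulation; not GPU driver code. All names are hypothetical.

class ClearingGPU:
    """Models a device that zeroes local memory before every kernel launch."""

    def __init__(self, local_memory_size=16):
        self.local_memory = bytearray(local_memory_size)
        self.bytes_cleared = 0  # crude stand-in for the clearing overhead

    def launch(self, kernel, *args):
        # Mitigation being modeled: wipe local memory between kernel calls.
        size = len(self.local_memory)
        self.local_memory[:] = bytes(size)
        self.bytes_cleared += size
        return kernel(self.local_memory, *args)


def write_secret(local_memory, secret):
    """Victim kernel: leaves data behind in local memory."""
    data = secret[: len(local_memory)]
    local_memory[: len(data)] = data


def dump_local_memory(local_memory):
    """Attacker kernel: returns whatever local memory currently holds."""
    return bytes(local_memory)


gpu = ClearingGPU()
gpu.launch(write_secret, b"user prompt")
observed = gpu.launch(dump_local_memory)
print(observed)           # only zero bytes reach the second kernel
print(gpu.bytes_cleared)  # 32: two launches, 16 bytes wiped each time
```

Every launch pays for a full wipe whether or not the previous kernel stored anything sensitive, which is the kind of cost the researchers consider acceptable.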

Other potential mitigations include avoiding multi-tenant GPU environments in security-critical scenarios and implementing user-level mitigations.
