How many security vulnerabilities were discovered in NVIDIA Triton Inference Server?

NVIDIA patched a total of 17 security vulnerabilities in Triton Inference Server, ranging from critical vulnerabilities with CVSS scores of 9.8 down to lower severity issues with scores of 4.4.

What are the most critical vulnerabilities in NVIDIA Triton Inference Server?

The most critical vulnerabilities are CVE-2025-23310 and CVE-2025-23311, both with CVSS scores of 9.8. These are stack buffer overflow vulnerabilities affecting Windows and Linux that can lead to remote code execution, denial of service, information disclosure, and data tampering through specially crafted inputs or HTTP requests.

How does the NVIDIA Triton vulnerability chain attack work?

The attack works in three steps: 1) Information Disclosure - attackers send oversized requests to leak internal memory region names, 2) Shared Memory API Abuse - using leaked names to gain unauthorized access to internal memory, and 3) Remote Code Execution - corrupting data structures and crafting malicious messages to achieve full system compromise.

Which CVEs are part of the NVIDIA Triton vulnerability chain?

The vulnerability chain consists of CVE-2025-23319 (CVSS 8.1) for Python backend out-of-bounds write, CVE-2025-23320 (CVSS 7.5) for shared memory limit bypass leading to information disclosure, and CVE-2025-23334 (CVSS 5.9) for Python backend out-of-bounds read vulnerability.

What are the potential impacts of the NVIDIA Triton vulnerabilities?

The vulnerabilities can lead to AI model theft, data breaches, response manipulation, network lateral movement, remote code execution, denial of service, information disclosure, and data tampering. Attackers can achieve complete control over vulnerable servers and compromise valuable AI models and sensitive data.

Which versions of NVIDIA Triton Inference Server are affected?

Different vulnerabilities affect different version ranges: some affect all versions prior to 25.05, others affect versions prior to 25.06, and the most critical vulnerabilities (including the chain attack components) affect all versions prior to 25.07.

What makes the NVIDIA Triton vulnerability chain particularly dangerous?

The vulnerability chain is particularly dangerous because it allows remote, unauthenticated attackers to achieve complete system compromise by chaining together seemingly minor flaws. The attack exploits legitimate API functionality and can bypass typical security measures, making it difficult to detect and prevent.

Advisory

Vulnerability chain in NVIDIA Triton Inference Server enables complete AI server takeover

What is NVIDIA Triton Inference Server and how widely is it used?

NVIDIA Triton Inference Server is a widely-used open-source platform for deploying artificial intelligence models. It is used by over 25,000 companies globally, including major enterprises such as Microsoft, Amazon, Oracle, Siemens, and American Express for optimizing AI model deployment.

published: Aug. 4, 2025

Take action: If you're using NVIDIA Triton Inference Server for AI model deployment, plan a quick upgrade to version 25.07 or newer. In the meantime, make sure it's isolated from the internet and accessible only from trusted networks.

Learn More

NVIDIA has patched a series of security vulnerabilities in its Triton Inference Server, a widely-used open-source platform for deploying artificial intelligence models. Some of the vulnerabilities can be chained together and enable remote, unauthenticated attackers to achieve complete control over vulnerable servers, potentially compromising valuable AI models and sensitive data processed by these systems.

NVIDIA Triton Inference Server is used by over 25,000 companies globally, including major enterprises such as Microsoft, Amazon, Oracle, Siemens, and American Express for optimizing AI model deployment.

Vulnerability Chain Attack

The vulnerability chain was discovered by security researchers at Wiz, who showed how seemingly minor flaws can be combined into a complete exploit. The chained vulnerabilities are:

CVE-2025-23319 (CVSS score 8.1) - A Python backend out-of-bounds write vulnerability exploitable through malicious requests, central to the vulnerability chain discovered by Wiz Research.
CVE-2025-23320 (CVSS score 7.5) - A Python backend vulnerability where attackers can exceed shared memory limits through oversized requests, leading to information disclosure.
CVE-2025-23334 (CVSS score 5.9) - A Python backend out-of-bounds read vulnerability, part of the Wiz Research vulnerability chain.

Step 1: Information Disclosure - Attackers send crafted, oversized requests to the Python backend, triggering an exception that causes error messages to leak the full name of the backend's internal IPC shared memory region (e.g., "triton_python_backend_shm_region_4f50c226-b3d0-46e8-ac59-d4690b28b859").

Step 2: Shared Memory API Abuse - Using the leaked memory name, attackers exploit Triton's legitimate shared memory API, which lacks validation to distinguish between user-owned and internal memory regions, allowing them to register the internal shared memory key and gain read/write access to the Python backend's private memory space.

Step 3: Remote Code Execution - With memory access established, attackers can corrupt data structures, manipulate pointers for out-of-bounds memory access, and craft malicious IPC messages to achieve full system compromise, potentially leading to AI model theft, data breaches, response manipulation, and network lateral movement.
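The first two steps depend on an internal region name leaking through an error message and then being accepted by the shared memory registration API. The following is a minimal defensive sketch of those two checks, not NVIDIA's actual patch: the prefix is taken from the example name quoted above, and the function names (`is_internal_region`, `validate_registration`, `redact_error`) are hypothetical.

```python
import re

# Assumption: internal regions share this prefix, as in the example name above.
INTERNAL_SHM_PREFIX = "triton_python_backend_shm_region_"

# The prefix followed by a UUID-like suffix (hex digits and hyphens).
_INTERNAL_NAME_RE = re.compile(re.escape(INTERNAL_SHM_PREFIX) + r"[0-9a-fA-F-]+")


def is_internal_region(key: str) -> bool:
    """Return True if a shared memory key looks like a backend-private region."""
    # POSIX shared memory names are often written with a leading slash.
    return key.lstrip("/").startswith(INTERNAL_SHM_PREFIX)


def validate_registration(key: str) -> None:
    """Reject user requests that try to register a backend-private region."""
    if is_internal_region(key):
        # The message deliberately does not echo the key back to the caller.
        raise PermissionError("shared memory key refers to a reserved internal region")


def redact_error(message: str) -> str:
    """Strip internal region names from an error message before returning it."""
    return _INTERNAL_NAME_RE.sub("<internal-shm-region>", message)
```

A gateway or proxy in front of the server could apply `validate_registration` to incoming registration requests and `redact_error` to outgoing error bodies. This only narrows the exposure described in steps 1 and 2; upgrading remains the actual fix.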

NVIDIA has patched these flaws as part of a larger update that addresses the following vulnerabilities:

CVE-2025-23310 (CVSS score 9.8) - A stack buffer overflow vulnerability affecting NVIDIA Triton Inference Server for Windows and Linux, where attackers can cause overflow through specially crafted inputs, potentially leading to remote code execution, denial of service, information disclosure, and data tampering.
CVE-2025-23311 (CVSS score 9.8) - Another stack buffer overflow vulnerability where attackers can exploit the system through specially crafted HTTP requests, with similar impacts including remote code execution, denial of service, information disclosure, and data tampering.
CVE-2025-23317 (CVSS score 9.1) - A vulnerability in the HTTP server component allowing attackers to start a reverse shell by sending specially crafted HTTP requests, potentially resulting in remote code execution, denial of service, data tampering, and information disclosure.
CVE-2025-23318 (CVSS score 8.1) - A Python backend vulnerability causing out-of-bounds write conditions that can lead to code execution, denial of service, data tampering, and information disclosure.
CVE-2025-23319 (CVSS score 8.1) - Another Python backend out-of-bounds write vulnerability exploitable through malicious requests, central to the vulnerability chain discovered by Wiz Research.
CVE-2025-23320 (CVSS score 7.5) - A Python backend vulnerability where attackers can exceed shared memory limits through oversized requests, leading to information disclosure.
CVE-2025-23321 (CVSS score 7.5) - A divide-by-zero vulnerability triggered by invalid requests, causing denial of service.
CVE-2025-23322 (CVSS score 7.5) - A double-free vulnerability occurring when streams are cancelled before processing, resulting in denial of service.
CVE-2025-23323 (CVSS score 7.5) - An integer overflow vulnerability causing segmentation faults through invalid requests.
CVE-2025-23324 (CVSS score 7.5) - Another integer overflow vulnerability with similar impacts to CVE-2025-23323.
CVE-2025-23325 (CVSS score 7.5) - An uncontrolled recursion vulnerability exploitable through specially crafted inputs.
CVE-2025-23326 (CVSS score 7.5) - An integer overflow vulnerability causing denial of service through crafted inputs.
CVE-2025-23327 (CVSS score 7.5) - An integer overflow vulnerability that can lead to both denial of service and data tampering.
CVE-2025-23331 (CVSS score 7.5) - A memory allocation vulnerability causing segmentation faults through invalid requests.
CVE-2025-23333 (CVSS score 5.9) - A Python backend out-of-bounds read vulnerability through shared memory manipulation.
CVE-2025-23334 (CVSS score 5.9) - Another Python backend out-of-bounds read vulnerability, part of the Wiz Research vulnerability chain.
CVE-2025-23335 (CVSS score 4.4) - A TensorRT backend underflow vulnerability affecting specific model configurations.
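To help with triage, the scores listed above can be grouped into qualitative severity bands. The sketch below assumes the scores are CVSS v3.x base scores (the advisory text does not state the version) and uses the standard v3.x rating thresholds; the `SCORES` table simply restates the list above.

```python
from collections import defaultdict


def cvss_severity(score: float) -> str:
    """Map a CVSS v3.x base score to its qualitative severity rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError("CVSS scores range from 0.0 to 10.0")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"


# Scores as listed in this advisory.
SCORES = {
    "CVE-2025-23310": 9.8, "CVE-2025-23311": 9.8, "CVE-2025-23317": 9.1,
    "CVE-2025-23318": 8.1, "CVE-2025-23319": 8.1, "CVE-2025-23320": 7.5,
    "CVE-2025-23321": 7.5, "CVE-2025-23322": 7.5, "CVE-2025-23323": 7.5,
    "CVE-2025-23324": 7.5, "CVE-2025-23325": 7.5, "CVE-2025-23326": 7.5,
    "CVE-2025-23327": 7.5, "CVE-2025-23331": 7.5, "CVE-2025-23333": 5.9,
    "CVE-2025-23334": 5.9, "CVE-2025-23335": 4.4,
}

# Group CVE identifiers by severity band.
by_severity = defaultdict(list)
for cve, score in SCORES.items():
    by_severity[cvss_severity(score)].append(cve)

for band in ("Critical", "High", "Medium", "Low"):
    print(f"{band}: {len(by_severity[band])}")
```

Under these assumptions, the 17 entries split into 3 critical, 11 high, and 3 medium findings.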

Affected Systems and Versions

CVE-2025-23323, CVE-2025-23324, CVE-2025-23325, CVE-2025-23326, CVE-2025-23327, and CVE-2025-23335 affect all versions prior to 25.05.
CVE-2025-23322 and CVE-2025-23331 impact all versions prior to 25.06.
CVE-2025-23310, CVE-2025-23311, CVE-2025-23317, CVE-2025-23318, CVE-2025-23319, CVE-2025-23320, CVE-2025-23321, CVE-2025-23333, and CVE-2025-23334 affect all versions prior to 25.07.
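The version ranges above can be turned into a quick lookup that reports which of these CVEs still apply to a given release. This is a minimal sketch that assumes release labels follow the `YY.MM` pattern used in this advisory; the helper names `parse_version` and `unpatched_cves` are hypothetical, and how you obtain the running version is left to your deployment.

```python
# First fixed release -> CVEs fixed in it, as listed in this advisory.
FIXED_IN = {
    (25, 5): [
        "CVE-2025-23323", "CVE-2025-23324", "CVE-2025-23325",
        "CVE-2025-23326", "CVE-2025-23327", "CVE-2025-23335",
    ],
    (25, 6): ["CVE-2025-23322", "CVE-2025-23331"],
    (25, 7): [
        "CVE-2025-23310", "CVE-2025-23311", "CVE-2025-23317",
        "CVE-2025-23318", "CVE-2025-23319", "CVE-2025-23320",
        "CVE-2025-23321", "CVE-2025-23333", "CVE-2025-23334",
    ],
}


def parse_version(version: str) -> tuple[int, int]:
    """Parse a 'YY.MM' release label into a comparable (year, month) tuple."""
    year, month = version.strip().split(".")[:2]
    return int(year), int(month)


def unpatched_cves(version: str) -> list[str]:
    """Return the CVEs from this advisory that affect the given release."""
    current = parse_version(version)
    return sorted(
        cve
        for fixed, cves in FIXED_IN.items()
        if current < fixed  # affected when older than the first fixed release
        for cve in cves
    )


if __name__ == "__main__":
    for release in ("25.04", "25.05", "25.06", "25.07"):
        print(release, len(unpatched_cves(release)))
```

Comparing tuples rather than strings or floats avoids ordering mistakes around year boundaries, for example between `24.12` and `25.01`.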

Users should upgrade to NVIDIA Triton Inference Server version 25.07 or newer. Organizations should also follow the Secure Deployment Considerations Guide and ensure that logging and shared memory APIs are protected for use by authorized users only.
