Advisory

Nvidia releases patches for DGX A100 system

Take action: If you run a DGX A100, restrict its baseboard management controller (BMC) and KVM interfaces to trusted networks; they should never be reachable from the open internet. A machine of this size is probably busy with AI training or other processing, so patching will require scheduled downtime. Lock down the interfaces first, then announce the need to patch and plan the downtime with your organization. Patching is not easy, but it is the wise choice, because the system is too valuable to leave open to compromise.
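
As a rough illustration of the "trusted networks only" rule, the sketch below checks whether a client address falls inside an allowlist of management networks. The network range and the function name `is_trusted_source` are hypothetical placeholders, not part of Nvidia's guidance; in practice this rule belongs in a firewall or switch ACL rather than in application code.

```python
import ipaddress

# Hypothetical management network (a documentation-only range); replace with your own.
TRUSTED_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def is_trusted_source(source_ip, trusted=TRUSTED_NETWORKS):
    """Return True if source_ip belongs to one of the trusted networks."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in net for net in trusted)


if __name__ == "__main__":
    for ip in ("192.0.2.15", "203.0.113.7"):
        verdict = "allow" if is_trusted_source(ip) else "block"
        print(f"{ip}: {verdict}")
```

Anything that is not explicitly listed is blocked, which matches the default-deny posture recommended for BMC and KVM interfaces.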


Learn More

Nvidia has released patches for eleven firmware vulnerabilities, three of which are rated critical. These vulnerabilities affect the baseboard management controller (BMC) of Nvidia's DGX A100 systems, particularly the keyboard, video, and mouse (KVM) daemon. The Nvidia DGX A100 is a 6U server system designed for high-intensity GPU processing. It can accept up to 8 Nvidia A100 GPUs.

Affected versions are all BMC versions prior to 00.22.05 and all SBIOS versions prior to 1.25.
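
A minimal sketch of how an inventory script might flag affected firmware, using the two version thresholds stated above. It assumes version strings are purely numeric and dot-separated; how the installed versions are collected is out of scope, and the helper names `parse_version` and `is_affected` are illustrative only.

```python
from typing import Tuple

# First fixed versions, taken from the advisory text.
FIXED_BMC = "00.22.05"
FIXED_SBIOS = "1.25"


def parse_version(version: str) -> Tuple[int, ...]:
    """Split a dotted version string into integers, so '00.22.05' -> (0, 22, 5)."""
    return tuple(int(part) for part in version.strip().split("."))


def is_affected(installed: str, fixed: str) -> bool:
    """Return True if the installed version is older than the first fixed version."""
    a, b = parse_version(installed), parse_version(fixed)
    # Pad the shorter tuple with zeros so '1.25' and '1.25.0' compare as equal.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a < b


if __name__ == "__main__":
    print(is_affected("00.22.04", FIXED_BMC))
    print(is_affected("1.25", FIXED_SBIOS))
```

Comparing integer tuples instead of raw strings avoids the trap where "1.9" sorts after "1.25" lexically.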

The three critical vulnerabilities may lead to code execution, denial of service, information disclosure, and data tampering.

  • CVE-2023-31029 (CVSS score 9.3) - NVIDIA DGX A100 baseboard management controller (BMC) contains a vulnerability in the host KVM daemon, where an unauthenticated attacker may cause a stack overflow by sending a specially crafted network packet.
  • CVE-2023-31030 (CVSS score 9.3) - NVIDIA DGX A100 BMC contains a vulnerability in the host KVM daemon, where an unauthenticated attacker may cause a stack overflow by sending a specially crafted network packet.
  • CVE-2023-31024 (CVSS score 9.0) - NVIDIA DGX A100 BMC contains a vulnerability in the host KVM daemon, where an unauthenticated attacker may cause stack memory corruption by sending a specially crafted network packet.

In addition to the critical issues, Nvidia has disclosed two high-severity vulnerabilities, CVE-2023-25529 and CVE-2023-25530, in the KVM service of both the DGX H100 and DGX A100 models. These vulnerabilities involve a possible leak of session tokens and an input validation error.
