Advisory

Critical Path Traversal Flaw in Unstructured.io AI Library Enables Remote Code Execution

Take action: If you process mail attachments through AI pipelines, this advisory is important. Check whether you use Unstructured.io directly or through another package that imports it, and update the affected systems. If you cannot update right away, disable attachment processing in your code and implement controls to sanitize attachment filenames.


Learn More

Unstructured.io, an ETL library used by a significant number of Fortune 1000 companies for AI data processing, has patched a critical security flaw. The library transforms unstructured data like PDFs, emails, and images into AI-ready formats for vector databases.

The vulnerability, tracked as CVE-2025-64712 (CVSS score 9.8), is a path traversal flaw in the partition_msg function that allows attackers to write arbitrary files to the host system. The flaw exists because the AttachmentPartitioner.iter_elements component blindly concatenates the /tmp/ directory with unvalidated filenames from Microsoft Outlook .msg attachments. By crafting an attachment name like ../../etc/passwd, an attacker can escape the temporary directory and overwrite critical system files.
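
The unsafe pattern described above can be illustrated with a short sketch. This is not the library's actual code: the helper names naive_attachment_path and resolved are hypothetical, and the sketch only computes path strings without touching the filesystem. It shows how a directory prefix joined to an unvalidated name collapses to a location outside the intended directory.

```python
import posixpath

TMP_DIR = "/tmp/"


def naive_attachment_path(filename: str) -> str:
    """Hypothetical stand-in for the unsafe pattern: prefix + untrusted name."""
    return TMP_DIR + filename


def resolved(path: str) -> str:
    """Collapse '..' segments lexically, approximating where a write would land."""
    return posixpath.normpath(path)


# A benign attachment name stays inside the temporary directory.
benign = resolved(naive_attachment_path("report.pdf"))

# A crafted name climbs out of /tmp/ and targets a system file.
hostile = resolved(naive_attachment_path("../../etc/passwd"))

print(benign)   # /tmp/report.pdf
print(hostile)  # /etc/passwd
```

Because the final path depends entirely on attacker-controlled input, any file writable by the processing user becomes a potential target.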

Because the library is integrated into popular frameworks like LangChain and LlamaIndex, the vulnerability creates a supply chain risk for millions of AI deployments, including those referenced in AWS, Azure, and GCP production documentation.

The flaw allows attackers to overwrite SSH authorized_keys, cron jobs, or Python packages to gain persistent access to the underlying server. Because Unstructured.io pipelines often ingest large volumes of enterprise data, a compromise could lead to massive data exfiltration or lateral movement within corporate networks. The dependency is also frequently nested: it appears in approximately 100,000 GitHub files via LangChain, which makes tracking and patching the flaw difficult for security teams that may not realize they are running the vulnerable code.
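
One way to surface a nested dependency is to ask the Python environment which installed packages declare a requirement on the library. The sketch below uses only the standard library's importlib.metadata; the function names requirement_name and dependents_of are hypothetical. It inspects declared requirements (including optional extras) in the current environment only, so treat it as a starting point rather than a complete audit.

```python
import re
from importlib import metadata


def normalize(name: str) -> str:
    """Normalize a project name so 'Foo_Bar' and 'foo-bar' compare equal."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_name(requirement: str) -> str:
    """Extract the project name from a requirement string such as
    'unstructured[all-docs]>=0.10; extra == "docs"'."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    return normalize(match.group(1)) if match else ""


def dependents_of(target: str) -> list[str]:
    """List installed distributions that declare a requirement on `target`."""
    wanted = normalize(target)
    found = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if not name:
            continue  # skip distributions with broken metadata
        for req in dist.requires or []:
            if requirement_name(req) == wanted:
                found.add(name)
    return sorted(found)


if __name__ == "__main__":
    for package in dependents_of("unstructured"):
        print(f"{package} declares a dependency on unstructured")
```

Running this inside each deployed virtual environment or container image gives a quick view of which packages could pull the library in indirectly.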

This vulnerability affects all versions of the Unstructured library up to and including version 0.18.17. The flaw is active when the process_attachments parameter is set to True within the partition_msg function, which is invoked by MsgPartitioner.iter_message_elements to handle email elements. 
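
The version boundary stated above can be checked programmatically. The following is a minimal sketch, assuming the package is installed under the distribution name unstructured; the helpers parse_version and is_vulnerable are hypothetical and compare only the numeric release segments, so pre-release or development suffixes are ignored.

```python
from importlib import metadata
from typing import Optional

# Last affected release according to the advisory text.
LAST_VULNERABLE = (0, 18, 17)


def parse_version(version: str) -> tuple:
    """Parse leading numeric segments, e.g. '0.18.17' -> (0, 18, 17)."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_vulnerable(version: str) -> bool:
    """True if the version is at or below the last affected release."""
    release = parse_version(version)
    release = release + (0,) * (3 - len(release))  # pad '0.18' to (0, 18, 0)
    return release[:3] <= LAST_VULNERABLE


def installed_unstructured_version() -> Optional[str]:
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version("unstructured")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_unstructured_version()
    if found is None:
        print("unstructured is not installed in this environment")
    elif is_vulnerable(found):
        print(f"unstructured {found} is in the affected range")
    else:
        print(f"unstructured {found} is outside the affected range")
```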

Organizations using the open-source library on GitHub or integrating it via managed SaaS APIs are equally at risk if they process untrusted email files.

Organizations must upgrade to Unstructured version 0.18.18 immediately to resolve the path traversal issue. If an immediate upgrade is not possible, administrators should set process_attachments=False when handling untrusted .msg files or implement strict filename validation before processing.
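
For teams that need an interim control, strict filename validation could look like the sketch below. It is an illustration under stated assumptions, not the fix shipped in 0.18.18: the function names sanitize_attachment_name and safe_join, the allowed character set, and the fallback name are all hypothetical choices.

```python
import os
import re


def sanitize_attachment_name(raw_name: str, default: str = "attachment.bin") -> str:
    """Reduce an untrusted attachment name to a plain, safe basename."""
    # Treat both '/' and '\' as separators and keep only the last component.
    base = raw_name.replace("\\", "/").split("/")[-1]
    # Replace anything outside a conservative allow-list.
    base = re.sub(r"[^A-Za-z0-9._-]", "_", base)
    # Drop leading dots so '..' and hidden-file names cannot survive.
    base = base.lstrip(".")
    return base or default


def safe_join(base_dir: str, raw_name: str) -> str:
    """Join a sanitized name to base_dir and verify the result stays inside it."""
    root = os.path.realpath(base_dir)
    candidate = os.path.realpath(
        os.path.join(root, sanitize_attachment_name(raw_name))
    )
    if os.path.commonpath([root, candidate]) != root:
        raise ValueError(f"attachment path escapes {root!r}")
    return candidate


if __name__ == "__main__":
    print(sanitize_attachment_name("../../etc/passwd"))  # passwd
    print(sanitize_attachment_name(".."))                # attachment.bin
```

The containment check in safe_join is deliberately redundant with the sanitizer: if the allow-list is ever loosened, the second check still rejects paths that resolve outside the target directory. Where attachments are not needed at all, passing process_attachments=False to partition_msg, as the advisory recommends, avoids the vulnerable code path entirely.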
