A critical vulnerability, dubbed "Bleeding Llama" (CVE-2026-7482), has been discovered in **Ollama**, a popular tool for running large language models locally. This vulnerability allows **unauthenticated attackers to leak the entire memory of the Ollama process** 【1】.
### What Happened
The vulnerability is an **out-of-bounds heap read** 【1】. Ollama parses GGUF model files, whose metadata includes each tensor's shape. An attacker can craft a GGUF file with manipulated shape metadata, causing Ollama to read past the intended buffer into adjacent heap memory. The over-read data is then written to disk as part of a new model file 【1】. The root cause is Ollama's use of Go's `unsafe` package, which bypasses the language's standard memory-safety guarantees 【1】.
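The flawed pattern can be sketched in Go. The snippet below is illustrative only, not Ollama's actual code: `tensorView` is a hypothetical parser helper that derives an element count from attacker-supplied shape metadata. The two checks marked in comments are precisely what an unpatched parser would be missing before handing that count to `unsafe.Slice`.

```go
package main

import (
	"errors"
	"fmt"
	"unsafe"
)

// tensorView builds a byte view over a tensor payload, sized from GGUF shape
// metadata. A vulnerable parser would pass the attacker-controlled count
// straight to unsafe.Slice; the two guards below are what keep the view
// inside the payload instead of spilling into adjacent heap memory.
func tensorView(payload []byte, shape []uint64, elemSize uint64) ([]byte, error) {
	count := uint64(1)
	for _, d := range shape {
		// Guard 1: the dimension product must not overflow uint64.
		if d != 0 && count > ^uint64(0)/d {
			return nil, errors.New("shape overflow")
		}
		count *= d
	}
	if elemSize != 0 && count > ^uint64(0)/elemSize {
		return nil, errors.New("size overflow")
	}
	need := count * elemSize
	// Guard 2: the declared size must fit inside the actual payload.
	// Without this check, unsafe.Slice exposes adjacent heap memory.
	if need > uint64(len(payload)) {
		return nil, errors.New("shape exceeds payload")
	}
	if need == 0 {
		return nil, nil
	}
	return unsafe.Slice(&payload[0], need), nil
}

func main() {
	data := make([]byte, 8)
	if _, err := tensorView(data, []uint64{1 << 40, 1 << 40}, 2); err != nil {
		fmt.Println("rejected:", err)
	}
	if v, err := tensorView(data, []uint64{2, 2}, 2); err == nil {
		fmt.Println("accepted, view length:", len(v))
	}
}
```

The design point: `unsafe.Slice` trusts whatever length it is given, so every length derived from untrusted file metadata has to be validated against the real buffer first.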
### Who is Affected
The vulnerability potentially affects an estimated **300,000 servers globally** running Ollama 【1】. By default, Ollama listens on all interfaces (`0.0.0.0`) without authentication, leaving these servers directly reachable by attackers 【1】.
### Security Implications
The leaked memory can contain highly sensitive information, including:
- **User messages (prompts)** 【1】
- **System prompts** 【1】
- **Environment variables** from the host machine 【1】
In enterprise environments, this could lead to the exfiltration of **API keys, proprietary code, customer contracts**, and other confidential data 【1】. If Ollama is integrated with tools like Claude Code, the impact is even greater, since all tool outputs pass through the process and can be leaked 【1】. Attackers can exfiltrate the captured data by calling the `/api/push` endpoint with a crafted model name that points to an attacker-controlled server, uploading the model file that contains the leaked memory 【1】. The vulnerability carries a **CVSS score of 9.1**, critical severity 【1】.
### Technical Details
- **Vulnerability Type:** Out-of-bounds heap read 【1】.
- **Root Cause:** Exploitation of Go's `unsafe` package, allowing low-level memory operations 【1】.
- **Exploitation Method:**
1. An attacker crafts a GGUF file with a manipulated shape field.
2. Ollama parses this file and reads beyond the intended buffer, capturing sensitive data from the heap.
3. To preserve the leaked data, the attacker sets the tensor type to F16 and requests F32 as the target format, ensuring a lossless conversion 【1】.
4. The sensitive data is written to disk as part of a new model file.
5. The attacker uses the `/api/push` endpoint with a crafted model name (an attacker-controlled URI) to upload the compromised model file to their server 【1】.
- **Impact:** Complete leakage of Ollama process memory, including sensitive user and system data 【1】.
### What Defenders Should Know
- **Immediate Action Required:** Organizations using Ollama should **immediately update to a patched version** or take steps to mitigate the vulnerability.
- **Default Configuration Risk:** Be aware that Ollama's default configuration listens on all interfaces without authentication, exposing a significant number of installations 【1】.
- **Network Segmentation:** Consider restricting network access to Ollama instances, especially if they are not properly secured.
- **Vulnerability Disclosure Timeline:** The vulnerability was reported on February 2, 2026, and a fix was shared by Ollama on February 25, 2026. The CVE (CVE-2026-7482) was published on May 1, 2026 【1】. Ensure you are running a version that incorporates this fix.
- **Data Sensitivity:** Understand the sensitive data that Ollama processes and the potential impact of its leakage, particularly in enterprise settings 【1】.