SGLang CVE-2026-5760 (CVSS 9.8) Permits RCE by way of Malicious GGUF Mannequin Recordsdata

Ravie LakshmananApr 20, 2026Open Supply / Server Safety

A essential safety vulnerability has been disclosed in SGLang that, if efficiently exploited, may end in distant code execution on vulnerable methods.

The vulnerability, tracked as CVE-2026-5760, carries a CVSS rating of 9.8 out of 10.0. It has been described as a case of command injection resulting in the execution of arbitrary code.

SGLang is a high-performance, open-source serving framework for giant language fashions and multimodal fashions. The official GitHub undertaking has been forked over 5,500 occasions and starred 26,100 occasions.

In keeping with the CERT Coordination Middle (CERT/CC), the vulnerability impacts the reranking endpoint “/v1/rerank,” permitting an attacker to attain arbitrary code execution within the context of the SGLang service via a specifically crafted GPT-Generated Unified Format (GGUF) mannequin file.

“An attacker exploits this vulnerability by making a malicious GPT Generated Unified Format (GGUF) mannequin file with a crafted tokenizer.chat_template parameter that accommodates a Jinja2 server-side template injection (SSTI) payload with a set off phrase to activate the weak code path,” CERT/CC stated in an advisory launched right now.

“The sufferer then downloads and masses the mannequin in SGLang, and when a request hits the “/v1/rerank” endpoint, the malicious template is rendered, executing the attacker’s arbitrary Python code on the server. This sequence of occasions permits the attacker to attain distant code execution (RCE) on the SGLang server.”

Per safety researcher Stuart Beck, who found and reported the flaw, the underlying subject stems from the usage of jinja2.Setting() with out sandboxing as a substitute of ImmutableSandboxedEnvironment. This, in flip, permits a malicious mannequin to execute arbitrary Python code on the inference server.

Your entire sequence of actions is as follows –

An attacker creates a GGUF mannequin file with a malicious tokenizer.chat_template containing a Jinja2 SSTI payload
The template contains the Qwen3 reranker set off phrase to activate the weak code path in “entrypoints/openai/serving_rerank.py”
Sufferer downloads and masses the mannequin in SGLang from sources like Hugging Face
When a request hits the “/v1/rerank” endpoint, SGLang reads the chat_template and renders it with jinja2.Setting()
The SSTI payload executes arbitrary Python code on the server

It is price noting that CVE-2026-5760 falls below the identical vulnerability class as CVE-2024-34359 (aka Llama Drama, CVSS rating: 9.7), a now-patched essential flaw within the llama_cpp_python Python package deal that might have resulted in arbitrary code execution. The identical assault floor was additionally rectified in vLLM late final yr (CVE-2025-61620, CVSS rating: 6.5).

“To mitigate this vulnerability, it is suggested to make use of ImmutableSandboxedEnvironment as a substitute of jinja2.Setting() to render the chat templates,” CERT/CC stated. “This can forestall the execution of arbitrary Python code on the server. No response or patch was obtained through the coordination course of.”

Source link