* context: zero output buffer on allocation
Address GHSA-wqq9-25mr-rw76.
The logits output buffer allocated in output_reserve() uses
posix_memalign(), which does not zero memory. The buffer is only
written during decode when needs_raw_logits() returns true. When
backend samplers cover all output sequences, needs_raw_logits()
returns false and the buffer is never written, but
llama_get_logits() still returns a pointer to it, exposing stale
heap content.
Zero the buffer after allocation to prevent information disclosure
through the public logits API.
Found-by: Pwno
* Update src/llama-context.cpp
Co-authored-by: Georgi Gerganov <redacted>
---------
Co-authored-by: Georgi Gerganov <redacted>
LLAMA_LOG_ERROR("%s: failed to allocate output buffer of size %.2f MiB\n", __func__, new_size / (1024.0 * 1024.0));
return 0;
}
+ ggml_backend_buffer_clear(buf_output.get(), 0);
}
float * output_base = (float *) ggml_backend_buffer_get_base(buf_output.get());