git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit

author	Tom Hillbrunner <redacted>
	Sat, 21 Mar 2026 17:35:00 +0000 (18:35 +0100)
committer	GitHub <redacted>
	Sat, 21 Mar 2026 17:35:00 +0000 (19:35 +0200)
commit	212f4521b013a3eeb79e15df7ca07a5329d39d4b
tree	81e9bc5790a292fd52b763bedcd96ff0f2e30662
parent	568aec82d2fc48341c54cae565768ac75072a31d

context : use n_embd_out for pooled embedding extraction (#20840)

The MEAN/CLS/LAST pooling paths in encode() and decode() used
n_embd_inp() (16384 for qwen3vl with deepstack) as the per-sequence
size when reading from the pooled embedding tensor, which only has
n_embd_out() (4096) floats per sequence. This triggered a "tensor read
out of bounds" assertion. These paths now use n_embd_out().

Fixes embedding mode for Qwen3-VL-Embedding models.
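
The size mismatch described above can be illustrated with a small standalone sketch. It is not the code from src/llama-context.cpp: pooled_tensor, read_in_bounds and extract_pooled are hypothetical names, and the bounds check only assumes that a tensor read requires offset + size to fit within the tensor's byte size.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the pooled embedding tensor:
// one row of n_embd_out floats per sequence, stored contiguously.
struct pooled_tensor {
    std::vector<float> data;
    size_t nbytes() const { return data.size() * sizeof(float); }
};

// True if reading n_floats floats for sequence seq_idx stays inside the tensor.
// Assumed check: offset + size must not exceed the tensor's byte size.
bool read_in_bounds(const pooled_tensor & t, size_t seq_idx, size_t n_floats) {
    const size_t offset = seq_idx * n_floats * sizeof(float);
    const size_t size   = n_floats * sizeof(float);
    return offset + size <= t.nbytes();
}

// Copies one sequence's pooled embedding. The row width passed in must be
// the output embedding size, not the input embedding size.
std::vector<float> extract_pooled(const pooled_tensor & t, size_t seq_idx, size_t n_embd_out) {
    assert(read_in_bounds(t, seq_idx, n_embd_out) && "tensor read out of bounds");
    const float * row = t.data.data() + seq_idx * n_embd_out;
    return std::vector<float>(row, row + n_embd_out);
}
```

With the sizes quoted in the commit message, a tensor holding two sequences has 2 * 4096 floats. Under these assumptions, read_in_bounds(t, 0, 16384) is false even for the first sequence, while read_in_bounds(t, 1, 4096) is true.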

src/llama-context.cpp

Packaging of ggml-org/llama.cpp