git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
mla : make the V tensor a view of K (#18986)
author Georgi Gerganov <redacted>
Thu, 22 Jan 2026 20:09:01 +0000 (22:09 +0200)
committer GitHub <redacted>
Thu, 22 Jan 2026 20:09:01 +0000 (22:09 +0200)
commit a5eaa1d6a3732bc0f460b02b61c95680bba5a012
tree eb60495a5339d1bc6473c900299039d295125ce7
parent e2baf02162382a14c9f4fc15d7681a715256453c

* mla : pass V as a view of K to the FA op

* cuda : adjust mla logic to new layout

* kv-cache : fix rope shift

* tests : remove comment

* cuda : fix reusable_cutoff

Co-authored-by: Johannes Gäßler <redacted>
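
The idea named in the subject line, V as a view of K, can be pictured with a minimal sketch. This is plain C++, not the ggml code touched by this commit: the `View2D` struct, the helper names, the sizes, and the assumption that the shared part sits at the start of each K row are all hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sizes for illustration only (not taken from the commit).
constexpr std::size_t N_SHARED = 4; // width of the part used by both K and V
constexpr std::size_t N_ROPE   = 2; // width of the K-only positional part
constexpr std::size_t K_ROW    = N_SHARED + N_ROPE;

// A non-owning 2D view: `rows` rows of `width` floats, `stride` floats apart.
struct View2D {
    const float * data;
    std::size_t   width;
    std::size_t   stride;
    std::size_t   rows;

    float at(std::size_t row, std::size_t col) const {
        return data[row * stride + col];
    }
};

// K sees every element of each cached row.
View2D make_k_view(const std::vector<float> & cache, std::size_t n_tokens) {
    return { cache.data(), K_ROW, K_ROW, n_tokens };
}

// V aliases the same memory as K: same row stride, narrower width.
// `offset` (in floats) selects where the shared part starts within a row;
// 0 is an assumption of this sketch, not a statement about the real layout.
View2D make_v_view(const View2D & k, std::size_t offset = 0) {
    return { k.data + offset, N_SHARED, k.stride, k.rows };
}
```

Because V is only a strided window into the K buffer, no separate V buffer has to be stored or copied; a consumer such as an attention kernel just has to honor the row stride instead of assuming V rows are tightly packed.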
ggml/src/ggml-cuda/fattn-common.cuh
ggml/src/ggml-cuda/fattn-mma-f16.cuh
src/llama-graph.cpp
src/llama-kv-cache.cpp
src/models/deepseek2.cpp
src/models/minicpm3.cpp
src/models/plm.cpp
tests/test-backend-ops.cpp