git.djapps.eu Git - pkg/ggml/sources/ggml/commit
mla : make the V tensor a view of K (llama/18986)
author Georgi Gerganov <redacted>
Thu, 22 Jan 2026 20:09:01 +0000 (22:09 +0200)
committer Georgi Gerganov <redacted>
Fri, 30 Jan 2026 11:49:29 +0000 (13:49 +0200)
commit 91d417c5954c1cb45d83795f23c8b458730731c1
tree 30bc77786e23a52bee9d9df3c9236eedaa0eacc7
parent 544f15d98337dd2b5d5633af88dacdd28ce2d46a

* mla : pass V as a view of K to the FA op

* cuda : adjust mla logic to new layout

* kv-cache : fix rope shift

* tests : remove comment

* cuda : fix reusable_cutoff

---------

Co-authored-by: Johannes Gäßler <redacted>
src/ggml-cuda/fattn-common.cuh
src/ggml-cuda/fattn-mma-f16.cuh
tests/test-backend-ops.cpp
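
The idea named in the first bullet of the commit message, passing V as a view of K, can be illustrated with a minimal, self-contained sketch. It does not use the ggml API and does not reproduce the actual change: `View2D`, `view_cols`, the sizes, and the offset of V inside each K row are all hypothetical, chosen only to show that a view shares storage with its parent and keeps the parent's row stride.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical non-owning 2D view over a row-major float buffer.
// This is an illustration only; it is not a ggml type.
struct View2D {
    const float * data;   // first element visible through the view
    size_t        ne0;    // elements per row visible through the view
    size_t        ne1;    // number of rows (e.g. cached tokens)
    size_t        stride; // distance between consecutive rows, in elements

    float at(size_t i0, size_t i1) const { return data[i1*stride + i0]; }
};

// Hypothetical helper: expose columns [offset, offset + ne0) of every row of
// `k` as a new view. Nothing is copied; the row stride of `k` is kept, so the
// resulting rows are not contiguous in memory when ne0 < k.ne0.
View2D view_cols(const View2D & k, size_t offset, size_t ne0) {
    assert(offset + ne0 <= k.ne0);
    return { k.data + offset, ne0, k.ne1, k.stride };
}

// Wrap a buffer holding n_rows rows of d_k floats as the "K" view.
View2D make_k(const std::vector<float> & buf, size_t d_k, size_t n_rows) {
    assert(buf.size() >= d_k*n_rows);
    return { buf.data(), d_k, n_rows, d_k };
}
```

With a buffer of 3 rows of 6 floats, `view_cols(make_k(buf, 6, 3), 0, 4)` yields a "V" whose rows are the first 4 elements of each K row. Writing to `buf` is visible through both views, which is why a consumer such as an attention kernel has to index V with K's row stride rather than assume tightly packed rows. The offset of 0 used here is an assumption for the sketch, not a statement about the layout in this commit.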