git.djapps.eu Git - pkg/ggml/sources/whisper.cpp/commit
mla : make the V tensor a view of K (llama/18986)
author Georgi Gerganov <redacted>
Thu, 22 Jan 2026 20:09:01 +0000 (22:09 +0200)
committer Georgi Gerganov <redacted>
Fri, 30 Jan 2026 13:56:40 +0000 (15:56 +0200)
commit 3f96a1da0e89e538f508fb641422f5de83bb4dc4
tree 41ee13819ee4637c0af77054d21e7d26a1669689
parent f21d0cbb1ac60ecbecba7a7e4294d812c6ab0fa2

* mla : pass V as a view of K to the FA op

* cuda : adjust mla logic to new layout

* kv-cache : fix rope shift

* tests : remove comment

* cuda : fix reusable_cutoff

Co-authored-by: Johannes Gäßler <redacted>
---------

Co-authored-by: Johannes Gäßler <redacted>
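
For readers unfamiliar with the idea in the subject line, the following is a minimal, self-contained C++ sketch of what "V as a view of K" can mean in terms of memory: V owns no storage and instead aliases part of every K row, keeping K's row stride. It is not code from this commit and does not use the ggml API. The `View2D` struct and `make_v_view_of_k` helper are hypothetical names, and the layout (V occupying the leading `d_v` elements of each K row) is an assumption made purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical non-owning 2D view (not a ggml type): rows x cols elements,
// with a row stride that may exceed cols when the view covers only part
// of each underlying row.
struct View2D {
    float *     data;       // first element of the view
    std::size_t rows;       // number of rows (e.g. cached tokens)
    std::size_t cols;       // elements visible per row
    std::size_t row_stride; // elements between the starts of consecutive rows

    float & at(std::size_t r, std::size_t c) const {
        assert(r < rows && c < cols);
        return data[r*row_stride + c];
    }
};

// Build V as a view over the leading d_v elements of every K row.
// ASSUMPTION: the part shared with V sits at the start of each K row;
// the real layout and offset used by the commit are not shown here.
inline View2D make_v_view_of_k(const View2D & k, std::size_t d_v) {
    assert(d_v <= k.cols);
    // Same base pointer and same stride as K, only fewer visible columns:
    // no data is copied, so writes through K are visible through V.
    return View2D{k.data, k.rows, d_v, k.row_stride};
}
```

With a single buffer of 3 rows by 8 floats wrapped as `View2D k{buf.data(), 3, 8, 8}`, calling `make_v_view_of_k(k, 6)` yields a V that reads the same memory as K. A consumer such as an attention kernel can in principle detect this aliasing by comparing base pointers and strides and then avoid loading the shared data twice, though whether and how the kernels in this commit do so is not stated in the message above.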
ggml/src/ggml-cuda/fattn-common.cuh
ggml/src/ggml-cuda/fattn-mma-f16.cuh