git.djapps.eu Git - pkg/ggml/sources/ggml/commit
ggml-cpu: FA split across kv for faster TG (llama/19209)
author Aman Gupta <redacted>
Mon, 2 Feb 2026 17:19:55 +0000 (01:19 +0800)
committer Georgi Gerganov <redacted>
Sat, 7 Feb 2026 08:37:38 +0000 (10:37 +0200)
commit 218a005716951e6dffe800dc310e41024eeda779
tree 1534aba61f2aec219c93d0cfb752b883e9374c28
parent 4c3c93f7185daebe8747d3e8b0728a4494d4bcc9
ggml-cpu: FA split across kv for faster TG (llama/19209)

* ggml-cpu: split across kv for faster TG

* simplify sinks application

* add ref impl
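
For readers unfamiliar with the idea in the subject line, here is a minimal sketch of splitting attention across the KV dimension for a single query, as happens during token generation. It is not the code from this commit. The names `Partial`, `attend_chunk`, `merge_partials` and `attend_split` are hypothetical, and the sketch assumes no score scaling, masking, or sinks. Each chunk keeps a running max, a running sum of exponentials, and an unnormalized output, and the partial results are then rescaled to a common max and combined. In a threaded backend, each chunk could plausibly be handled by a different worker.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Partial attention state for one KV chunk (hypothetical, for illustration).
struct Partial {
    float m = -std::numeric_limits<float>::infinity(); // max score seen so far
    float s = 0.0f;                                    // sum of exp(score - m)
    std::vector<float> o;                              // sum of exp(score - m) * V row
};

// Online-softmax attention over KV rows [begin, end) for one query vector q.
// K and V are row-major with D floats per row.
static Partial attend_chunk(const std::vector<float>& q,
                            const std::vector<float>& K,
                            const std::vector<float>& V,
                            std::size_t D, std::size_t begin, std::size_t end) {
    Partial p;
    p.o.assign(D, 0.0f);
    for (std::size_t i = begin; i < end; ++i) {
        float score = 0.0f;
        for (std::size_t d = 0; d < D; ++d) score += q[d] * K[i * D + d];
        const float m_new = std::max(p.m, score);
        const float scale = std::exp(p.m - m_new); // 0 on the first row (m == -inf)
        const float w     = std::exp(score - m_new);
        for (std::size_t d = 0; d < D; ++d) p.o[d] = p.o[d] * scale + w * V[i * D + d];
        p.s = p.s * scale + w;
        p.m = m_new;
    }
    return p;
}

// Combine per-chunk partials by rescaling each to the global max.
static std::vector<float> merge_partials(const std::vector<Partial>& parts, std::size_t D) {
    float M = -std::numeric_limits<float>::infinity();
    for (const Partial& p : parts) M = std::max(M, p.m);
    std::vector<float> out(D, 0.0f);
    float S = 0.0f;
    for (const Partial& p : parts) {
        if (p.s == 0.0f) continue; // empty chunk contributes nothing
        const float scale = std::exp(p.m - M);
        S += p.s * scale;
        for (std::size_t d = 0; d < D; ++d) out[d] += p.o[d] * scale;
    }
    if (S > 0.0f) {
        for (std::size_t d = 0; d < D; ++d) out[d] /= S;
    }
    return out;
}

// Split n_kv rows into n_chunks contiguous ranges, process each, then merge.
static std::vector<float> attend_split(const std::vector<float>& q,
                                       const std::vector<float>& K,
                                       const std::vector<float>& V,
                                       std::size_t D, std::size_t n_kv,
                                       std::size_t n_chunks) {
    const std::size_t per = (n_kv + n_chunks - 1) / n_chunks;
    std::vector<Partial> parts;
    for (std::size_t c = 0; c < n_chunks; ++c) {
        const std::size_t begin = std::min(c * per, n_kv);
        const std::size_t end   = std::min(begin + per, n_kv);
        parts.push_back(attend_chunk(q, K, V, D, begin, end));
    }
    return merge_partials(parts, D);
}
```

Calling `attend_split` with `n_chunks = 1` gives the unsplit result, so it can serve as a reference when checking that larger chunk counts agree to within floating-point tolerance.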
include/ggml-cpu.h
src/ggml-cpu/ggml-cpu-impl.h
src/ggml-cpu/ggml-cpu.c
src/ggml-cpu/ggml-cpu.cpp
src/ggml-cpu/ops.cpp
tests/test-backend-ops.cpp