git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
ggml-cpu: FA split across kv for faster TG (#19209)
author Aman Gupta <redacted>
Mon, 2 Feb 2026 17:19:55 +0000 (01:19 +0800)
committer GitHub <redacted>
Mon, 2 Feb 2026 17:19:55 +0000 (01:19 +0800)
commit 9f682fb640765ff79ee13a7a00cdbaa15c1ed07a
tree 5c429c305a4d27c26f68ce0c14e0b5275eee4c18
parent a3fa03582240a4279ba019a3db2bb87311d5d485

* ggml-cpu: split across kv for faster TG

* simplify sinks application

* add ref impl
ggml/include/ggml-cpu.h
ggml/src/ggml-cpu/ggml-cpu-impl.h
ggml/src/ggml-cpu/ggml-cpu.c
ggml/src/ggml-cpu/ggml-cpu.cpp
ggml/src/ggml-cpu/ops.cpp
tests/test-backend-ops.cpp
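
For readers unfamiliar with the idea named in the commit title, the following is a minimal sketch of splitting single-query attention across the KV dimension and merging the partial results with a running-max (online softmax) correction. It is not the code from this commit: the names `Partial`, `attend_chunk`, `merge_partials`, and `attend_split` are hypothetical, the chunks are processed sequentially here where a real implementation would presumably hand each chunk to a separate thread, and attention sinks, masking, and quantized KV types are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

using Mat = std::vector<std::vector<float>>;

// Partial attention state for one KV chunk (hypothetical helper type).
struct Partial {
    float m = -std::numeric_limits<float>::infinity(); // max score seen in the chunk
    float s = 0.0f;                                     // sum of exp(score - m)
    std::vector<float> acc;                             // sum of exp(score - m) * V row
};

// Single-query attention over KV rows [begin, end), kept unnormalized.
Partial attend_chunk(const std::vector<float>& q, const Mat& K, const Mat& V,
                     std::size_t begin, std::size_t end, float scale) {
    Partial p;
    p.acc.assign(V.empty() ? 0 : V[0].size(), 0.0f);
    for (std::size_t i = begin; i < end; ++i) {
        float score = 0.0f;
        for (std::size_t d = 0; d < q.size(); ++d) score += q[d] * K[i][d];
        score *= scale;
        const float m_new = std::max(p.m, score);
        const float c_old = std::exp(p.m - m_new);  // rescales earlier terms (0 on first row)
        const float w     = std::exp(score - m_new);
        for (std::size_t d = 0; d < p.acc.size(); ++d)
            p.acc[d] = p.acc[d] * c_old + w * V[i][d];
        p.s = p.s * c_old + w;
        p.m = m_new;
    }
    return p;
}

// Combine per-chunk partials into the final softmax-weighted output.
std::vector<float> merge_partials(const std::vector<Partial>& parts) {
    assert(!parts.empty());
    float m = -std::numeric_limits<float>::infinity();
    for (const Partial& p : parts) m = std::max(m, p.m);
    std::vector<float> out(parts[0].acc.size(), 0.0f);
    float s = 0.0f;
    for (const Partial& p : parts) {
        if (p.s == 0.0f) continue;               // chunk covered no KV rows
        const float c = std::exp(p.m - m);       // bring chunk to the global max
        s += p.s * c;
        for (std::size_t d = 0; d < out.size(); ++d) out[d] += p.acc[d] * c;
    }
    for (float& o : out) o /= s;
    return out;
}

// Split the KV rows into n_chunks contiguous ranges, then merge.
std::vector<float> attend_split(const std::vector<float>& q, const Mat& K, const Mat& V,
                                std::size_t n_chunks, float scale) {
    assert(n_chunks > 0 && K.size() == V.size());
    const std::size_t n = K.size();
    const std::size_t chunk = (n + n_chunks - 1) / n_chunks;
    std::vector<Partial> parts;
    for (std::size_t c = 0; c < n_chunks; ++c) {
        const std::size_t begin = std::min(c * chunk, n);
        const std::size_t end   = std::min(begin + chunk, n);
        parts.push_back(attend_chunk(q, K, V, begin, end, scale));
    }
    return merge_partials(parts);
}
```

Calling `attend_split` with one chunk and with several chunks should give the same output up to floating-point rounding, which is the property that makes the split safe. During token generation there is only one query row per step, so dividing the KV rows is one way to keep several threads busy.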