cpu: introduce chunking for flash attention (#16829)
author Max Krasnyansky <redacted>
Thu, 30 Oct 2025 12:26:05 +0000 (05:26 -0700)
committer GitHub <redacted>
Thu, 30 Oct 2025 12:26:05 +0000 (14:26 +0200)
commit dcca0d3ab840ebe9b2ccd4719033d408eeb758d7
tree 14e1b91a4495c4f7044065fffd9c899f52c7f990
parent bacddc049a00786df44e682262f6e298742bfbc3
cpu: introduce chunking for flash attention (#16829)

Factor out the core FA loop into flash_atten_f16_one_chunk and add an outer loop
on top that handles the chunks.
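
The pattern described above, splitting the work into fixed-size chunks that a pool of threads claims via an atomic counter, can be sketched roughly as follows. This is not the actual ops.cpp code; the function names fa_one_chunk and fa_chunked, the doubling "kernel", and the atomic-counter dispatch are illustrative assumptions standing in for the real flash-attention inner loop:

```cpp
#include <algorithm>
#include <atomic>
#include <cstdint>
#include <thread>
#include <vector>

// Core per-chunk work: process rows [ir0, ir1) of the input.
// Stands in for the factored-out flash-attention chunk body.
static void fa_one_chunk(const std::vector<float>& in, std::vector<float>& out,
                         int64_t ir0, int64_t ir1) {
    for (int64_t i = ir0; i < ir1; ++i) {
        out[i] = in[i] * 2.0f;  // placeholder for the real FA computation
    }
}

// Outer loop: divide the rows into chunks and let each thread
// repeatedly claim the next unprocessed chunk via an atomic counter,
// so faster threads naturally pick up more chunks (dynamic balancing).
static void fa_chunked(const std::vector<float>& in, std::vector<float>& out,
                       int64_t chunk_size, int nthreads) {
    const int64_t nrows   = (int64_t) in.size();
    const int64_t nchunks = (nrows + chunk_size - 1) / chunk_size;

    std::atomic<int64_t> next_chunk{0};

    auto worker = [&]() {
        for (;;) {
            const int64_t c = next_chunk.fetch_add(1, std::memory_order_relaxed);
            if (c >= nchunks) {
                break;  // no chunks left
            }
            const int64_t ir0 = c * chunk_size;
            const int64_t ir1 = std::min(ir0 + chunk_size, nrows);
            fa_one_chunk(in, out, ir0, ir1);
        }
    };

    std::vector<std::thread> pool;
    for (int t = 0; t < nthreads; ++t) {
        pool.emplace_back(worker);
    }
    for (auto& th : pool) {
        th.join();
    }
}
```

Because chunks are claimed dynamically rather than pre-assigned per thread, load stays balanced even when chunks take uneven time, which is the usual motivation for this kind of restructuring on CPU.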
ggml/src/ggml-cpu/ops.cpp