author Jeff Bolz <redacted>
Wed, 9 Apr 2025 05:12:57 +0000 (00:12 -0500)
committer GitHub <redacted>
Wed, 9 Apr 2025 05:12:57 +0000 (07:12 +0200)
commit 7ecd780b1a1d5214b8d04c25ebfc194d310816ed
tree 488a39949a744d4d34aba433be30b69b70caa3fa
parent 7538246e7ce0606694c38055cc2fc9f60535be6c
vulkan: Use fp16 for the flash attention P*V multiplication (#12783)

This is consistent with the ggml-cuda behavior and the mul_mat fallback.
ggml/src/ggml-vulkan/vulkan-shaders/flash_attn_cm2.comp
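
The actual change lives in the coopmat2 flash-attention shader listed above. As a rough illustration only (not the shader code), the sketch below shows the numeric idea in plain C++: the P*V product of a flash-attention tile is computed on fp16 operands rather than fp32. It assumes a GCC/Clang compiler with the _Float16 extension; all names and shapes here are illustrative, not taken from the repository.

    // Conceptual sketch of fp16 P*V in flash attention (NOT the Vulkan shader).
    #include <cstdio>
    #include <vector>

    using half = _Float16; // compiler extension; assumption for this sketch

    // O[r][d] += sum_c P[r][c] * V[c][d], with the multiply done on fp16 inputs.
    // P: softmax'd attention scores (rows x cols), V: value tile (cols x d_head).
    static void pv_multiply_fp16(const std::vector<float> &P, const std::vector<float> &V,
                                 std::vector<float> &O, int rows, int cols, int d_head) {
        for (int r = 0; r < rows; ++r) {
            for (int d = 0; d < d_head; ++d) {
                float acc = 0.0f;
                for (int c = 0; c < cols; ++c) {
                    // round both operands to fp16 before multiplying, mirroring a
                    // shader that feeds fp16 matrices into the P*V matmul
                    acc += (float)((half)P[r*cols + c] * (half)V[c*d_head + d]);
                }
                O[r*d_head + d] += acc;
            }
        }
    }

    int main() {
        const int rows = 2, cols = 4, d_head = 3;
        std::vector<float> P(rows*cols, 0.25f);              // uniform attention weights
        std::vector<float> V(cols*d_head);
        for (size_t i = 0; i < V.size(); ++i) V[i] = 0.1f*i; // arbitrary values
        std::vector<float> O(rows*d_head, 0.0f);
        pv_multiply_fp16(P, V, O, rows, cols, d_head);
        for (float o : O) printf("%.4f ", o);                // column means of V, since P is uniform
        printf("\n");
        return 0;
    }

Since P holds softmax probabilities in [0, 1] and V values are typically well scaled, the fp16 product loses little precision, which is why the fp16 path is already used by ggml-cuda and the mul_mat fallback referenced in the commit message.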