vulkan: Use fp16 for the flash attention P*V multiplication (llama/12783)
author     Jeff Bolz <redacted>
           Wed, 9 Apr 2025 05:12:57 +0000 (00:12 -0500)
committer  Georgi Gerganov <redacted>
           Thu, 10 Apr 2025 20:58:06 +0000 (23:58 +0300)
commit     dcaef71e941d4e9772319f7f11e89a0900eb5a78
tree       349a9e61f1fcaeb0a0b0520d239d9e3b751bfe71
parent     d206e725693e8b06f7e5af33b6fe2bb5aa25e1ef
vulkan: Use fp16 for the flash attention P*V multiplication (llama/12783)

This is consistent with the ggml-cuda behavior and the mul_mat fallback.
src/ggml-vulkan/vulkan-shaders/flash_attn_cm2.comp
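
The change lands in the cooperative-matrix ("cm2") variant of the flash attention shader. Below is a minimal sketch of the idea, not the shader's actual code: it is written against GL_KHR_cooperative_matrix rather than the NV extension the real shader targets, and the tile sizes Br/Bc/D and the names P, V, O, P_h are illustrative. The point it demonstrates is the one in the commit message: convert the softmax scores P to fp16 before the P*V multiply, so the multiply runs with fp16 inputs and an fp32 accumulator, matching ggml-cuda and the mul_mat fallback.

    #version 450 core
    #extension GL_KHR_memory_scope_semantics : enable
    #extension GL_KHR_cooperative_matrix : enable
    #extension GL_EXT_shader_explicit_arithmetic_types_float16 : enable

    layout(local_size_x = 32, local_size_y = 1, local_size_z = 1) in;

    // Illustrative tile sizes; the real shader derives its tiling
    // from specialization constants.
    const int Br = 16;  // query rows in the P tile
    const int Bc = 16;  // key/value rows (columns of P, rows of V)
    const int D  = 16;  // head dimension (columns of V and O)

    void main() {
        // P: softmax(Q*K^T) scores, produced earlier in fp32.
        // (Constant-initialized here so the sketch is self-contained.)
        coopmat<float, gl_ScopeSubgroup, Br, Bc, gl_MatrixUseA> P =
            coopmat<float, gl_ScopeSubgroup, Br, Bc, gl_MatrixUseA>(1.0);

        // V: the value tile, fp16.
        coopmat<float16_t, gl_ScopeSubgroup, Bc, D, gl_MatrixUseB> V =
            coopmat<float16_t, gl_ScopeSubgroup, Bc, D, gl_MatrixUseB>(float16_t(1.0));

        // O: the output accumulator, kept in fp32.
        coopmat<float, gl_ScopeSubgroup, Br, D, gl_MatrixUseAccumulator> O =
            coopmat<float, gl_ScopeSubgroup, Br, D, gl_MatrixUseAccumulator>(0.0);

        // The change: narrow P to fp16 before the multiply, instead of
        // feeding the fp32 scores straight into coopMatMulAdd.
        coopmat<float16_t, gl_ScopeSubgroup, Br, Bc, gl_MatrixUseA> P_h =
            coopmat<float16_t, gl_ScopeSubgroup, Br, Bc, gl_MatrixUseA>(P);

        // fp16 x fp16 multiply with fp32 accumulation.
        O = coopMatMulAdd(P_h, V, O);
    }

Keeping the accumulator in fp32 while narrowing only the multiply inputs is the usual trade here: the fp16 inputs let the hardware use its fast matrix pipelines, while fp32 accumulation limits the precision loss across the summation.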