git.djapps.eu Git - pkg/ggml/sources/ggml/commit
vulkan: subgroup size tuning (llama/12087)
author Daniele <redacted>
Mon, 17 Mar 2025 11:42:33 +0000 (12:42 +0100)
committer Georgi Gerganov <redacted>
Thu, 27 Mar 2025 07:35:24 +0000 (09:35 +0200)
commit c0c912466350b18a1e489e3e3b54f9529f4b686f
tree 62fb5ed2488f5df8ec3529813834a9a1c318db02
parent e7816d6243081ad491d9d69dcdc72167def38e4e

* vulkan: subgroup size test

* Vulkan: Add device architecture enum and logic to recognize AMD generations

* vulkan: use new architecture logic to specify subgroup size

* Initial vulkan subgroup size tuning for RDNA3

* vulkan: commonize RDNA subgroup tuning

* vulkan: override subgroup size if required_subgroup_size = 0

* vulkan: disable warp 32 for RDNA3

* vulkan: fine-tuned RDNA1 subgroup sizes

* vulkan: adjusted subgroup size map

* vulkan: fixed RDNA2 subgroup map

---------

Co-authored-by: 0cc4m <redacted>
src/ggml-vulkan/ggml-vulkan.cpp
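
The commit message describes three pieces: an architecture enum, a per-architecture subgroup size map, and an override that applies only when required_subgroup_size is 0. The following is a minimal sketch of how such a lookup could fit together; it is not the code from ggml-vulkan.cpp. All names (DeviceArch, SubgroupTuning, tuning_table, resolve_subgroup_size, the pipeline names) are hypothetical, and the sizes in the table are placeholders, not the tuned values from this commit.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <unordered_map>

// Hypothetical architecture tags (illustrative names, not from the source file).
enum class DeviceArch { Other, AmdGcn, AmdRdna1, AmdRdna2, AmdRdna3 };

// Hypothetical tuning entry for one architecture.
struct SubgroupTuning {
    uint32_t default_size;  // 0 means "no preference, let the driver choose"
    std::unordered_map<std::string, uint32_t> per_pipeline;  // overrides by pipeline name
};

// Placeholder table. The numbers are for illustration only and are NOT
// the values tuned in the commit.
const std::map<DeviceArch, SubgroupTuning>& tuning_table() {
    static const std::map<DeviceArch, SubgroupTuning> table = {
        {DeviceArch::AmdRdna1, SubgroupTuning{64, {{"example_pipeline_a", 32}}}},
        {DeviceArch::AmdRdna2, SubgroupTuning{64, {}}},
        {DeviceArch::AmdRdna3, SubgroupTuning{64, {}}},
    };
    return table;
}

// Assumed interpretation of "override subgroup size if required_subgroup_size = 0":
// an explicit request wins; otherwise fall back to the per-pipeline entry,
// then the architecture default, then 0 (no preference).
uint32_t resolve_subgroup_size(DeviceArch arch,
                               const std::string& pipeline,
                               uint32_t required_subgroup_size) {
    if (required_subgroup_size != 0) {
        return required_subgroup_size;  // caller asked for a specific size
    }
    const auto& table = tuning_table();
    const auto arch_it = table.find(arch);
    if (arch_it == table.end()) {
        return 0;  // untuned architecture: leave the choice to the driver
    }
    const SubgroupTuning& tuning = arch_it->second;
    const auto pipe_it = tuning.per_pipeline.find(pipeline);
    return pipe_it != tuning.per_pipeline.end() ? pipe_it->second
                                                : tuning.default_size;
}
```

In this sketch, a caller would pass the resolved value to pipeline creation only when it is nonzero. Keeping the table keyed by architecture, with per-pipeline overrides layered on a default, lets each generation be tuned without touching the others.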